papers in physics, vol. 6, art. 060004 (2014) received: 3 august 2014, accepted: 3 august 2014 edited by: g. martinez mekler licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060004 www.papersinphysics.org issn 1852-4249 commentary on “critical phenomena in the spreading of opinion consensus and disagreement” franco bagnoli1∗ the authors of ref. [1] study a variation of the voter model in which the neighbors of a cluster of agents agree or disagree with the nearest member of the group according with the agreement inside the group. the article is interesting and well written. the authors show that the phase transitions towards the full consensus obey definite scale relations. i have some points that i think should be considered by the authors. the first one is about fig 1 in their paper. for pd = pc = 1, the system presents three absorbing states: full consensus “black”, full consensus “white” and maximal disagreement, i.e., staggered “black-white”. it seems from the figure that the system reaches the “black” absorbing state at a time about 2300 and 4700, and the staggered one at time about 7500. however, in all these cases the presumed absorbing state is abandoned. this is surely due to the fact that what is shown is only a portion of the lattice; the figure is misleading and should be replaced with a picture of the whole lattice, where the “annihilation” of the random walkers corresponding to the boundaries of the clusters could be observed. the second point is about the correspondence with annihilating random walks. it seems to me that in one dimension for pc = pd = 1, the dynam∗e-mail:franco.bagnoli@unifi.it 1 department of physics and astronomy and csdc, university of florence, via g. sansone 1 50019 sesto fiorentino, italy. ics of the model corresponds to that of annihilating random walkers, like in the voter model, except that in this case there are two types of walkers that do not annihilate among themselves. let me call these two walkers types l and r. the portion of the lattice between two l corresponds to the pattern (10)* (odd rows black, even-rows white), that between two r is a pattern (01)* (even rows black, odd-rows white), that between one l and one r is 0* and that between one r and one l is 1*. r and l can cross, two l or two r annihilate. if this correspondence is true, one should be able to interpret the scaling laws in the context of annihilating random walks. for pc = pd different from 1, one might find some correspondence with branching annihilating random walks and directed percolation. the final concern is about the generalization of the rule to non-regular lattices. social networks are surely not regular, and often show a broad distribution of connectivity. how could the rule be generalized to these cases? [1] a chacoma, d h zanette, critical phenomena in the spreading of opinion consensus and disagreement, pap. phys. 6, 060003 (2014). 060004-1 papers in physics, vol. 1, art. 010004 (2009) received: 7 october 2009, accepted: 7 october 2009 edited by: m. c. barbosa licence: creative commons attribution 3.0 doi: 10.4279/pip.010004 www.papersinphysics.org issn 1852-4249 answer to the commentary on “a note on the consensus time of mean-field majority-rule dynamics” damián h. zanette1∗ in his commentary [1], h. fort begins by pointing out two aspects of the majority-rule (mr) model that, as presented in the main paper [2], may need some clarification. 
it was suggested in the paper that the equivalence of the mean-field mr dynamics and a random walk can be formulated in terms of the evolution of n+, the number of agents with opinion +1. specifically, in order to write down a master equation for the process, one should find the transition probability that, in a single evolution step, n+ changes to any other n ′+. it can be readily seen that such event requires that the size g of the group of agents chosen to reach consensus at that step satisfies g > 2|∆|, with ∆ = n ′+ − n+. if ∆ is positive (respectively, negative) the number of agents with opinion −1 (respectively, +1) in the group must be exactly equal to |∆|. summing the probabilities for all the possible values of g yields the transition probability for a given ∆. of course, one could cut off and renormalize the probability distribution for the group size, pg, in such a way that g can be at most equal to the population size n . the two regimes in the size dependence of the consensus time would certainly still exist –although explicitly working out analyt∗e-mail: zanette@cab.cnea.gov.ar 1 consejo nacional de investigaciones cient́ıficas y técnicas, centro atómico bariloche and instituto balseiro, 8400 san carlos de bariloche, ŕıo negro, argentina. ical results such as eq. (7) of the main paper may become trickier. however, it is somehow artificial to conceive that, in a social process driven by events which involve agent groups, probabilities depend on the total population size. it sounds more natural to just allow any group size and, in the case that g ≥ n , admit that the population falls into an absorbing, frozen state. also, from an operational viewpoint, a cut-off probability distribution with g ≤ n would be difficult to implement if the population size varies with time. the commentary also addresses a series of generalizations of the mr dynamics –some of which have already been considered in the literature– that certainly add realism to the model. heterogeneity among agents is crucial to more realistically approach any population-based complex system. in the context of the mr model, inflexible and unsettled agents as well as “contrarians” have been considered in a series of papers by s. galam and coworkers (and recently reviewed by castellano et al. [3]). also, we have discussed the effects of several forms of heterogeneity on synchronization dynamics [4, 5], which bear close similarities with opinion formation. overall, the expected consequence of heterogeneity is that full consensus is never reached, but a dynamical state – where the degree of consensus fluctuates with time around a well-defined average– is asymptotically approached. spatially-distributed populations, where agent groups are localized in space, have been taken into 010004-1 papers in physics, vol. 1, art. 010004 (2009) / d. h. zanette account since the first formulations of the mr dynamics [6, 7]. it has been shown that dimensionality has nontrivial consequences in the attainment of consensus. for a fixed group size g, the consensus time grows with the population size n as a power whose exponent depends on the dimension [3]. in one-dimensional arrays, the final state is sensibly dependent on the initial condition, while in higher dimensions, the dynamics coincides with diffusive coarsening. either random or deterministic time-dependent effects, such as stochastic fluctuations in the rules that govern single consensus events or population dynamics, are certainly worth considering. 
one may ask, for instance, whether sufficiently frequent “births” or “arrivals” of dissenters are able to overcome consensus attainment. to my knowledge, population dynamics has not been addressed in the context of mr model. on the other hand, randomness in each consensus event has already been taken into account [3]. finally, global opinion-formation factors such as mass media, advertising, and propaganda play the role of external fields in the spin-like dynamics of the mr model, as also discussed by s. galam [3]. [1] h fort, commentary on “a note on the consensus time of mean-field majority-rule dynamics”, pap. phys. 1, 010003 (2009). [2] d h zanette, a note on the consensus time of mean-field majority-rule dynamics, pap. phys. 1, 010002 (2009). [3] c castellano, s fortunato, v loreto, statistical physics of social dynamics, rev. mod. phys. 81, 591 (2009). [4] g h paissan, d h zanette, synchronization of phase oscillators with heterogeneous coupling: a solvable case, physica d 237, 818 (2008). [5] d h zanette, interplay of noise and coupling in heterogeneous ensembles of phase oscillators, eur. phys. j. b 69, 269 (2009). [6] p l krapivsky, s redner, dynamics of majority rule in two-state interacting spin systems, phys. rev. lett. 90, 238701 (2003). [7] c j tessone, r toral, p amengual, h s wio, m san miguel, neighborhood models of minority opinion spreading, eur. phys. j. b 39, 535 (2004). 010004-2 papers in physics, vol. 6, art. 060011 (2014) received: 16 october 2014, accepted: 16 october 2014 edited by: l. a. pugnaloni licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060011 www.papersinphysics.org issn 1852-4249 reply to the commentary on “granular discharge rate for submerged hoppers” t. j. wilson,1, 2 c. r. pfeifer,1, 3 n. meysingier,1, 4 d. j. durian1∗ the commentary of staron [1] raises interesting points and helps frame additional lines of research. based on her initial report, we have made several helpful improvements and clarifications to our manuscript [2]. below, we respond to some of the remaining issues. first, staron is absolutely correct in suggesting that we do not have a clear definitive explanation for the “surge” effect shown in fig. 1, where the discharge rate increases as filling height decreases toward zero. the interstitial fluid clearly plays a role, but there are many possibilities. it could be due to suction into the space between grains as they move apart near the exit, as staron suggests. it could also be due to a reduction in grain–grain or grain–wall friction, or due to lubrication effects when grains come together. we hope that on-going experiments [3] on trends versus system size, etc., will help shed light on this issue. a terminal surge has now been seen for dry grains in air [3], so the effect is of broader interest. perhaps in vacuum it would disappear altogether. in any case, theoretical and simulation input would be helpful. ∗email: djdurian@physics.upenn.edu 1 department of physics & astronomy, university of pennsylvania, philadelphia, pa 19104-6396, usa. 2 department of physics, illinois wesleyan university, bloomington, il 61702-2900, usa. 3 department of physics & astronomy, carleton college, northfield, mn 55057, usa. 4 strath haven high school, wallingford, pa 19086, usa. 
as for the beverloo scaling, even for dry cohesionless grains, we sympathize with staron’s point that intermittent formation and break-up of force chain arches over the outlet is hard to imagine for very large apertures —where a continuum model would be more natural. we all agree that the janssen argument for saturation of pressure versus depth does not apply, or else discharge rates would grow with hopper diameter. this has been shown directly by recent experiments with a conveyor belt [4]. therefore, some sort of shielding of grain–grain pressure over the outlet is needed, even if transient arches do not form. how exactly the grain pressure behaves near the outlet during flow seems like a fruitful topic for study, perhaps along the lines suggested by staron and fig. 1 of her comment. finally, regarding the exit speed for grains in the submerged case, we did indeed demonstrate a beverloo-like scaling simply by replacing the freefall speed √ gd for the dry case with a single-grain terminal falling speed for the submerged case. as a minor remark, the terminal speed is not simply the stokes speed, which holds only for small grains and small reynolds numbers re. we varied the grain size over a large enough range that the fluid drag crossed over from viscous at low re to inertial at high re. the latter is for large grains, in which case the fluid drag scales as speed-squared and the interstitial fluid flow must be somewhat turbulent. nevertheless, the modified beverloo-like scaling holds across both regimes, with the same small-hole cutoff (which is significantly larger than in the dry 060011-1 papers in physics, vol. 6, art. 060011 (2014) / t. j. wilson et al. case). thus, it may not be necessary to grapple with the full complexities of the interstitial fluid flow. further experiments would be helpful, both to investigate initial transients as staron suggests and also to systematically vary an imposed downor up-flow of fluid through the packing along the lines illustrated in fig. 1d of our manuscript. the reason for a small-hole cutoff must involve dynamics, and it would be interesting to know how the size depends on fluid properties. [1] l staron, commentary on “granular discharge rate for submerged hoppers”, pap. phys. 6, 060010 (2014). [2] t j wilson, c r pfeifer, n meysingier, d j durian, granular discharge rate for submerged hoppers, pap. phys. 6, 060009 (2014). [3] j koivisto, d j durian, unpublished (2014). [4] m a aguirre, j g grande, a calvo, l a pugnaloni, j-c géminard, pressure independence of granular flow through an aperture, phys. rev. lett. 104, 238002 (2010). 060011-2 papers in physics, vol. 1, art. 010003 (2009) received: 2 october 2009, accepted: 6 october 2009 edited by: m. c. barbosa licence: creative commons attribution 3.0 doi: 10.4279/pip.010003 www.papersinphysics.org issn 1852-4249 commentary on “a note on the consensus time of mean-field majority-rule dynamics” hugo fort,1∗ in this commentary, i review the article by d. h. zanette on the consensus time of mean-field majority-rule dynamics [1]. the paper identifies two different regimes for the mean field (mf) version of the majority-rule (mr) opinion dynamics, characterized by different dependences on the population size n . in one of them, corresponding to gradual persuasion, the typical known logarithmic dependence is observed. the novelty appears in the alternative regime, associated with very drastic events, which is governed by a power law. 
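For concreteness, the sketch below illustrates the basic mean-field MR update discussed in this exchange: at each step a group of g agents is drawn at random and its minority members adopt the group's majority opinion, with g sampled from a distribution having a power-law tail. This is only an illustrative sketch; the specific form of p_g, the restriction to odd g, and the handling of g >= N are assumptions made here and are not taken from the original paper.

```python
import numpy as np

def consensus_time(N, alpha=2.5, g_max=None, rng=None):
    """Steps to full consensus under a schematic mean-field majority-rule
    dynamics. Illustrative sketch: the group-size distribution, the odd-g
    restriction and the handling of g >= N are assumptions."""
    rng = rng or np.random.default_rng()
    g_max = g_max if g_max is not None else 10 * N
    sizes = np.arange(3, g_max, 2)                 # odd group sizes (no ties)
    probs = sizes.astype(float) ** (-alpha)        # power-law tail p_g ~ g**-alpha
    probs /= probs.sum()

    n_plus = N // 2                                # initial 50/50 split
    steps = 0
    while 0 < n_plus < N:
        steps += 1
        g = int(rng.choice(sizes, p=probs))
        if g >= N:                                 # 'drastic' event: the whole
            n_plus = N if n_plus >= N - n_plus else 0   # population freezes into
            break                                       # its (global) majority
        k = rng.hypergeometric(n_plus, N - n_plus, g)   # '+' agents in the group
        if k > g - k:                              # '+' is the local majority
            n_plus += g - k                        # minority members flip to '+'
        else:
            n_plus -= k                            # minority '+' members flip to '-'
    return steps
```

Averaging the returned consensus time over many runs for increasing N gives a quick way to visualize, under these assumptions, the two size-dependence regimes summarized above.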
in this commentary, i point out a couple of minor points that, in my opinion, deserve further clarification. i also make some general remarks and briefly discuss some features which can be incorporated in order to use the model in more realistic contexts. the author addresses the problem of the dependence of the consensus time with the population size in a mean field (mf) version of the majorityrule (mr) opinion dynamics. the size of the group of agents selected at each evolution step, g, is drawn from a probability distribution pg, which for large g decays as a power law. the main result is that, for mfmr, the consensus time s exhibits two distinct regimes, characterized by different dependences on the population size n . if the exponent of pg is larger than 2, s has ∗e-mail: hugo@fisica.edu.uy 1 instituto de f́ısica, facultad de ciencias, universidad de la república, iguá 4225, cp 11.400 montevideo, uruguay. a dependence of the kind n log n which is already known from analytical results for constant g. on the other hand, if the exponent of the distribution of group sizes is less or equal than 2, the dependence of s on n is also given by a power law. it is interesting that the two regimes are related to two different mechanisms of consensus attainment: gradual persuasion versus drastic large-g events which involve the whole population at a single evolution step. some points that would need further clarification are: a. the equivalence of the mfmr to a random walk under the action of a force field, mentioned at the end of page 1, is an interesting issue. for those who are not experts on this area i think that it deserves a more detailed explanation; indeed, the walk has variable step length since only the agents with the minority opinion flip, and the direction of the force seems to exhibit a sort of persistence. b. what is the rationale for considering values of g larger than the population size? in fact, the maximum physically possible g is n . perhaps it is for technical reasons or difficulties implementing the constraint that g cannot be greater than n ? some considerations, a little beyond the scope of the majority-rule opinion model, on features concerning realistic situations which are not included in this model: 010003-1 papers in physics, vol. 1, art. 010003 (2009) / h. fort 1. individual heterogeneity. there are always individuals who are not susceptible to the mr for their convictions, or by necessity, and resist the majority opinion. in addition, there are individuals who can change their mind not only by following herd behaviour but also through other mechanisms like learning from experience, etc. 2. space. the spatial structure might have a relevant effect on the mr opinion dynamics. has this been studied? in general, it turns out that spatial correlation may introduce important differences in agent based models. 3. chance. the application of the mr is completely deterministic. the effects of introducing a stochastic component is something worth exploring. for example, in the form that some of the individuals in the minority of g do not flip or some in the majority suffer spontaneous flips. 4. population dynamics. for long times it seems unavoidable to consider births and deaths, this turnover population seems to be a dynamic source for the opinion formation. maybe this could be implemented by a noise term like the one mentioned above. 5. opinion formation factors. 
mass media signals, advertising and propaganda operating over the individuals play an important role in opinion formation and can have an important impact. it seems that this could be modeled by external fields. [1] d h zanette, a note on the consensus time of mean-field majority-rule dynamics, pap. phys. 1, 010002 (2009). 010003-2 papers in physics, vol. 6, art. 060010 (2014) received: 16 october 2014, accepted: 16 october 2014 edited by: l. a. pugnaloni licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060010 www.papersinphysics.org issn 1852-4249 commentary on “granular discharge rate for submerged hoppers” l. staron1, 2∗ i. introduction the paper by wilson et al [1] describes experimental results on the discharge of hoppers filled with granular material and immersed in water. the discharge of dry granular matter through hoppers (as well as pipes and silos of various geometries) has been and is still much studied, due to the practical importance of these flow geometries but also because of the theoretical difficulties posed by their puzzling behavior. in this paper, the authors examine how the well-known results for “dry” hoppers are affected when the whole system is immersed in water. this experimental setup is original and discloses an intriguing new behavior. the authors have no clear physical explanation for this intriguing new behavior; however, the results are very interesting and leave many questions open for future research. some of them are the subjects of the comments below. ii. scaling for the discharge velocity for dry hoppers in the introduction, the beverloo scaling, that is the scaling for the discharge velocity for dry granu∗e-mail: lydie.staron@upmc.fr 1 cnrs université pierre et marie curie paris 6, umr 7190, institut jean le rond d’alembert, f-75005 paris, france. 2 school of earth sciences, university of bristol, queens road, bristol br8 1rj, united kingdom. lar hoppers, is understood as resulting from a freefall arch phenomenology (as described in [2]): force chains form and break intermittently above the orifice, shielding the grains from the pressure above and allowing free-fall dynamics. as a result, the velocity scales like √ gd, where d is the diameter of the orifice and g is gravity. this mechanism is described as intuitive. yet, i find that the discrete (i.e., explicitly granular) picture of force chains is difficult to conciliate with the observation of stationary flow. would not the intermittent formation and breaking of force chains lead to an intermittent flow regime, as observed for small apertures? how do force chains form over large apertures, up to 300 diameters in the present paper? this aspect is indeed central for the understanding of the physics of granular hoppers. an alternative interpretation of the discharge velocity is to consider the granular flow as a continuum with yield stress properties. these yield stress properties are the cause of the existence of “dead zones”, namely areas of static equilibrium in the bottom corners of the containers and at the container’s walls [3]. these “dead zones” can be seen themselves like solid walls surrounding the flow around the outlet. considering this simplification of the dynamics of the grains above the outlet, one can again try to understand the discharge rate as resulting from the friction forces. i fully agree with the authors that the janssen analysis applied to the walls of the container cannot account for the beverloo scaling, and that a local argument is needed [4]. 
interestingly, the janssen analysis applied to the conduit formed by the static areas is a local argument that may yield the correct 060010-1 papers in physics, vol. 6, art. 060010 (2014) / l. staron figure 1: illustration of a granular flow in a “dry” hopper. the static area forms solid walls surrounding the flow area. the classical janssen analysis considers a slice of material spanning the whole hopper and the friction forces at the wall. alternatively, the analysis may be performed for a slice of material spanning the flow area only, considering the friction forces acting in the bulk at the boundary between flow and static areas. (the dotted lines show typical streamlines.) scaling. this is illustrated in fig. 1. equilibrating the pressure gradient and the friction forces at the walls for a slice of material spanning the whole container gives a saturation pressure scaling like ρgw/2µ, where w is the hopper diameter and µ the coefficient of friction at the walls: this is the classical janssen result. using the same analysis, equilibrating the pressure gradient and the friction forces acting between the flow and the static areas for a slice of material spanning the flow only gives a saturation pressure scaling like ρgw ′/2µ′, where w ′ is the flow width and µ′ the coefficient of friction at this location. very likely, w ′ will scale like the outlet diameter d, hence the beverloo scaling. iii. tall and short columns for immersed hoppers in the case of immersed hoppers, the authors successfully explain the new scaling for the discharge rate by replacing the free-fall-like velocity observed in dry cases by the stokes velocity. this works well but, as stated by the authors, this probably gives an oversimplified picture of what really happens in the hopper above the outlet. in particular, since the grains initially filling the hoppers are arranged in dense packing, the flow at the aperture, necessary involving shearing, may induce locally dilation and the sucking of water in the hopper, as it is the case in quick sands for instance. in [5], the pressure at the bottom of dense granular columns immersed in water at the onset of collapsing was found to be negative. of course the systems are different, but the same mechanism may apply. the experimental setup used in wilson et al, being 3d and opaque, does not allow to observe in details what happens during the transient, when the flow starts. hence, one can only speculate that probably dilation occurs, and probably it does affect the discharge rate. the shearing occurring when the flow starts (between the flow conduit and the surrounding static areas as illustrated in fig. 1) is massive as it involves the full height of the column. one thus expects a non negligible quantity of fluid to be sucked into the hopper, and as a result, a change in the structure of the packing to occur. this may be at the origin of the difference between very tall column and shorter columns described by wilson et al. the fluid sucked in shorter columns may be sufficient to affect the whole packing structure, while the structure of taller column may be partly preserved (the upper part for instance) so that the well-known dry granular behavior is preserved too. in other words, taller columns would still coincide with the discharge of dense granular flow, while shorter columns may imply the discharge of a dense suspension. this, of course, is only a supposition. iv. 
conclusion the experiment described in this paper is not easy to understand: the hopper configuration is complex in itself, and the fact that the system is immersed implies that we are at the frontier between 060010-2 papers in physics, vol. 6, art. 060010 (2014) / l. staron the physics of dense dry granular packings and dense suspensions. many interesting open questions remain, and certainly the work in progress by the same team will bring original new material to understand what happens during the discharge of granular flows be it dry or immersed. [1] t j wilson, c r pfeifer, n meysingier, d j durian, granular discharge rate for submerged hoppers, pap. phys. 6, 060009 (2014). [2] j e hilton, p w cleary, granular flow during hopper discharge, phys. rev. e 84, 011307 (2011). [3] l staron, p-y lagrée, s popinet, the granular silo as a continuum plastic flow: the hourglass vs the clepsydra, phys. fluids 24, 113303 (2012). [4] h a janssen, versuche uber getreidedruck in silozelen, zeitschr. vereines deutsch. ing. 39, 1045 (1895). [5] l rondon, o pouliquen, p aussillous, granular collapse in a fluid: role of the initial volume fraction, phys. fluids 23, 073301 (2011). 060010-3 papers in physics, vol. 6, art. 060005 (2014) received: 3 august 2014, accepted: 3 august 2014 edited by: g. martinez mekler licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060005 www.papersinphysics.org issn 1852-4249 reply to the commentary on “critical phenomena in the spreading of opinion consensus and disagreement” a. chacoma,1 d. h. zanette1, 2∗ 1. taking into account the reviewer’s concern (see ref. [1]), we have replaced the right panel of fig. 1 in our main article (see ref. [2]) by a plot showing the evolution of an entire 200-agent array. we hope that this dissipates the possible confusion pointed out by the reviewer. the caption and the main text have been modified accordingly. 2. indeed, the equivalence between the onedimensional (1d) versions of the voter model with nearest-neighbor interactions and of diffusionlimited binary annihilation (a + a → 0) has been recognized since the first studies of coarsening processes [3]. the boundaries separating domains with different opinions in the 1d voter model move as random walkers, which annihilate with each other when they meet during their motion. in view that, in the case of a linear array with pd = pc = 1, our model reduces to two mutually intercalated voter systems, the scaling laws of binary annihilation also apply to our results. it is well known, for instance, that the number of particles a(t) in 1d diffusionlimited annihilation decays with time as a ∼ t−1/2 [4]. this implies that, in a finite system, the time needed for complete annihilation of an initial (even) number of particles, a(0), goes as t ∼ a(0)2. in our ∗e-mail: zanette@cab.cnea.gov.ar 1 instituto balseiro and centro atómico bariloche, 8400 san carlos de bariloche, ŕıo negro, argentina. 2 consejo nacional de investigaciones cient́ıficas y técnicas, argentina. model, in turn, the initial number of boundaries between opinion domains can be seen to behave as b(0) ∼ n+(0)n−(0)n, where n±(0) is the initial fraction of agents with each opinion, and n is the system size. the above result indicates that the time needed for all the boundaries to disappear is t ∼ n2+(0)n2−(0)n2. in other words, as illustrated by the results shown in the lower panel of fig. 2, the product n−2t depends only on the initial concentration of each opinion. 
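The scaling quoted above can be checked with a minimal simulation of diffusion-limited annihilation (A + A → 0) on a ring, in which random walkers hop to neighboring sites and vanish in pairs when they meet. The sketch below is only an illustration of the t ∼ A(0)² law cited from Ref. [4]; it is not the code behind Fig. 2 of the main article, and the lattice size and hopping rules are assumptions.

```python
import random

def annihilation_time(n_sites, n_walkers, seed=None):
    """Sweeps needed for complete annihilation (A + A -> 0) of n_walkers
    (an even number) random walkers hopping on a ring of n_sites sites.
    Illustrative sketch only."""
    rng = random.Random(seed)
    occupied = set(rng.sample(range(n_sites), n_walkers))
    t = 0
    while occupied:
        t += 1
        for x in list(occupied):
            if x not in occupied:              # annihilated earlier in this sweep
                continue
            y = (x + rng.choice((-1, 1))) % n_sites
            occupied.discard(x)
            if y in occupied:                  # two walkers meet: both vanish
                occupied.discard(y)
            else:
                occupied.add(y)
    return t

# Averaged over many realizations, the time to empty the ring grows roughly
# as the square of the initial number of walkers, consistent with
# A(t) ~ t**(-1/2) and with the N**2 scaling discussed above.
```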
a similar argument makes it possible to show that the probability of reaching consensus pcons depends on n±(0), but it is independent of the system size, as shown in the upper panel of the same figure. on the other hand, the possible connection between the case pd, pc 6= 1 and branchingannihilation random walks, is less clear. as remarked by the reviewer, this connection should, in turn, establish a link with the universality class of directed percolation. however, the facts that our model exhibits multiple absorbing states and that there is no phase where fluctuations persist at asymptotically long times (as well as the absence —in the 1d case, where the connection is expected to hold— of nontrivial critical exponents) do not seem to suggest a relation to that universality class [5]. the point, nevertheless, is worth considering in future work. 3. certainly, as acknowledged in the paper’s final section, the most important direction in which our model should be extended is to consider more 060005-1 papers in physics, vol. 6, art. 060005 (2014) / a. chacoma et al. complex topologies, in particular, those that represent real-life social systems. the oneand twodimensional arrays studied in the paper are just a convenient —and probably the simplest— way of defining the groups of agents that participate in the opinion dynamics. it must be realized, however, that the existence of an underlying network of social contacts (either ordered or not) is not necessary to specify the structure of groups relevant to our class of models. in fact, the most general definition of the group structure is to provide a list of all the groups present in the population, enumerating the agents that belong to each group. this procedure encompasses all the possible partitions into groups of any given population —even those that cannot be represented by means of an underlying network [6]— and thus allows for the consideration of any degree of complexity compatible with the population size. the active group g and the reference group g′ involved in each interaction event can then be chosen —for instance, at random— from the list that specifies the group structure. note that, from this perspective, a network — whose topology is entirely defined by the list of all its links— is nothing but a structure formed by a set of two-agent groups. in this sense, the notion of group structure generalizes that of network, introducing a kind of higher-degree connection between population members [6, 7]. [1] f bagnoli, commentary on “critical phenomena in the spreading of opinion consensus and disagreement”, pap. phys. 6, 060004 (2014). [2] a chacoma, d h zanette, critical phenomena in the spreading of opinion consensus and disagreement, pap. phys. 6, 060003 (2014). [3] s redner, a guide to first-passage processes, cambridge university press, cambridge (2001). [4] a s mikhailov, a y loskutov, foundations of synergetics ii. chaos and noise, springer, berlin (1996). [5] h hinrichsen, nonequilibrium critical phenomena and phase transitions into absorbing states, adv.phys. 49, 815(2000). [6] d h zanette, beyond networks: opinion formation in triplet-based populations, phil. trans. r. soc. a 367, 3311 (2009). [7] d h zanette, a note on the consensus time of mean-field majority-rule dynamics, pap. phys. 1, 010002 (2009). 060005-2 papers in physics, vol. 7, art. 070010 (2015) received: 25 june 2015, accepted: 25 june 2015 edited by: a. 
vindigni licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070010 www.papersinphysics.org issn 1852-4249 commentary on “an intermediate state between the kagome-ice and the fully polarized state in dy2ti2o7” mauro perfetti1∗ the authors of ref. [1] provide experimental evidence on the existence of an intermediate state between the kagome-ice and the fully polarized state in dy2ti2o7 single crystals and try to support this finding with theoretical calculations. after recalling the state-of-the-art knowledge of these systems, the authors present ac susceptibility measurements and monte-carlo simulations tentatively performed to reproduce and better characterize this intermediate state. the spin-ice systems are a hot topic in molecular magnetism because they represent a class of model systems to study frustration not only experimentally, but also theoretically. the magnetic moments of rare earths in the pyrochlore derivatives ln2ti2o7 (ln=dy,ho) sit on a lattice of corner-shared tetrahedra and have ising character, as a result of the crystal field. in this geometry, the spin configuration “two in two out” minimizes the (partially free) magnetic charges, and thus the dipolar energy, within each tetrahedron. with this “ice rule”, it is impossible to fulfill simultaneously all the six ferromagnetic nearest-neighbor couplings among spins. therefore, frustration arises already at the level of a single tetrahedron, with a six-fold degenerate ground state. this degeneracy is enhanced at the bulk level, where it produces a macroscopic entropy at low temperature. for this rea∗e-mail:mauro.perfetti@unifi.it 1 dipartimento di chimica “u. schiff” and udr instm, universitá di firenze, via della lastruccia 3-13, sesto fiorentino(fi), italy. son, spin-ice systems possess a macroscopic number of quasi-degenerate ground states and a lowtemperature entropy very similar to the one predicted by pauling for protons in water. the study presented in this paper clearly evidences the occurrence of an intermediate state between the kagomeice and the fully polarized state. the specificheat measurements suggest that there is some disorder in this state. it would be interesting to investigate more in detail this spin-disorder by using a multi-technique approach, for example, using cantilever magnetometry on single crystals coupled with a sub-kelvin cooling system and with proper experimental setup. moreover, different degrees of disorder are expected to split differently the energy levels of rare earths: electronic paramagnetic resonance measurements could, in principle, provide useful information about the new energy levels structure and, indirectly, about the newly discovered spin-ice phase. alternatively, neutrons and muons could also be employed to extract useful information about the phonons and the local magnetic fields that are related to the spin structure of the system. the knowledge of the actual type of ordering of this novel intermediate state would be precious to deeply understand the role and the weight of the dipolar interaction in the spin hamiltonian used for the rationalization of the spin-ice behavior. indeed, the treatment that is commonly used for modeling these systems uses ewald summations to obtain an absolutely convergent effective dipole-dipole interaction between two spins, 070010-1 papers in physics, vol. 7, art. 070010 (2015) / m. perfetti but this method does not provide any evidence of the intermediate state. 
for this reason, a refinement of the model is mandatory. indeed, the authors attempt to reproduce the intermediate phase using the generalized dipolar spin-ice model, proposed in refs. [2, 3] that has successfully explained many other features of these materials. they perform monte-carlo simulations on a 2d ising lattice in which the ewald summations are used to handle dipole-dipole interactions among spin pairs. unfortunately, the intermediate phase observed in experiments is not encountered in the monte-carlo simulations, which points out the need to develop more sophisticated theoretical models. the study conducted on this pyrochlore derivative could be easily extended to ho2ti2o7 to investigate if this partially-disordered state is a common feature of the two systems. all the aforementioned suggestions could help achieve a complete scenario of the magnetic behavior of the pyrochlore spin-ice systems that is necessary for the development of a more accurate model. concluding, this study provides a further step in the understanding of the magnetic behavior of these systems. on one side, it reveals the presence of an intermediate state between the kagome-ice and the fully polarized state in dy2ti2o7 that was not detected so far. on the other side, it points out that the theory commonly used to describe the magnetic behavior of this compound should be refined in order to account for all the states/phases through which the system passes while the temperature is varied. [1] s a grigera, r a borzi, d g slobinsky, a s gibbs, r higashinaka, y maeno, t s grigera, an intermediate state between the kagome-ice and the fully polarized state in dy2ti2o7, papers in physics 7, 070009 (2015) [2] j p c ruff, r g melko, m j p gingras, finitetemperature transitions in dipolar spin ice in a large magnetic field, phys. rev. lett. 95, 097202 (2005). [3] t yavorskii, t fennell, m j p gingras, s t bramwell, dy2 ti2 o7 spin ice: a test case for emergent clusters in a frustrated magnet, phys. rev. lett. 101, 037204 (2008). 070010-2 papers in physics, vol. 5, art. 050008 (2013) received: 10 october 2013, accepted: 4 november 2013 edited by: s. a. grigera licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050008 www.papersinphysics.org issn 1852-4249 commentary on “graphite and its hidden superconductivity” e. m. forgan1∗ i write this comment on the article by esquinazi [1] as an expert on superconductivity but not as an expert on graphite. i should also mention that in 1986, a student asked me what i thought about a paper written by bednorz & muller in z. phys. after looking at it carefully, i commented that it represented measurements on a mixed-phase sample, which had a resistivity ∼1000 times that of copper at room temperature. the resistivity was increasing as the temperature was lowered, i.e., behaving in a non-metallic fashion. at ∼35 k, the resistivity began to fall, but had not become “zero” until ∼10 k. note that “zero” on the scale of the graph in the paper might just be the resistivity of copper at room temperature. hence i concluded that there was no proof of superconductivity (such as the meissner effect) and i highlighted the word “possible” in the title of the paper. however, other workers were more “gullible” and attempted to repeat and extend this work. it turned out that the phenomenon was very “democratic” and widely reproducible (unlike the equally surprising reports of “cold fusion” a few years later). 
so here i try to discuss whether the proposed superconductivity in graphite at elevated temperatures is real or not. one initial bibliographic comment may well be relevant: the papers reporting signs of superconductivity in graphite have a very restricted group of authors, suggesting that the phenomenon may ∗e-mail: ted.forgan@gmail.com 1 school of physics & astronomy, university of birmingham, u.k. not be “democratic”. some workers have become persuaded that the phenomenon is real, but they have not yet convinced a much wider audience, who probably feel that exceptional claims need exceptionally strong evidence. it is clear from the discussion in section i of the paper, and a review of the extensive literature, that graphite is a complicated and sometimes irreproducible material. this is partly due to the weak interlayer forces, which mean that it does not always stack in an ideal abab hexagonal pattern. in addition, after the discovery of single-layer graphene, we know that independent layers may exist with extremely high mobility, conducting only in the basal plane direction. even without this complication, graphite is a highly anisotropic material: this can easily cause difficulties in measuring transport properties, since the anisotropy in resistivity can give non-uniform current distributions. the effect of magnetic field on electron motion is also very anisotropic, with c-axis fields having strong effects on transport properties, and basal plane fields having almost no effect. furthermore, the diamagnetic susceptibility is strong, very anisotropic and temperature-dependent. this bulk property and many others, such as the de haas van alphen effect in large samples [2] have been understood in general terms [3] as a consequence of a semi-metallic band-structure [4] since ∼1960. i now turn to the various sections of the paper. in section ii, there is an account of strong magnetoresistance effects. similar effects have also been observed in bismuth [5] and have a very interesting explanation [5] in terms of the semi-metallic prop050008-1 papers in physics, vol. 5, art. 050008 (2013) / e. m. forgan erties of graphite and bismuth, so there is no need to propose a superconducting explanation for this. in section iii, a tiny hysteresis in magnetoresistance is described. two comments are relevant here: the author notes that the sign of the hysteresis is opposite to that expected for a superconductor and limits himself to stating that the data provide “striking hints that granular superconductivity is at work in some regions of these samples”. this is hardly definitive proof. section iv is headed “direct evidence for josephson behavior”. this quotes data from a recent publication from the author’s group [6]. it is worth noting that these measurements were made with very small currents, so that the limit of measurement value is in the ohms region or greater. in some cases [6], apparent negative resistance values were observed. this can easily occur (and has been observed by myself) in a layered material. the phenomenon arises from nonuniform current flow enhanced by the resistivity anisotropy, combined with voltage leads which effectively make contact at different positions along the c-axis of the sample. it seems likely that these curious results, and their current-dependence, arise from non-ideal connections of the voltage and/or current leads. 
other odd features of the results, such as sample-dependent noise at low temperatures, and the fact that magnetic fields could increase, decrease or have no effect on the voltages observed, also cast great doubt on the josephson interpretation. in section v, we have an account of some magnetic susceptibility measurements, such as those reported in [7] on graphite “doped” with water. the hysteresis loops reported in that paper correspond to a maximum signal only . 1% of the c-axis susceptibility of graphite. the value of this susceptibility, though relatively large, is < 0.001 (si dimensionless units). hence if the width of the hysteresis loop observed in these measurements corresponds to a meissner signal from superconductivity, then this supposed superconductivity occupies a volume fraction . 10−5. esquinazi et al. contend that this is consistent with superconductivity only present at somewhat ill-defined interfaces; however, it also means that one has to beware of artifacts. in response to [7], a colleague repeated their measurements as an undergraduate project [8]. their clear conclusion was that if the correct diamagnetic background slope (that obtained at large fields) is subtracted, then the hysteresis corresponds to a tiny ferromagnetic component. however, if a slightly different background is chosen, the hysteresis loops look somewhat like the response of a granular superconductor. however, for a granular superconductor the hysteresis peaks should lie away from the vertical axis in the bottom right/top left corners (see e.g. [9]) and this is contrary to what is observed in graphite. further evidence that this hysteresis is not due to superconductivity may be obtained from its temperature-dependence. we see in [7] that the hysteresis at 300 k is essentially the same as that at 5 k. we bear in mind that by assumption the superconductivity is confined to an atomic layer, and that the higher the tc of a superconductor the shorter the coherence length. these two together ensure that thermal fluctuations (which are already very noticeable at t ∼ 100 k in cuprate materials) would be huge for any room temperature graphite superconductivity [10]. thermal fluctuations would greatly reduce vortex pinning and magnetic irreversibility at room temperature, contrary to what is observed. on the other hand, a saturated ferromagnetic response would be almost temperature-independent for temperatures well below the curie point. there are further measurements [11] which appear to show magnetic hysteresis (as a function of direction of temperature sweep, not as a function of field) going to zero at 400 k. however, this temperature is where the sweep direction changes, so the hysteresis with temperature is by definition zero at 400 k. once again the differences in the magnetic signals are a tiny fraction of the total sample magnetization. there are many possible reasons (both real and due to experimental artifacts) why measurements on a sample taken on heating and cooling might disagree. hence, the rather complicated results summarized in esquinazi’s paper cannot confidently be ascribed to (as yet not understood) superconducting effects. i cannot give an overriding simple explanation for all the different results reported in equinazi’s paper, but neither can the author. 
in some cases this is because the proposed superconductivity is a “moving target”: sometimes with a tc ∼ 25 k, and sometimes tc well above room temperature; sometimes superconducting effects are suppressed by magnetic field and sometimes enhanced at high fields. in interpreting the evidence presented, the 050008-2 papers in physics, vol. 5, art. 050008 (2013) / e. m. forgan author has a tendency to jump to a superconducting interpretation, when others are perfectly possible. unless and until graphite samples can be produced which exhibit the meissner effect for a volume fraction of at least 1%, and which show direct evidence of quantum coherence (hysteresis which might arise from josephson networks or from other causes is not direct evidence), i expect that the scientific community at large will not accept that graphite exhibits high-temperature superconductivity. [1] p esquinazi, graphite and its hidden superconductivity, pap. phys. 5, 050007 (2013). [2] j w mcclure, band structure of graphite and de haas-van alphen effect, phys. rev. 108, 612 (1957). [3] j w mcclure, theory of diamagnetism of graphite, phys. rev. 119, 606 (1960). [4] j-c charlier, x gonze, j-p michenaud, firstprinciples study of the electronic properties of graphite, phys. rev. b 43, 4579 (1991). [5] x du, s w tsai, d l maslov, a f hebard, metal-insulator-like behavior in semimetallic bismuth and graphite, phys. rev. lett. 94, 166601 (2005). [6] a ballestar, j barzola-quiquia, t scheike, p esquinazi, josephson-coupled superconducting regions embedded at the interfaces of highly oriented pyrolytic graphite, new j. phys. 15, 023024 (2013). [7] t scheike, w bhlmann, p esquinazi, j barzola-quiquia, a ballestar, a setzer, can doping graphite trigger room temperature superconductivity? evidence for granular hightemperature superconductivity in water-treated graphite powder, adv. mater. 24, 5826 (2012). [8] m robson, p diwell (unpublished). supervised by e blackburn, school of physics & astronomy, university of birmingham, u.k. (2012). [9] s senoussi, c aguillon, s hadjoudj, the contribution of the intergrain currents to the low field hysteresis cycle of granular superconductors and the connection with the microand macrostructures, physica c 175, 215 (1991). [10] a gurevich, challenges and opportunities for applications on unconventional superconductors, annu. rev. cond. matter phys., in press. [11] t scheike, p esquinazi, a setzer, w böhlmann, granular superconductivity at room temperature in bulk highly oriented pyrolytic graphite samples, carbon 59, 140 (2013). 050008-3 papers in physics, vol. 2, art. 020001 (2010) received: 13 july 2009, revised: 29 december 2009, accepted: 7 february 2010 edited by: s. a. cannas reviewed by: p. netz, inst. de qúımica, univ. federal do rio grande do sul, brazil licence: creative commons attribution 3.0 doi: 10.4279/pip.020001 www.papersinphysics.org issn 1852-4249 structural and dynamic properties of spc/e water m. g. campo,1∗ i have investigated the structural and dynamic properties of water by performing a series of molecular dynamic simulations in the range of temperatures from 213 k to 360 k, using the simple point charge-extended (spc/e) model. i performed isobaric-isothermal simulations (1 bar) of 1185 water molecules using the gromacs package. 
I quantified the structural properties using the oxygen-oxygen radial distribution functions, order parameters, and the hydrogen bond distribution functions, whereas, to analyze the dynamic properties, I studied the behavior of the history-dependent bond correlation functions and the non-Gaussian parameter α₂(t) of the mean square displacement of water molecules. When the temperature decreases, the translational (τ) and orientational (q) order parameters are linearly correlated, and both increase, indicating an increasing structural order in the systems. The probability of occurrence of four hydrogen bonds and q both have a reciprocal dependence on T, though the analysis of the hydrogen bond distributions describes the changes in the dynamics and structure of water more reliably. Thus, an increase in the caging effect and the occurrence of long-lived hydrogen bonds occur below ∼293 K, the range of temperatures in which a four-hydrogen-bond structure predominates in the system.

∗E-mail: mario@exactas.unlpam.edu.ar
1 Universidad Nacional de La Pampa, Facultad de Ciencias Exactas y Naturales, Uruguay 151, 6300 Santa Rosa, La Pampa, Argentina.

I. Introduction

Water is the subject of numerous studies due to its biological significance and its universal presence [1-3]. The thermodynamic behavior of water presents important differences compared with that of other substances, and many of the characteristics of such behavior are often attributed to the existence of hydrogen bonds between water molecules. Scientists have found that the water structure produced by the hydrogen bonds is peculiar compared to that of other liquids. Hence, advances in the knowledge of hydrogen bond behavior are crucial to understanding water properties.

The method of molecular dynamics (MD) allows one to analyze the structure and dynamics of water at the microscopic level and hence to complement experimental techniques in which these properties can be interpreted only in a qualitative way (infra-red absorption and Raman scattering [4], depolarized light scattering [5, 6], neutron scattering [7], femtosecond spectroscopy [8-11] and other techniques [12-14]). Among the usual methods to study short-range order in MD simulations of water are the calculation of radial distribution functions, hydrogen bond distributions and order parameters. The orientational order parameter q measures the tendency of the system to adopt a tetrahedral configuration, considering the water oxygen atoms as vertices of a tetrahedron, whereas the translational order parameter τ quantifies the deviation of the pair correlation function from the uniform value of unity seen in an ideal gas [15, 16]. The order parameters are used to construct an order map, in which different states of a system are mapped onto a τ-q plane. The order parameters are, in general, independent, but they are linearly correlated in the region in which water behaves anomalously [17]. The dynamics of water can be characterized by the bond lifetime, τ_HB, associated with the process of rupture and formation of hydrogen bonds between water molecules, which occurs on a very short time scale [9, 18, 19, 21-23]. τ_HB is obtained in MD using the history-dependent bond correlation function P(t), which represents the probability that a hydrogen bond formed at time t = 0 remains continuously unbroken and breaks at time t [24, 25].
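In practice, P(t) can be estimated by histogramming the lifetimes of individual hydrogen bonds, i.e., the lengths of the intervals during which a given donor-acceptor pair remains continuously bonded. The fragment below is a minimal illustration of that bookkeeping; the input format (a boolean bonded/not-bonded time series per pair) is an assumption made for the example and this is not the author's analysis code.

```python
import numpy as np

def bond_lifetime_distribution(bonded, dt, bins=50):
    """Estimate the history-dependent bond correlation function P(t).

    bonded : boolean array of shape (n_pairs, n_frames); bonded[i, k] is True
             if donor-acceptor pair i satisfies the hydrogen-bond criterion in
             frame k (frames are separated by dt).
    Returns (t, P): bin centres and the normalized distribution of lifetimes
    of bonds that form and later break within the trajectory.
    """
    lifetimes = []
    for row in bonded:
        run = 0
        for flag in row:
            if flag:
                run += 1
            elif run:                      # the bond has just broken
                lifetimes.append(run * dt)
                run = 0
        # bonds still intact at the end of the trajectory are discarded
    counts, edges = np.histogram(lifetimes, bins=bins, density=True)
    return 0.5 * (edges[:-1] + edges[1:]), counts
```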
Also, the dynamics of water can be studied by analyzing the mean-square displacement time series M(t). In addition to the diffusion coefficient calculation at long times, for which M(t) ∝ t, in the supercooled region of temperatures and at intermediate times M(t) ∝ t^α (0 < α < 1). This behavior of M(t) is associated with the subdiffusive movement of the water molecules, caused by the caging effect in which a water molecule is temporarily trapped by its neighbors and then moves in short bursts due to nearby cooperative motion. A time t* characterizes this caging effect (see Sec. II for more details) [26, 27]. In a previous work, we found a q-exponential behavior of P(t), in which q increases with T⁻¹ approximately below 300 K. q(T) is also correlated with the probability of occurrence of four hydrogen bonds and with the subdiffusive motion of the water molecules [28]. The relationship between the dynamic and structural properties of water has not been clearly established to date. In this paper, I explore whether the effect that temperature has on the water dynamics reflects a more general connection between the structure and the dynamics of this substance.

II. Theory and method

I have performed molecular dynamics simulations of the SPC/E water model using the GROMACS package [29, 30], simulating fourteen similar systems of 1185 molecules at 1 bar of pressure in a range of temperatures from 213 K to 360 K. I initialized the system at 360 K using a random configuration of water molecules, assigning velocities to the molecules according to a Boltzmann distribution at this temperature. For stabilization, I applied Berendsen thermal and hydrostatic baths at the same temperature and 1 bar of pressure [31]. Then, I ran an additional MD, obtaining an isobaric-isothermal ensemble. I obtained the other systems by a similar procedure, but using as initial configuration that of the system at the preceding higher temperature and cooling it at the slow rate of 30 K ns⁻¹ [17]. Stabilization and sampling periods for the systems at different temperatures are indicated in Table 1. Simulation and sampling time steps were 2 fs and 10 fs, respectively. The sampling time step was shorter than the typical time during which a hydrogen bond can be destroyed by libration movements.

Table 1: Details of the simulation procedure: duration of the stabilization period (t_est) and of the MD sampling (t_MD) in the different ranges of temperatures.

  Temp. range (K)    t_est (ns)    t_MD (ns)
  213-243            20.0          10.0
  253-273            16.0          10.0
  283-360            16.0           8.0

I calculated the hydrogen bond distribution functions f(n) (n = 0, 1, ..., 5), i.e., the probability of occurrence of n hydrogen bonds per molecule, considering a geometric definition of the hydrogen bond [20]. As parameters for this calculation, I used a maximum distance between oxygen atoms of 3.5 Å and a minimum angle between the atoms O_donor-H-O_acceptor of 145°. The radial distribution function (RDF) is a standard tool used in experiments, theories, and simulations to characterize the structure of condensed matter. Using RDFs, I obtained the average number, N, of water molecules in the first hydration layer (the hydration number),

$N = 4\pi\rho \int_0^{r_{\min}} g(r)\, r^2\, dr$   (1)

where ρ is the number density.
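As an illustration of Eq. (1), the hydration number can be obtained by direct numerical quadrature of a tabulated oxygen-oxygen RDF up to its first minimum. The short sketch below is only meant to make the definition concrete; the grid, the units and the example numbers in the final comment are assumptions, not values taken from this work.

```python
import numpy as np

def hydration_number(r, g, rho, r_min):
    """Hydration number of Eq. (1): N = 4*pi*rho * integral_0^{r_min} g(r) r^2 dr.

    r, g  : oxygen-oxygen radial distribution function on a uniform grid
    rho   : number density of water molecules (consistent with the unit of r)
    r_min : position of the first minimum of g(r)
    """
    mask = r <= r_min
    return 4.0 * np.pi * rho * np.trapz(g[mask] * r[mask] ** 2, r[mask])

# With rho ~ 33.4 nm^-3 and r_min ~ 0.33 nm (rough ambient-water values),
# this integral gives N in the range ~4-5 for liquid water.
```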
The translational order parameter, τ, is defined in Ref. [16] as

$\tau \equiv \int_0^{s_c} |g(s) - 1|\, ds$   (2)

where the dimensionless variable s ≡ rρ^{1/3} is the radial distance r scaled by the mean intermolecular distance ρ^{-1/3}, and s_c corresponds to half of the simulation box size. The orientational order parameter q is defined as [15]

$q = \left\langle 1 - \frac{3}{8} \sum_{j=1}^{3} \sum_{k=j+1}^{4} \left( \cos\theta_{jik} + \frac{1}{3} \right)^2 \right\rangle$   (3)

where θ_{jik} is the angle formed by the atoms O_j-O_i-O_k; here O_i is the reference oxygen atom, O_j and O_k are two of its four nearest neighbors, and the angular brackets denote an average over all molecules. q = 1 in an ideal configuration in which the oxygen atoms are located at the vertices of a tetrahedron.

I obtained the bond correlation function P(t) from the simulations by building a histogram of the hydrogen bond lifetimes for each configuration. Then, I fitted this function with a Tsallis distribution of the form

$\exp_q(t) = \left[ 1 + (1 - q)\, t \right]^{1/(1-q)}$   (4)

with t the hydrogen bond lifetime and q the nonextensivity parameter [28, 32]. If q = 1, Eq. (4) reduces to an exponential, whereas if q > 1, P(t) decays more slowly than an exponential. This last behavior occurs when long-lasting hydrogen bonds increase their frequency of occurrence.

The subdiffusive movement of water occurs when the displacement of the molecules obeys non-Gaussian statistics. This behavior is characterized by t*, the time at which the non-Gaussian parameter α₂(t) reaches a maximum [see Eq. (5)]. Thus, t* is the parameter associated with the average time during which a water molecule is trapped by its environment (caging effect), which prevents it from reaching the diffusive regime [26, 27]:

$\alpha_2(t) = \frac{3 \langle r^4(t) \rangle}{5 \langle r^2(t) \rangle^2} - 1$   (5)

III. Results and discussion

Three zones or ranges of temperatures can be distinguished in the graph of the hydrogen bond distributions f(n) vs. T (see Fig. 1): zone A (T > 350 K), in which f(3) > f(2) > f(4); zone B (293 K < T < 350 K), in which f(3) > f(4) > f(2); and zone C (T < 293 K), in which f(4) > f(3) > f(2). These results indicate a predominant structure of three and two hydrogen bonds (3HB-2HB) in zone A, 3HB-4HB in zone B, and 4HB-3HB in zone C, respectively. f(4) ∝ T⁻¹ over the whole range of temperatures, showing that the tetrahedral structure of water decreases as this variable increases. f(3) increases with T up to 293 K, and then remains approximately constant (∼0.4) up to 360 K. f(2) also increases with T over the whole range of temperatures, but only overcomes f(4) at T > 350 K.

[Figure 1: Hydrogen bond distribution functions f(n) (n = 0, ..., 5) versus T. The zones A, B and C correspond to ranges of temperatures with different relationships between f(4), f(3) and f(2). Note the reciprocal scale for the temperatures. See the text for details.]

Fig. 2 shows the oxygen-oxygen RDFs corresponding to the systems at 213 K, 293 K and 360 K. When the temperature decreases, the minima and maxima become better defined, which is associated with an increasing order in the system. The position of the first minimum moves closer to the origin, decreasing the size of the first hydration layer (∝ T⁴, see Fig. 3). Both facts can be associated with the decrease of the hydration number from N ∼ 5 to N ∼ 4 (see inset, Fig. 2).

[Figure 2: Oxygen-oxygen radial distribution functions for the systems at 213 K (continuous line), 293 K (dashed line), and 360 K (dotted line). Inset: the hydration number N vs. T⁵.]

[Figure 3: Position of the first minimum of the oxygen-oxygen radial distribution function vs. T⁴, associated with the size of the first hydration layer.]
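Before turning to the order map, it may help to make the definitions of Sec. II concrete. The sketch below evaluates the orientational order parameter of Eq. (3) for one reference oxygen from the positions of its four nearest neighbors, and the non-Gaussian parameter of Eq. (5) from an ensemble of displacements. It is a minimal illustration under the stated definitions, not the analysis code used for this work.

```python
import numpy as np

def tetrahedral_q(o_ref, neighbors):
    """Orientational (tetrahedral) order parameter of Eq. (3) for one molecule.

    o_ref     : (3,) position of the reference oxygen O_i
    neighbors : (4, 3) positions of its four nearest oxygen neighbors
    Returns q_i = 1 - (3/8) * sum_{j<k} (cos(theta_jik) + 1/3)**2 ;
    the order parameter q of the text is the average of q_i over all molecules.
    """
    bonds = np.asarray(neighbors, float) - np.asarray(o_ref, float)
    bonds /= np.linalg.norm(bonds, axis=1, keepdims=True)
    s = 0.0
    for j in range(3):
        for k in range(j + 1, 4):
            s += (np.dot(bonds[j], bonds[k]) + 1.0 / 3.0) ** 2
    return 1.0 - 3.0 / 8.0 * s

def non_gaussian_alpha2(displacements):
    """Non-Gaussian parameter of Eq. (5), alpha2 = 3<r^4>/(5<r^2>^2) - 1,
    from molecular displacements (shape (n_molecules, 3)) over a lag time t."""
    r2 = np.sum(np.asarray(displacements, float) ** 2, axis=1)
    return 3.0 * np.mean(r2 ** 2) / (5.0 * np.mean(r2) ** 2) - 1.0
```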
the simultaneous behavior of q and τ is shown in the order map of fig. 4, in which the location of the values corresponding to 293 k is indicated by an arrow. the order parameters present similar behaviors with temperature. upon cooling, these parameters are linearly correlated and move in the order map along a line of increasing values, reaching their maxima at 213 k. the slope of the line increases slightly for t > 293 k, indicating that in this range of temperatures τ responds to the increase of t somewhat more strongly than q. the positive values of the slopes indicate an increasing order of the system as the temperature decreases.

figure 4: order map with the values of the order parameters corresponding to the simulated systems. note the change in the slope of the line at t ∼ 273 k.

the f(n) functions allow one to obtain a more detailed picture of the short-range orientational changes in the structure between water molecules than the orientational order parameter does. while only a small change occurs in the order map at 293 k, the structures of two, three and four hydrogen bonds alternate in importance when the temperature changes. the ability of f(n) to describe the structure of water more reliably arises because the calculation of the hydrogen bond distributions includes the location of the hydrogen atoms, whereas q only quantifies the changes in the average angle between neighboring oxygen atoms. although the behaviors of f(4) and q are correlated (see fig. 5), f(4) shows a greater response to the temperature than q, indicating that the change in the tetrahedral structure upon cooling occurs mainly in the orientation of the bonds between water molecules. the approximately linear correlation between both variables also indicates a similar dependence on the temperature (∝ t^{-1}).

figure 5: q vs. f(4). the change in f(4) is larger than that of q in the range of temperatures studied. see the text for details.

figure 6 shows the behavior of the dynamical parameters t* and the nonextensivity parameter q as functions of the orientational order parameter q. the characteristic time t* responds exponentially to q for q ≥ 0.58 (t ≤ 360 k), but the slope of the semilog plot of t* vs. q increases significantly for q ≥ 0.67. a similar change occurs at q ≥ 0.67 in the linear correlation between the nonextensivity parameter and the order parameter. thus, the values q ≈ 0.67 and τ ≈ 1.1 in the order map can be associated with changes in the dynamics of the system. the transition of the nonextensivity parameter from q ≈ 1 to q > 1 indicates an increase in the probability of two water molecules remaining bonded by a hydrogen bond for an unusually long time, whereas the increase of t* is associated with the increase of the time during which the molecules remain in the subdiffusive regime. however, only the analysis of the f(n) functions reveals the structural modification that explains the structural and dynamic changes in the system. the changes in the slope of the order map and in t*(q) and q(q) occur below 293 k, in the range of temperatures in which a structure of four hydrogen bonds prevails in the system.

iv. conclusions

the molecular dynamics method allows one to study the structure and dynamics of the spc/e model of water in the range of temperatures from 213 k to 360 k. upon lowering the temperature of the system from 360 k to 213 k, the number of water molecules in the first hydration layer decreases from n ∼ 5 to n ∼ 4, along with a decrease in the size of that layer. the increase of the tetrahedral structure of the system is also characterized by a growth of the percentage
figure 6: (a) semilog plot of t* vs. q. (b) q vs. q.
see the text for details. of occurrence of four hydrogen bonds and the orientational order parameter q. however, only the analysis of the behavior of the hydrogen bond distribution allows to deduce that, when a tetrahedral structure associated to the percentage of four hydrogen bonds predominates, the behavior of the dynamical variables p (t) and t∗ show the occurrence of long lasting hydrogen bonds and caging effect between the molecules of the system. acknowledgements i am grateful for the financial support by picto unlpam 2005 30807 and facultad de ciencias exactas y naturales (unlpam). [1] d eisenberg, w kauzmann, the structure and properties of water, clarendon press, oxford (1969). 020001-5 papers in physics, vol. 2, art. 020001 (2010) / m. g. campo [2] f h stillinger, water revisited, science 209, 451 (1980). [3] o mishima, h e stanley, the relationship between liquid, supercooled and glassy mater, nature 396, 329 (1998). [4] e w castner, y j chang, y c chu, g e walrafen, the intermolecular dynamics of liquid water, j. chem. phys. 102, 653 (1995). [5] c j montrose, j a bucaro, j marshallcoakley, t a litovitz, depolarized rayleigh scattering and hydrogen bonding in liquid water, j. chem. phys. 60, 5025 (1974). [6] w danninger, g zundel, intense depolarized rayleigh scattering in raman spectra of acids caused by large proton polarizabilities of hydrogen bonds, j. chem. phys. 74, 2779 (1981). [7] j yeixeira, s h chen, m-c bellissent-funel, molecular dynamics of liquid water probed by neutron scattering, j. mol. liq., 48, 111 (1991). [8] r laenen, c rauscher, a laubereau, dynamics of local substructures in water observed by ultrafast infrared hole burning, phys. rev. lett. 80, 2622 (1998). [9] s woutersen, u emmerichs, h j bakker, femtosecond mid-ir pump-probe spectroscopy of liquid water: evidence for a two-component structure, science 278, 658 (1997). [10] h k nienhuys, s woutersen, r a van santen, h j bakker, mechanism for vibrational relaxation in water investigated by femtosecond infrared spectroscopy, j. chem. phys. 111, 1494, (1999). [11] g m gale, g gallot, f hache, n lascoux, s bratos, j c leickman, femtosecond dynamics of hydrogen bonds in liquid water a real time study, phys. rev. lett. 82, 1068, (1999). [12] a h narten, m d danford, h a levy, x-ray diffraction study of liquid water in the temperature range 4 �– 200 �, faraday discuss. 43, 97 (1967). [13] a k soper, f bruni, m a ricci, site-site pair correlation functions of water from 25 to 400 �: revised analysis of new and old diffraction data, j. chem. phys. 106, 247 (1997). [14] k modig, b g pfrommer, b halle, temperature-dependent hydrogen-bond geometry in liquid water, phys. rev. lett. 90, 075502 (2003). [15] h tanaka, simple physical explanation of the unusual thermodynamic behavior of liquid water, phys. rev. lett. 80, 5750 (1998). [16] j r errington, p g debenedetti, relationship between structural order and the anomalies of liquid water, nature 409, 318 (2001). [17] n giovambattista, p g debenedetti, f sciortino, h e stanley, structural order in glassy water, phys. rev. e 71, 061505 (2005). [18] c a angell, v rodgers, near infrared spectra and the disrupted network model of normal and supercooled water, j. chem. phys. 80, 6245 (1984). [19] j d cruzan, l b braly, k liu, m g brown, j g loeser, r j saykally, quantifying hydrogen bond coopertivity in water: vrt spectroscopy of the water tetramer, science 271, 59 (1996). [20] d. c. rapaport, hydrogen bonds in water, mol. phys. 50, 1151 (1983). 
[21] a luzar, resolving the hydrogen bond dynamics conundrum, j. chem. phys. 113, 10663 (2000). [22] f mallamace, m broccio, c corsaro, a faraone, u wandrlingh, l liu, c mou, s h chen, the fragile-to-strong dynamics crossover transition in confined water: nuclear magnetic resonance results, j. chem. phys. 124, 124 (2006). [23] c j montrose, j a búcaro, j marshallcoakley, t a litovitz, depolarizated lightscattering and hydrogen bonding in liquid water, j. chem. phys. 60, 5025 (1974). [24] f w starr, j k nielsen, h e stanley, fast and slow dynamics of hydrogen bonds in liquid water, phys. rev. lett. 82, 2294 (1999). 020001-6 papers in physics, vol. 2, art. 020001 (2010) / m. g. campo [25] f w starr, j k nielsen, h e stanley, hydrogen-bond dynamics for the extended simple point charge model of water, phys. rev. e. 62, 579 (2000). [26] s chatterjee, p g debenedetti, f h stillinger, r m lynden-bell, a computational investigation of thermodynamics, structure, dynamics and solvation behavior in modified water models, j. chem. phys. 128, 124511 (2008). [27] m g mazza, n giovambattista, h e stanley, f w starr, connection of translational and rotational dynamical heterogeneities with the breakdown of the stokes-einstein and stokeseinstein-debye relations in water, phys. rev. e 76, 031203 (2007). [28] m g campo, g l ferri, g b roston, qexponential distribution in time correlation function of water hydrogen bonds, braz. j. phys. 39, 439 (2009). [29] h j c berendsen, d van der spoel, r v drunen, gromacs: a message passing parallel molecular dynamics implementation, comp. phys. comm. 91, 43 (1995). [30] h j c berendsen, j r grigera, t p straatsma, the missing term in effective pair potentials, j. phys. chem. 91, 6269 (1987). [31] h j c berendsen, j postma, w van gunsteren, a di nola, j haak, molecular dynamics with coupling to an external bath, j. chem. phys. 81, 3684 (1984). [32] c tsallis, possible generalization of boltzmann-gibbs statistics, j. stat. phys. 52, 479 (1988). [33] l a báez, p clancy, existence of a density maximum in extended simple point charge water, j. chem. phys. 101, 9837 (1994). 020001-7 papers in physics, vol. 5, art. 050004 (2013) received: 2 march 2013, accepted: 5 june 2013 edited by: g. mindlin licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050004 www.papersinphysics.org issn 1852-4249 revisiting the two-mass model of the vocal folds m. f. assaneo,1∗ m. a. trevisan1† realistic mathematical modeling of voice production has been recently boosted by applications to different fields like bioprosthetics, quality speech synthesis and pathological diagnosis. in this work, we revisit a two-mass model of the vocal folds that includes accurate fluid mechanics for the air passage through the folds and nonlinear properties of the tissue. we present the bifurcation diagram for such a system, focusing on the dynamical properties of two regimes of interest: the onset of oscillations and the normal phonation regime. we also show theoretical support to the nonlinear nature of the elastic properties of the folds tissue by comparing theoretical isofrequency curves with reported experimental data. i. introduction in the last decades, a lot of effort was devoted to develop a mathematical model for voice production. the first steps were made by ishizaka and flanagan [1], approximating each vocal fold by two coupled oscillators, which provide the basis of the well known two-mass model. 
this simple model reproduces many essential features of the voice production, like the onset of self sustained oscillation of the folds and the shape of the glottal pulses. early analytical treatments were restricted to small amplitude oscillations, allowing a dimensional reduction of the problem. in particular, a two dimensional approximation known as the flapping model was widely adopted by the scientific community, based on the assumption of a transversal wave propagating along the vocal folds [2, 3]. moreover, this model was also used to successfully ∗e-mail: florencia@df.uba.ar †e-mail: marcos@df.uba.ar 1 laboratorio de sistemas dinámicos, depto. de f́ısica, fcen, universidad de buenos aires. pabellón i, ciudad universitaria, 1428ega buenos aires, argentina. explain most of the features present in birdsong [4, 5]. faithful modeling of the vocal folds has recently found new challenges: realistic articulatory speech synthesis [6–8], diagnosis of pathological behavior of the folds [9, 10] and bioprosthetic applications [11]. within this framework, the 4-dimensional two-mass model was revisited and modified. two main improvements are worth noting: a realistic description of the vocal fold collision [13,14] and an accurate fluid mechanical description of the glottal flow, allowing a proper treatment of the hydrodynamical force acting on the folds [8, 15]. in this work, we revisit the two-mass model developed by lucero and koenig [7]. this choice represents a good compromise between mathematical simplicity and diversity of physical phenomena acting on the vocal folds, including the main mechanical and fluid effects that are partially found in other models [13, 15]. it was also successfully used to reproduce experimental temporal patterns of glottal airflow. here, we extend the analytical study of this system: we present a bifurcation diagram, explore the dynamical aspects of the oscillations at the onset and normal phonation and study the 050004-1 papers in physics, vol. 5, art. 050004 (2013) / m. f. assaneo et al. isofrequency curves of the model. this work is organized as follows: in the second section, we describe the model. in the third section, we present the bifurcation diagram, compare our solutions with those of the flapping model approximation and analyze the isofrecuency curves. in the fourth and last section, we discuss our results. ii. the model each vocal fold is modeled as two coupled damped oscillators, as sketched in fig. 1. figure 1: sketch of the two-mass model of the vocal folds. each fold is represented by masses m1 and m2 coupled to each other by a restitution force kc and to the laryngeal walls by k1 and k2 (and dampings b1 and b2), respectively. the displacement of each mass from the resting position x0 is represented by x1 and x2. the different aerodynamic pressures p acting on the folds are described in the text. assuming symmetry with respect to the saggital plane, the left and right mass systems are identical (fig. 1) and the equation of motion for each mass reads ẋi = yi (1) ẏi = 1 mi [fi −ki(xi) −bi(xi,yi) −kc(xi −xj)] , for i,j = 1 or 2 for lower and upper masses, respectively. k and b represent the restitution and damping of the folds tissue, f the hydrodynamic force, m is the mass and kc the coupling stiffness. the horizontal displacement from the rest position x0 is represented by x. we use a cubic polynomial for the restitution term [eq. (2)], adapted from [1, 7]. the term with a derivable step-like function θ [eq. 
(3)] accounts for the increase in the stiffness introduced by the collision of the folds. the restitution force reads

k_i(x_i) = k_i x_i \left(1 + 100\, x_i^2\right) + \theta\!\left(\frac{x_i + x_0}{x_0}\right) 3 k_i (x_i + x_0)\left[1 + 500\,(x_i + x_0)^2\right],    (2)

with

\theta(x) = \begin{cases} 0 & \text{if } x \le 0 \\ \dfrac{x^2}{8\cdot 10^{-4} + x^2} & \text{if } x > 0, \end{cases}    (3)

where x_0 is the rest position of the folds. for the damping force, we have adapted the expression proposed in [7], making it derivable, arriving at the following equation:

b_i(x_i, y_i) = \left[1 + \theta\!\left(\frac{x_i + x_0}{x_0}\right)\frac{1}{\epsilon_i}\right] r_i \left(1 + 850\, x_i^2\right) y_i,    (4)

where r_i = 2\epsilon_i\sqrt{k_i m_i} and ε_i is the damping ratio.

in order to describe the hydrodynamic force that the airflow exerts on the vocal folds, we have adopted the standard assumption of small inertia of the glottal air column and the boundary-layer model developed in [7, 11, 15]. this model assumes a one-dimensional, quasi-steady, incompressible airflow from the trachea to a separation point. at this point, the flow separates from the tissue surface to form a free jet where turbulence dissipates the airflow energy. it has been experimentally shown that the position of this point depends on the glottal profile. as described in [15], the separation point, located at the glottal exit, shifts down to the boundary between masses m1 and m2 when the profile of the folds becomes more divergent than a threshold [eq. (7)]. viscous losses are modeled according to a two-dimensional poiseuille flow [eqs. (6) and (7)]. the equations for the pressure inside the glottis are

p_{in} = p_s + \frac{\rho\, u_g^2}{2 a_1^2},    (5)

p_{12} = p_{in} - \frac{12\, \mu\, u_g\, d_1\, l_g^2}{a_1^3},    (6)

p_{21} = \begin{cases} \dfrac{12\, \mu\, u_g\, d_2\, l_g^2}{a_2^3} + p_{out} & \text{if } a_2 > k_s a_1 \\ 0 & \text{if } a_2 \le k_s a_1, \end{cases}    (7)

p_{out} = 0.    (8)

as sketched in fig. 1, the pressures exerted by the airflow are: p_in at the entrance of the glottis, p_12 at the upper edge of m1, p_21 at the lower edge of m2, p_out at the entrance of the vocal tract, and p_s the subglottal pressure. the width of the folds (in the plane normal to fig. 1) is l_g; d_1 and d_2 are the lengths of the lower and upper masses, respectively. a_i are the cross sections of the glottis, a_i = 2 l_g (x_i + x_0); µ and ρ are the viscosity and the density of the air; u_g is the airflow inside the glottis, and k_s = 1.2 is an experimental coefficient. we also assume no losses at the glottal entrance [eq. (5)] and zero pressure at the entrance of the vocal tract [eq. (8)]. the hydrodynamic force acting on each mass reads

f_1 = \begin{cases} d_1 l_g p_s & \text{if } x_1 \le -x_0 \text{ or } x_2 \le -x_0 \\ \dfrac{p_{in} + p_{12}}{2} & \text{otherwise,} \end{cases}    (9)

f_2 = \begin{cases} d_2 l_g p_s & \text{if } x_1 > -x_0 \text{ and } x_2 \le -x_0 \\ 0 & \text{if } x_1 \le -x_0 \\ \dfrac{p_{21} + p_{out}}{2} & \text{otherwise.} \end{cases}    (10)

following [1, 7, 10], these functions represent opening, partial closure and total closure of the glottis. throughout this work, the piecewise functions p_21, f_1 and f_2 are modeled using the derivable step-like function θ defined in eq. (3).

iii. analysis of the model

i. bifurcation diagram

the main anatomical parameters that can be actively controlled during vocalization are the subglottal pressure p_s and the tension of the folds, controlled by the laryngeal muscles. in particular, the action of the thyroarytenoid and cricothyroid muscles controls the thickness and the stiffness of the folds. following [1], this effect is modeled by a parameter q that scales the mechanical properties of the folds through a cord-tension parameter: k_c = q k_{c0}, k_i = q k_{i0} and m_i = m_{i0}/q. we therefore computed a bifurcation diagram using these two standard control parameters, p_s and q. five main regions of different dynamic solutions are shown in fig. 2.
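as a reading aid for eqs. (2)-(4), the following python sketch transcribes the tissue restitution and damping forces; it is not the authors' implementation. the parameter values in the example are those listed later in the caption of fig. 2, and the choice of a consistent unit system is left to the reader.

    import numpy as np

    def theta(x):
        # smooth step of eq. (3): 0 for x <= 0, x^2 / (8e-4 + x^2) for x > 0
        return np.where(x > 0.0, x ** 2 / (8e-4 + x ** 2), 0.0)

    def restitution(x, k, x0):
        # cubic restitution of eq. (2); the second term switches on when the fold
        # reaches the collision plane at x = -x0
        return (k * x * (1.0 + 100.0 * x ** 2)
                + theta((x + x0) / x0) * 3.0 * k * (x + x0) * (1.0 + 500.0 * (x + x0) ** 2))

    def damping(x, y, k, m, eps, x0):
        # damping force of eq. (4), with r = 2 * eps * sqrt(k * m)
        r = 2.0 * eps * np.sqrt(k * m)
        return (1.0 + theta((x + x0) / x0) / eps) * r * (1.0 + 850.0 * x ** 2) * y

    # illustrative evaluation for the lower mass (m1 = 0.125, k1 = 80, eps1 = 0.1, x0 = 0.02)
    x = np.linspace(-0.03, 0.03, 5)
    print(restitution(x, k=80.0, x0=0.02))
    print(damping(x, y=1.0, k=80.0, m=0.125, eps=0.1, x0=0.02))

in eq. (1) of the model, these two forces enter the right-hand side together with the hydrodynamic force f_i and the coupling term k_c(x_i - x_j).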
at low pressure values (region i), the system presents a stable fixed point. reaching region ii, the fixed point becomes unstable and there appears an attracting limit cycle. at the interface between regions i and ii, three bifurcations occur in a narrow range of subglottal pressure (fig. 3, left panel), all along the q axis. the right panel of fig. 3 shows the oscillation amplitude of x2. at point a, oscillations are born in a supercritical hopf bifurcation. the amplitude grows continuously for increasing ps until point b, where it jumps to the upper branch. if the pressure is then decreased, the oscillations persist even for lower pressure values than the onset in a. when point c is reached, the oscillations suddenly stop and the system returns to the rest position. this onset-offset oscillation hysteresis was already reported experimentally in [12]. the branch ab depends on the viscosity. decreasing µ, points a and b approach to each other until they collide at µ = 0, recovering the result reported in [3, 10, 14], where the oscillations occur as the combination of a subcritical hopf bifurcation and a cyclic fold bifurcation. on the other hand, the branch bc depends on the separation point of the jet formation. in particular, for increasing ks, the folds become stiffer and the separation point moves upwards toward the output of the glottis. from a dynamical point of view, points c and b approach to each other until they collapse. in this case, the oscillations are born at a supercritical hopf bifurcation and the system presents no hysteresis, as in the standard flapping model [17]. regions ii and iii of fig. 2 are separated by a saddle-repulsor bifurcation. although this bifurcation does not represent a qualitative dynamical change for the oscillating folds, its effects are relevant when the complete mechanism of voiced sound 050004-3 papers in physics, vol. 5, art. 050004 (2013) / m. f. assaneo et al. figure 2: bifurcation diagram in the plane of subglottal pressure and fold tension (q,ps). the insets are two-dimensional projections of the flow on the (v1,x1) plane, the red crosses represent unstable fixed points and the dotted lines unstable limit cycles. normal voice occurs at (q,ps) ∼ (1, 800). the color code represents the linear correlation between (x1 − x2) and (y1 + y2): from dark red for r = 1 to dark blue for r = 0.6. this diagram was developed with the help of auto continuation software [20]. the rest of the parameters were fixed at m1 = 0.125 g, m2 = 0.025 g, k10 = 80 n/m, k20 = 8 n/m, kc = 25 n/m, �1 = 0.1, �2 = 0.6, lg = 1.4 cm, d1 = 0.25 cm, d2 = 0.05 cm and x0 = 0.02 cm. production is considered. voiced sounds are generated as the airflow disturbance produced by the oscillation of the vocal folds is injected into the series of cavities extending from the laryngeal exit to the mouth, a non-uniform tube known as the vocal tract. the disturbance travels back and forth along the vocal tract, that acts as a filter for the original signal, enhancing the frequencies of the source that fall near the vocal tract resonances. voiced sounds are in fact perceived and classified according to these resonances, as in the case of vowels [18]. consequently, one central aspect in the generation figure 3: hysteresis at the oscillation onset-offset. left panel: zoom of the interface between regions i and ii. the blue and green lines represent folds of cycles (saddle-node bifurcations in the map). the red line is a supercritical hopf bifurcation. 
right panel: the oscillation amplitude of x2 as a function of the subglottal pressure ps, at q = 1.71. the continuation of periodic solutions was realized with the auto software package [20]. of voiced sounds is the production of a spectrally rich signal at the sound source level. interestingly, normal phonation occurs in the region near the appearance of the saddle-repulsor bifurcation. although this bifurcation does not alter the dynamical regime of the system or its time scales, we have observed that part of the limit cycle approaches the stable manifold of the new fixed point (as displayed in fig. 4), therefore changing its shape. this deformation is not restricted to the appearance of the new fixed point but rather occurs in a coarse region around the boundary between ii and iii, as the flux changes smoothly in a vicinity of the bifurcation. in order to illustrate this effect, we use the spectral content index sci [21], an indicator of the spectral richness of a signal: sci = ∑ k akfk/( ∑ k akf0), where ak is the fourier amplitude of the frequency fk and f0 is the fundamental frequency. as the pressure is increased, the sci of x1(t) increases (upper right panel of fig. 4), observing a boost in the vicinity of the saddle-repulsor bifurcation that stabilizes after the saddle point is generated. thus, the appearance of this bifurcation near the region of normal phonation could indicate a possible mechanism to further enhance the spectral richness of the sound source, on which the production of voiced sounds ultimately relies. 050004-4 papers in physics, vol. 5, art. 050004 (2013) / m. f. assaneo et al. figure 4: a projection of the limit cycle for x1 and the stable manifold of the saddle point, for parameters consistent with normal phonatory conditions, (q,ps) = (1, 850) (region iii). left inset: projection in the 3-dimensional space (y1, x1, x2). right inset: spectral content index of x1(t) as a function of ps for a fixed value of q = 0.95. in green, the value at which the saddle-repulsor bifurcation takes place. in the boundary between regions iii and iv, one of the unstable points created in the saddle-repulsor bifurcation undergoes a subcritical hopf bifurcation, changing stability as an unstable limit cycle is created [19]. finally, entering region v, the stable and the unstable cycles collide and disappear in a fold of cycles where no oscillatory regimes exist. in fig. 2, we also display a color map that quantifies the difference between the solutions of the model and the flapping approximation. the flapping model is a two dimensional model that, instead of two masses per fold, assumes a wave propagating along a linear profile of the folds, i.e., the displacement of the upper edge of the folds is delayed 2τ with respect to the lower. the cross sectional areas at glottal entry and exit (a1 and a2) are approximated, in terms of the position of the midpoint of the folds, by { a1 = 2lg(x0 + x + τẋ) a2 = 2lg(x0 + x− τẋ) , (11) where x is the midpoint displacement from equilibrium x0, and τ is the time that the surface wave takes to travel half the way from bottom to top. equation (11) can be rewritten as (x1 − x2) = τ(y1 + y2). we use this expression to quantify the difference between the oscillations obtained with the two-mass model solutions and the ones generated with the flapping approximation, computing the linear correlation coefficient between (x1 −x2) and (y1 + y2). as expected, the correlation coefficient r decreases for increasing ps or decreasing q. 
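both diagnostics used above can be computed directly from a simulated orbit. the sketch below is a minimal numpy version, assuming that x1, x2, y1 and y2 are time series sampled with a step dt and that the fundamental frequency f0 has already been identified (for instance, as the strongest spectral peak); the function and variable names are illustrative.

    import numpy as np

    def flapping_correlation(x1, x2, y1, y2):
        # linear correlation between (x1 - x2) and (y1 + y2); values close to 1 mean that
        # the flapping relation (x1 - x2) = tau * (y1 + y2) of eq. (11) is a good approximation
        return np.corrcoef(x1 - x2, y1 + y2)[0, 1]

    def spectral_content_index(x, dt, f0):
        # sci = sum_k a_k f_k / (f0 * sum_k a_k), with a_k the fourier amplitude at frequency f_k
        a = np.abs(np.fft.rfft(x - np.mean(x)))
        f = np.fft.rfftfreq(len(x), dt)
        return np.sum(a * f) / (f0 * np.sum(a))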
in the region near normal phonation, the approximation is still relatively good, with r ∼ 0.8. as expected, the approximation is better for increasing x0, since the effect of colliding folds is not included in the flapping model. ii. isofrequency curves one basic perceptual property of the voice is the pitch, identified with the fundamental frequency f0 of the vocal folds oscillation. the production of different pitch contours is central to language, as they affect the semantic content of speech, carrying accent and intonation information. although experimental data on pitch control is scarce, it was reported that it is actively controlled by the laryngeal muscles and the subglottal pressure. in particular, when the vocalis or interarytenoid muscle activity is inactive, a raise of the subglottal pressure produces an upraising of the pitch [16]. compatible with these experimental results, we performed a theoretical analysis using ps as a single control parameter for pitch. in the upper panels of fig. 5, we show isofrequency curves in the range of normal speech for our model of eqs. (1) to (10). following the ideas developed in [22] for the avian case, we compare the behavior of the fundamental frequency with respect to pressure ps in the two most usual cases presented in the literature: the cubic [1, 7] and the linear [10, 14] restitutions. in the lower panels of fig. 5, we show the isofrequency curves that result from replacing the cubic restitution by a linear restitution ki(xi) = kixi + θ( xi+x0 x0 )3ki(xi + x0). although the curves f0(ps) are not affected by the type of restitution at the very beginning of oscillations, the changes become evident for higher values of ps, with positive slopes for the cubic case and negative for the linear case. this result suggests that a nonlinear cubic restitution force is a 050004-5 papers in physics, vol. 5, art. 050004 (2013) / m. f. assaneo et al. figure 5: relationship between pitch and restitution forces. left panels: isofrequency curves in the plane (q,ps). right panels: curves f0(ps) for q=0.9, q=0.925 and q=0.95. in the upper panels, we used the model with the cubic nonlinear restitution of eq. (2). in the lower panels, we show the curves obtained with a linear restitution, ki(xi) = kixi + θ( xi+x0 x0 )3ki(xi + x0). good model for the elastic properties of the oscillating tissue. iv. conclusions in this paper, we have analyzed a complete twomass model of the vocal folds integrating collisions, nonlinear restitution and dissipative forces for the tissue and jets and viscous losses of the air-stream. in a framework of growing interest for detailed modeling of voice production, the aspects studied here contribute to understanding the role of the different physical terms in different dynamical behaviors. we calculated the bifurcation diagram, focusing in two regimes: the oscillation onset and normal phonation. near the parameters of normal phonation, a saddle repulsor bifurcation takes place that modifies the shape of the limit cycle, contributing to the spectral richness of the glottal flow, which is central to the production of voiced sounds. with respect to the oscillation onset, we showed how jets and viscous losses intervene in the hysteresis phenomenon. many different models for the restitution properties of the tissue have been used across the literature, including linear and cubic functional forms. yet, its specific role was not reported. 
here we showed that the experimental relationship between subglottal pressure and pitch is fulfilled by a cubic term. acknowledgements this work was partially funded by uba and conicet. [1] k ishizaka, j l flanagan, synthesis of voiced sounds from a two-mass model of the vocal cords, bell syst. tech. j. 51, 1233 (1972). [2] i r titze, the physics of smallamplitude oscillation of the vocal folds, j. acoust. soc. am. 83, 1536 (1988). [3] m a trevisan, m c eguia, g mindlin, nonlinear aspects of analysis and synthesis of speech time series data, phys. rev. e 63, 026216 (2001). [4] y s perl, e m arneodo, a amador, f goller, g b mindlin, reconstruction of physiological instructions from zebra finch song, phys. rev. e 84, 051909 (2011). [5] e m arneodo, y s perl, f goller, g b mindlin, prosthetic avian vocal organ controlled by a freely behaving bird based on a low dimensional model of the biomechanical periphery, plos comput. biol. 8, e1002546 (2012). [6] b h story, i r titze voice simulation with a bodycover model of the vocal folds, j. acoust. soc. am. 97, 1249 (1995). [7] j c lucero, l koening simulations of temporal patterns of oral airflow in men and women using a two-mass model of the vocal folds under dynamic control, j. acoust. soc. am. 117, 1362 (2005). [8] x pelorson, x vescovi, c castelli, e hirschberg, a wijnands, a p j bailliet, h m 050004-6 papers in physics, vol. 5, art. 050004 (2013) / m. f. assaneo et al. a hirschberg, description of the flow through in-vitro models of the glottis during phonation. application to voiced sounds synthesis, acta acust. 82, 358 (1996). [9] m e smith, g s berke, b r gerratt, laryngeal paralyses: theoretical considerations and effects on laryngeal vibration, j. speech hear. res. 35, 545 (1992). [10] i steinecke, h herzel bifurcations in an asymmetric vocalfold model, j. acoust. soc. am. 97, 1874 (1995). [11] n j c lous, g c j hofmans, r n j veldhuis, a hirschberg, a symmetrical two-mass vocal-fold model coupled to vocal tract and trachea, with application to prosthesis design, acta acust. united ac. 84, 1135 (1998). [12] t baer, vocal fold physiology , university of tokyo press, tokyo, (1981). [13] t ikeda, y matsuzak, t aomatsu, a numerical analysis of phonation using a twodimensional flexible channel model of the vocal folds, j. biomech. eng. 123, 571 (2001). [14] j c lucero, dynamics of the two-mass model of the vocal folds: equilibria, bifurcations, and oscillation region, j. acoust. soc. am. 94, 3104 (1993). [15] x pelorson, a hirschberg, r r van hassel, a p j wijnands, y auregan, theoretical and experimental study of quasisteadyflow separation within the glottis during phonation. application to a modified twomass model, j. acoust. soc. am. 96, 3416 (1994). [16] t baer, reflex activation of laryngeal muscles by sudden induced subglottal pressure changes, j. acoust. soc. am. 65, 1271 (1979). [17] j c lucero, a theoretical study of the hysteresis phenomenon at vocal fold oscillation onsetoffset, j. acoust. soc. am. 105, 423 (1999). [18] i titze, principles of voice production, prentice hall, (1994). [19] j guckenheimer, p holmes, nonlinear oscillations, dynamical systems and bifurcations of vector fields, springer, (1983). [20] e doedel, auto: software for continuation and bifurcation problems in ordinary differential equations, auto user manual, (1986). [21] j sitt, a amador, f goller, g b mindin, dynamical origin of spectrally rich vocalizations in birdsong, phys. rev. e 78, 011905 (2008). 
[22] a amador, f goller, g b mindlin, frequency modulation during song in a suboscine does not require vocal muscles, j. neurophysiol. 99, 2383 (2008). 050004-7 papers in physics, vol. 6, art. 060008 (2014) received: 10 october 2014, accepted: 10 october 2014 edited by: l. a. pugnaloni licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060008 www.papersinphysics.org issn 1852-4249 commentary on “jamming transition in a two-dimensional open granular pile with rolling resistance” roberto arévalo1∗ i. introduction the paper “jamming transition in a twodimensional open granular pile with rolling resistance” studies the jamming transition using the flow of granular particles from piles. in this commentary i would like to give a perspective on the results by magalhães et al. [1] and previous related studies obtained in silos. we speak about jamming transition in a silo when an arch forms at the outlet effectively arresting the flow. this is likely to happen when the size of the outlet is a few times bigger than that of the flowing particles. what factors control the probability of arch formation? if we fix the nature of the particles, it is found that the pressure close to the outlet has a major influence. in ref. [2], it was shown that altering this pressure in a two-dimensional silo can change the probability of arch formation by two orders of magnitude. at the same time, the flow rate or velocity field are modified only slightly. in a flat-bottomed silo, pressure increases linearly from the free surface toward the bottom. at a certain depth, however, pressure saturates and remains constant until the base. this is the well known janssen effect. it is usually attributed to ∗e-mail: aroberto@ntu.edu.sg 1 nanyang technological university, school of physical and mathematical sciences, physics and applied physics, 21 nanyang link, 637371, singapore. the friction of the walls that sustains part of the weight of the column of particles. the depth at which pressure saturates depends on properties of the grains (like the poisson ratio) and the geometry of the silo. this scenario for the silo contrasts greatly with that of a pile, as the ones studied in ref. [1]. piles have no lateral walls so no janssen effect is present. further, the geometry of the pile itself assures that pressure is not homogeneous. as it could be expected, pressure increases linearly from the exterior as we move toward the center. surprisingly, the pressure is not maximum at the center of the pile. instead, it shows a drop that has been termed “pressure dip” [3]. when one opens an orifice at the base of a silo, the material inside outpours driven by gravity. as stated, if the size of the outlet is a few times that of the flowing particles, an arch may form that jams the flow. as the size of the orifice grows, the probability of arch formation decreases. eventually, for relatively large orifices, no arches are observed during the accessible time-scale and the flow seems continuous. actually, the last statement is currently under debate. some authors have conjectured that there exist two regimes in the silo discharge. one for orifices with radius r < rc, where rc is some critical value (in units of the particles’ radii. since most of the literature concerns disks in 2d or spheres in 3d, i will refer here to the radii of the particles and that of the orifice). in this regime arching 060008-1 papers in physics, vol. 6, art. 060008 (2014) / r. 
arévalo probability is always bigger than zero and arches are bound to appear. a second regime is claimed to exist for r > rc, in which the probability to observe a stable arch is zero and the flow is actually continuous. ii. silos, piles and jamming several works have been devoted in the last years to try to settle the accuracy of this picture, and to obtain the value of rc for different geometries. in ref. [4], zuriguel et al. experimentally study the case of a cylindrical silo filled with spherical particles as well as other shapes (small cylinders with low aspect ratio and rice). they measure the size s of the avalanches, defined as the number of particles fallen between two clogs. it is found that their data are well described by a power law of the form < s >= a/ (rc −r) γ . (1) this leads them to propose that the avalanche size indeed diverges at rc (whose value depends slightly on the shape of the particles). for spheres, they find rc = 4.94 and γ = 6.9. a finite size analysis was carried out in ref. [5]. to this end, the jamming probability function j (n,r) was defined as the probability that the flow clogs before n particles have fallen. in other words, the probability to observe an avalanche of size n for a given aperture size r. fixing the value of n, the authors compute the jamming probability jn (r). for low values of n, jn (r) presents a maximum value of 1 for low r and a minimum value 0 for big aperture size. the transition between these values is fast but smooth, with a well defined slope. upon increasing n, the transition becomes more abrupt and the slope steepens. a limiting step function as n → ∞ would be the hallmark of a real abrupt transition separating the two regimes at rc. the authors fit the data to jn (r) = {1 − tanh [α (r−rc)]}/2, which actually converges to the heaviside function in the limit α → ∞. however, there is no theoretical justification for this expression, apart from the good fit. an analogous research in 2d (using spherical particles) was carried out in ref. [6]. it was found that the data on avalanche size could be fit with the same expression than in the 3d case, with values rc = 8.5 and γ = 12.7. nevertheless, a careful finite-size system analysis leads the authors to conclude that there are no two separated regimes in this case. as in the 3d case, the jamming probability function j (n,r) is used. also in this case, the data from experiments show an apparent convergence toward a step function. however, by a subsequent theoretical analysis, the authors found a limiting function which is smooth and shows a doubly exponential dependence in r. the fact that the experimental data could be fit by eq. (1) is explained taking into account that the maximum measured avalanche is for r ' 5.5 while the presumed critical value is much higher, rc = 8.5. to et al. [7] have studied the jamming of disks in 2d in a hopper geometry. measuring the jamming probability defined above, they find that their data can be fitted with several different expressions. first, by an independent theoretical analysis of their experiment, an expression for jn (r) is found which is formally analogous to the one proposed by janda et al., and does not contain any divergence. however, their data can be well fit by eq. (1) and an exponential containing a divergent term. being the curves indistinguishable in the range of data available, the conclusion is that the question of the existence of a critical outlet size cannot be settled. recently, magalhães et al. 
have proposed a series of simulations [1, 8] addressing the existence of a critical outlet for the flow of grains in a different geometrical setting. they use a realistic soft particle molecular dynamics method with static friction, which is supplemented with rolling resistance in their contribution to this volume. briefly, the protocol implemented is as follows: in 2d a pile of grains is created by allowing particles to “rain” over a base of a certain width l, which is used as a measure of the size of the initial piles. once the pile is stable, a hole of size r is opened at the center of the base and particles flow through it. the pile ends up in one of two possible final states: i) a stable arch appears around the outlet blocking the flow; ii) no arch is formed and the pile collapses completely, only a few particles remain at both sides of the outlet. in both cases, the number of particles fallen is recorded and, in case i), also the height h of the final pile. the authors propose that this scenario can be 060008-2 papers in physics, vol. 6, art. 060008 (2014) / r. arévalo likened to a phase transition between the two regimes. taking h as an order parameter, the plot h (r) shows three regimes. for small apertures, h is essentially a constant which depends on the size of the initial pile. for big r, the pile collapses and so h = 0. the transition between the two values is gradual for small piles and becomes more abrupt for larger piles. the fluctuations of h as a function of the outlet size present a well defined peak at some value r∗ which depends on the size l of the initial pile. as l grows, r∗ seems to converge to a limiting value, and when extrapolating l → ∞, the authors find a critical outlet size rc = 5.0. in this case, there is no attempt at a theoretical analysis of the observations that could further support the extrapolation of h (r) to a step function. in their contribution to this volume, the authors revisit this problem with new simulations in which they implement rolling friction, making them more realistic. the main conclusion is that the picture drawn in their first work holds, although a correction in the value of rc is introduced. iii. a critical outlet? let us, at this point, critically address the results summarized in the previous section. regarding the 3d silo, the conclusion that there exists a critical outlet size is based on the good fit of the data on avalanche size by eq. (1). in principle, there is no reason precluding the idea that avalanches keep growing for large orifices, with the probability of arch formation decreasing without ever reaching zero. that this is not observed could be simply due to the limitations of the experimental setup. the observations were carried out keeping the height of material inside the silo constant. this is required to assure steady state conditions and, in particular, that there is always a janssen effect present (if the height of the material decreases too much, pressure does not saturate). under this conditions, it was simply not possible to refill the silo in order to measure avalanches bigger that a few million of grains. from a more theoretical perspective, eq. (1) might seem a bit unsatisfactory. in first place, coefficient a lacks a fundamental interpretation, being just the size of the avalanche for an orifice of size r = rc−1. in second place, the value of the exponent γ is rather high compared to what one usually finds in phase transitions; and this holds for both the 2d and 3d silos. 
third, the avalanche size should approach zero as the outlet size approaches one, in units of the diameter of the particles. as a matter of fact, other expressions can be found to fit the available data. for example < s >= a(r−1)e(b·r c) gives a good fit with a ' 3.4, b ' 0.04 and c ' 3.8. this expression is not divergent and goes to zero as the outlet size approaches that of the particles. although given without theoretical motivation, it shows that the present data are not enough to settle the question of the existence of a critical outlet size. a better understanding of the process of arch formation and how it is affected by the flow is required to gain insight into this problem. in the case of 2d silos we find, in addition to a wealth of experimental data and careful finite size study, a theoretical insight into the process of arching. in both the independent analysis by to and janda et al. formally equivalent expressions for the jamming probability are found. this reinforces the conclusion that the avalanche size does not diverge and there is no critical outlet. however, one must consider i) that there are underlying asumptions in the theoretical considerations, and ii) that the available data can be well fit with diverging expressions. more (difficult to obtain) data for orifices similar to the putative critical size would be necessary to make a strong statement. it is worth mentioning that the theoretical derivation of the jamming probability in ref. [6] leads, when applied to a 3d silo, to an expression that does not fit the corresponding experimental data. this could be due to some assumption being valid in 2d but not in 3d. or, one can also consider the possibility that there is no transition in two dimensions, and a critical outlet size appears in three dimensions. this is a perfectly possible scenario, as we know from other phase transitions whose existence depends on the dimensionality of the system. magalhães et al. propose an analogous study in a radically modified geometry. one can imagine that removes the lateral walls of a 2d silo, then the column of particles collapses and ends up in a pile of certain width. under these conditions, there is 060008-3 papers in physics, vol. 6, art. 060008 (2014) / r. arévalo no janssen effect, the pressure is not homogeneous and the dynamics could be very different from that of a silo. in any case, as mentioned in the introduction, the probability of arch formation could be drastically affected. thus, it is not obvious that the conclusions reached for silos should hold in the case of piles. actually, the conclusion in the previous work, ref. [8], is that there exists a critical outlet for the flow from piles. the obtained value rc = 5 is far below to rc = 8.5, reported by janda el al. for the 2d silo. without taking into account the existence or not of a critical outlet, this could be an indication that the two setups are not comparable. in their present contribution, magalhães et al. report a correction to rc = 5.3 for a contact model that includes rolling resistance (rr). this is reasonable, since particles interacting with rr should interlock more easily, leading to more stable arches. nevertheless, this new rc value is still much smaller than the one found for the 2d silo. iv. avalanche size and flow it may seem difficult to imagine the formation of an arch stable enough to arrest the flow in a silo with a large aperture. effectively, it appears that the flow will sweep any incipient arch and will continue undisturbed. 
however, the flow is a tunable feature of the silo, known to vary with the square root of gravity, as manifested in experiments [9–11] and simulations [12]. gravity can be modified in experiments up to a certain point, and it is always possible in simulations, so let us call γ to the imposed body force on the particles. one can imagine, then, that reduces γ, thus making the flow slower, even for large apertures. under these conditions, molecular dynamics simulations [12] lead to the conclusion that the size of the avalanches increases with the kinetic energy of the system. this is so because, when reducing the driving force, the particles in the outlet region have more time to dissipate their kinetic energy and form a stable structure that blocks the flow. it is then conceivable that one can have a γ small enough to allow the particles form a stable arch before being swept by the flow. should this picture be correct, there would not be a critical outlet size. arches would simply be less likely to be stabilize as the flow was increased by increasing r. the observation of arches would be limited by the time window allowed by the experiments or simulations. in order to shed some light in the process of arching a low γ, additional simulations with a variable outlet size are currently under way. v. conclusions in this brief commentary, i have tried to give a perspective on recent results on the jamming of particle flows in silos and its relation with magalhães et al. contribution to this volume. it is commonplace that silos get jammed. when the size of the aperture is not much larger than that of the particles, arches appear to block the flow. upon increasing the size of the outlet, arches become scarcer. and for large orifices, they are not seen at all at accessible time scales. the observation of power-law relations between the size of the avalanches and the outlet size sparked the interest in considering a critical-like transition between two regimes: a jamming regime in which arches are bound to appear and block the flow, and a continuous regime in which there are no arches. both regimes would be separated by a critical outlet size. magalhães et al. undertake an analogous research changing the conditions of the reservoir from which the particles flow. they use piles, which have open boundaries, and introduce new complications to be considered. if not transferable from one another, the results in piles and silos will widen our perspective on the clogging of granular particles. let me, very briefly, summarize my opinion on the existence of a critical outlet: • experimental data in 3d silos are compatible with a transition from a jammed to a continuos flow regime at a certain value of the outlet. however, descriptions not involving a divergence are also possible. so far, we lack a theoretical frame that justifies one view over the other. • experimental data in 2d silos are also compatible with both the existence and the absence of a critical outlet. some theoretical insights may point toward a picture without a critical 060008-4 papers in physics, vol. 6, art. 060008 (2014) / r. arévalo outlet. however, one should take into account that: these theoretical insights contain underlying assumptions. they are not always based on a complete understanding of the arching process. • simulations in refs. [1, 8] on the flow from 2d piles are compatible with the existence of a critical outlet. there is not a theoretical picture in which to understand the results yet. 
• due to the disparities in the conditions of the reservoir, it is not a priori clear that results in 2d silos and 2d piles should be comparable. • based on results of recent simulations in 2d, one can speculate that an arch can always block the outlet irrespective of the flow rate and, hence of r. for large r, these arches are just extremely unlikely. possible ideas to advance in the future could be: • obtain more experimental data for r close to the transition in silos and piles. this could eliminate alternative expresions for the avalanche size. • carry out experiments in inclined silos to mimic reduced gravity. • investigate the dynamics of particles in the region close to the outlet. here, simulations should be extremely useful, given that one has access to all the variables involved. • extend the study of piles to three dimensions and diverse conditions, as reduced gravity. [1] c f m magalhães, a p f atman, g combe, j g moreira, jamming transition in a twodimensional open granular pile with rolling resistance, pap. phys. 6, 060007 (2014). [2] i zuriguel, a janda, a garcimart́ın, c lozano, r arévalo and d maza, silo clogging reduction by the presence of an obstacle, phys. rev. lett. 107 278001 (2011). [3] a p f atman, p brunet, j geng, g reydellet, p claudin, r p behringer, e clément, from the stress response function (back) to the sand pile “dip”, eur. phys. j. e 17 93 (2005). [4] i zuriguel, a garcimart́ın, d maza, l a pugnaloni and j m pastor, jamming during the discharge of granular matter from a silo, phys. rev. e 71 051303 (2005). [5] i zuriguel, l a pugnaloni, a garcimart́ın, d maza, jamming during the discharge of grains from a silo described as a percolating transition, phys. rev. e 68 030301 (2003). [6] a janda, i zuriguel, a garcimart́ın, l a pugnaloni and d maza, jamming and critical outlet size in the discharge of a two-dimensional silo, europhys. lett. 84 44002 (2008). [7] k to, p y lai and h k pak, jamming of granular flow in a two-dimensional hopper, phys. rev. lett. 86 71 (2001); k to, jamming transition in two-dimensional hoppers and silos, phys. rev. e 71 060301 (2005). [8] c f m magalhães, j g moreira and a p f atman, catastrophic regime in the discharge of a granular pile, phys. rev. e 82 051303 (2010). [9] w a beverloo, h a leniger, and j van de velde, the flow of granular solids through orifices, chem. eng. sci. 15 260 (1961). [10] c mankoc, a janda, r arévalo, j m pastor, i zuriguel, a garcimart́ın and d maza, the flow rate of granular materials through an orifice, gran. matt. 9 407 (2007). [11] s dorbolo et al., influence of the gravity on the discharge of a silo, gran. matt. 15 263 (2013). [12] r arévalo, i zuriguel, d maza, a garcimart́ın, role of driving force on the clogging of inert particles in a bottleneck, phys. rev. e 89 042205 (2014). [13] a janda, i zuriguel and d maza, flow rate of particles through apertures obtained from self-similar density and velocity profiles, phys. rev. lett. 108 248001 (2012). 060008-5 papers in physics, vol. 7, art. 070009 (2015) received: 17 may 2015, accepted: 12 june 2015 edited by: a. vindigni reviewed by: m. perfetti, dipartimento di chimica, universitá di firenze, italy licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070009 www.papersinphysics.org issn 1852-4249 an intermediate state between the kagome-ice and the fully polarized state in dy2ti2o7 s. a. grigera,1, 2∗ r. a. borzi,3 d. g. slobinsky,1† a. s. gibbs,1 r. higashinaka,4 y. maeno,5 t. s. 
grigera3 dy2ti2o7 is at present the cleanest example of a spin-ice material. previous theoretical and experimental work on the first-order transition between the kagome-ice and the fully polarized state has been taken as a validation for the dipolar spin-ice model. here we investigate in further depth this phase transition using ac-susceptibility and dc-magnetization, and compare this results with monte-carlo simulations and previous magnetization and specific heat measurements. we find signatures of an intermediate state between the kagome-ice and full polarization. this signatures are absent in current theoretical models used to describe spin-ice materials. i. introduction spin-ice materials are deceptively simple in their constitution: classical ising spins with nearestneighbour ferromagnetic interactions forming a pyrochlore lattice. this crystal structure can be thought as an alternating stack of kagome and tri∗e-mail: sag2@st-and.ac.uk †now at: departamento de ingenieŕıa mecánica, facultad regional la plata, universidad tecnológica nacional, 1900 la plata, argentina. 1 school of physics and astronomy, university of st andrews, north haugh, st andrews ky16 9ss, uk 2 instituto de f́ısica de ĺıquidos y sistemas biológicos, unlp-conicet, 1900 la plata, argentina 3 instituto de investigaciones fisicoqúımicas teóricas y aplicadas unlp-conicet and departamento de f́ısica, facultad de ciencias exactas, universidad nacional de la plata, 1900 la plata, argentina 4 graduate school of science, tokyo metropolitan university, hachioji, tokyo 192-0397, japan. 5 department of physics, kyoto university, kyoto 6068502, japan. angular lattices along the [111] direction. the spins sit at the vertices of tetrahedra and can point either to their center or towards the outside. the magnetic frustration can be seen at the level of a single tetrahedron: the energy is minimized by having two spins pointing inwards and two outwards. this is the ice rule, which corresponds exactly to the pauling rules for protons in water ice; like the latter, it also leads to zero-point entropy, a characteristic signature of spin-ice systems [1]. we have chosen to work on dy2ti2o7 as the cleanest example of a spin-ice material. its ground state properties can be well described by a model with only an effective nearest neighbour exchange interaction jsi of ≈ 1.1 k [2]. within this framework, when one applies an external magnetic field h in [111] below 1 k, the polarization of the system will happen in two steps. first, the spins in the triangular lattice that lie parallel to [111] will orient along the magnetic field, removing part of the residual entropy but with no change in the configurational energy [3, 4]. when the magnetic moment of this sublattice has saturated, the magnetization m cannot be further increased without breaking 070009-1 papers in physics, vol. 7, art. 070009 (2015) / s. a. grigera et al. the spin-ice rule. this leads to a plateau as a function of field at m = 3.33 µb/dy-ion, characteristic of the kagome ice state. at higher fields, the spins in the kagome lattice are finally fully polarized, leading to a sudden but continuous increase in m towards its saturation. this behavior was predicted theoretically and found in monte carlo simulations [5, 6]. in spite of this, something different happens in real spin-ice materials. 
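before turning to the role of dipolar interactions, the plateau value quoted above can be checked with a few lines of code. the sketch below uses the conventional local <111> easy axes of a pyrochlore tetrahedron (the specific axis vectors are a standard choice, not taken from this paper) and the dy3+ moment of about 10 µb mentioned in the next paragraph; it recovers m ≈ 3.33 µb/dy-ion for the kagome-ice configuration and 5 µb/dy-ion at full polarization.

    import numpy as np

    mu = 10.0                                     # dy3+ moment in bohr magnetons
    # local <111> easy axes of the four sites of a tetrahedron (normalized, conventional choice)
    axes = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3.0)
    field = np.array([1, 1, 1]) / np.sqrt(3.0)    # field applied along [111]
    proj = axes @ field                           # projections: 1 (apical spin), -1/3 (kagome spins)

    # kagome ice: the apical spin is aligned with the field and, in each kagome triangle,
    # two spins have a positive projection and one a negative one, keeping every
    # tetrahedron 2-in-2-out
    m_kagome = mu * (proj[0] + (2 - 1) * abs(proj[1])) / 4.0
    # fully polarized state: the ice rule is broken and all projections are positive
    m_saturated = mu * (proj[0] + 3 * abs(proj[1])) / 4.0
    print(m_kagome, m_saturated)                  # approximately 3.33 and 5.0 bohr magnetons per dy ion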
since the magnetic moment of the magnetic ions in spin-ice materials is quite large – the one associated with dy3+ ions in dy2ti2o7 is near 10µb– , long range dipolar interactions have to be considered [7]. these interactions do not alter the zero field ground state [8], but have a big effect on its excitations. in relation to this, the transition to the fully polarized state —which is the main concern of this paper— experiments a qualitative change. t. sakakibara and collaborators [9] studied experimentally the magnetization with h// [111] to temperatures much smaller than jsi. after a well defined plateau at ≈ 3.33 µb/dy-ion, they observed a very sharp increase in the magnetization. the presence of hysteresis was a convincing argument that the real system reaches the fully polarized state through a metamagnetic first order phase change at the lowest temperatures. the change in m becomes continuous at the critical end-point tc = 360 ± 20mk and µ0hc ≈ 0.93 t [9]. the change in character of this transition — from a crossover to a discontinuity when dipolar interactions are included— was later understood in terms of the defects associated with the breaking of the ice rules, or monopoles. the nearest neighbors model corresponds to the case of free non-conserved monopoles sitting in a diamond lattice. including dipolar interactions implies turning on a coulomb interaction between these charges, allowing them to condense through a real first order transition [10]. numerical simulations (including ewald summations to take into account long range interactions) proved this picture right, and provided an additional validation to the dipolar model [10]. the mvs. h curves obtained in these simulations are quite symmetrical around hc. the jump in the magnetization ∆m(t) when crossing the first order transition line grows very abruptly with decreasing temperature: for t only ≈ 10% below tc, ∆m(t) amounts to ≈ 90% of the total change in magnetization from the kagome ice to full saturation. in other words, almost full order is achieved in the system for temperatures just below tc and a magnetic field of 1 t. specific heat cp measurements confirmed the existence of a critical end-point —a sharp peak is clearly seen very near the precise spot in field and temperature specified by sakakibara et al. [11]. however, the identification of a single first-order line below tc is less clear. the cp(t)vs.h curves show peaks at the fields hc(t) identified in [9] as the first-order line, albeit of much smaller amplitude than that at tc. additionally, below 300 mk, a second peak at higher fields is discernible [11]. even at the lowest temperatures (100 mk), magnetic fields above 2 t are needed to coerce the specific heat down to 0. this suggests that, in spite of the absence of thermal excitations, the system does not reach full polarization immediately after the first order transition from the kagome ice, and an intermediate state establishes between these two well-known phases. this specific heat features were confirmed by ac-susceptibility measurements on the same samples [12]. in all cases, the sample sat at a fixed platform with respect to the magnetic field and therefore the alignment with respect to the [111] direction was within a few degrees. an angular dependent study of the magnetization with sato and coworkers [19] showed these asymmetries, and additional features in the polarization transition were seen at small angles away from [111]. 
the implications of these results for the current understanding and modeling of spin-ice materials have not been considered. in this paper, we study this additional intermediate state in detail, and show that it cannot be explained by any of the models currently used to study spin-ice materials. working at small angles away from [111], we looked for a magnetic signature by repeating the static magnetization measurements in several samples. in addition, improving the sensitivity by three orders of magnitude, we measured ac-susceptibility at different frequencies, which also allowed us to characterise the dynamics of the observed transitions. in order to gain further insight into this possible intermediate state, we performed monte carlo simulations of the experimental situation using the currently accepted models, including ewald summations and exchange interactions up to the third nearest neighbor [13, 14].

figure 1: real (left) and imaginary (right) parts of the ac-susceptibility as a function of temperature and magnetic field, for temperatures between 50 mk and 6000 mk and magnetic fields between -4 and 4 t. the oscillatory field had an amplitude of 0.05 oe and a frequency of 87 hz. the zero-field schottky-type anomaly corresponding to the onset of spin-ice correlations, and the peaks corresponding to the critical point at ±1 t and ≈ 400 mk, are clearly seen.

ii. methods

for our work, we measured several dy2ti2o7 single crystals grown in kyoto and in st andrews with the floating-zone method. samples were oriented using laue diffraction and cut into 3 mm long prisms of square or octagonal section of approximately 1 mm2, with the [111] direction along the long axis to reduce demagnetising effects with the field in the vicinity of [111] (≈ 5°). the experiments were performed in a dilution refrigerator in st andrews. samples were thermally grounded to the mixing chamber through gold wires attached to them with silver paint. for susceptibility, we used a drive field of 3.3·10−5 t r.m.s., and counter-wound pickup coils each consisting of approximately 1000 turns of 12 µm diameter copper wire. the filling factor of the sample in the pick-up coil was approximately 90%. we measured using drive fields of frequencies varying from approx. 10 hz to 1.0 khz. low temperature transformers mounted on the 1 k pot of the dilution refrigerator were used throughout to provide an initial signal boost of approximately a factor of 100. the magnetization was measured using a home-built capacitance faraday magnetometer [15].

iii. results and discussion

figure 1 shows the real (∆χ′, left) and imaginary (∆χ′′, right) parts of the ac-susceptibility χ as a function of temperature and magnetic field in the whole area of interest. the excitation frequency in this case is ω = 87 hz; the main features we describe in the following are qualitatively independent of ω. at zero field, there is a very noticeable peak in both ∆χ′ and ∆χ′′ for t ≈ 2 k. this corresponds to the schottky-type anomaly associated with the onset of spin-ice correlations of the system. the magnetic field axis spans from -4 to 4 t, and we can clearly see in the real part two peaks (at positive and negative fields) corresponding to the critical point at ≈ ±1 t and ≈ 400 mk. for temperatures below 400 mk, we see a much smaller feature in ∆χ′, which has a correspondence in ∆χ′′: a ridge with an amplitude that decreases as a function of temperature.
the magnitude of the latter is comparatively very small. at low temperatures and for fields 0.3 t < |µ0h| < 0.9 t, and 2 t < |µ0h|, the susceptibility is very low, in accordance with the kagome ice plateau and the saturation in the magnetization, respectively. we now concentrate on the real part of the susceptibility at temperatures below jsi. in fig. 2, we can see a series of curves at fixed temperatures (from 50 to 500 mk) and fields between −3.5 and 3.5 t. the excitation field used was 0.05 oe, and the frequency 87 hz. the curves have been offset by 30% for clarity. the field was swept from negative to positive values. before the kagome ice state is established, the low field susceptibility (|µ0h| < 0.3 t) at temperatures below 600 mk is strongly dependent on the magnetic field sweep rate and direction (increasing or decreasing), both signs of out-of-equilibrium behavior.

figure 2: low temperature real part of the ac-susceptibility as a function of field at fixed temperatures as indicated in the plot. the excitation field was 0.05 oe at a frequency of 87 hz. the curves are offset by 30% for clarity. as temperature is lowered from 400 mk, the peak at approx. 1 t rapidly decreases in amplitude, and splits into two peaks at lower temperatures.

at higher magnetic fields, we only observe a small difference in the height of the peaks at around ±1 t, depending on whether the transitions are swept upwards or downwards in field. the position changes very little, and the shape of the features is unaltered. as we lower the temperature, the peak at ≈ 1 t decreases markedly in amplitude, but without a corresponding change in its high field side shoulder. below 400 mk, it eventually splits into two distinct features. their separation in the field axis (≈ 0.1 t at 300 mk) is consistent with previous measurements for a similar sample orientation with respect to [111] [19]. while the first set of peaks has a correlate in the imaginary part of ∆χ (not shown here), no feature is discernible in ∆χ′′ for the peaks at higher fields. in fig. 3, we have plotted the position of these peaks as a function of field and temperature (white circles), and the position of the critical point (black circle). we have taken the specific heat data from reference [11] and determined the position of the peaks in c vs. h for different temperatures. these are plotted in the same graphic as red symbols.

figure 3: phase diagram with the field slightly tilted from [111] (θ ≤ 5°). an intermediate phase is seen between the kagome-ice and fully polarized regions. the black circle is the critical point as identified from a peak in the real part of the ac-susceptibility, χ′. the dotted white circles correspond to small double peaks seen in χ′ with a corresponding feature in the imaginary part χ′′, while the white circles denote a small peak in χ′ with no signature in χ′′. the red circles are taken from peaks in the specific heat (c) measurements of reference [11].

the main divergence of c seen in reference [11] and identified as a critical point coincides with the critical point (black circle). the coincidence between these two experiments, which measure different quantities on different samples, in different laboratories and with different experimental setups, is remarkable. as mentioned before, this secondary peak at higher fields is absent in the dm/dh data presented in ref. [9].
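the peak positions gathered in fig. 3 come from scans of χ′(h) at fixed temperature. as a brief illustration (not the authors' analysis code; the array names and the prominence threshold are assumptions), the extraction can be done with a standard peak finder applied to each sweep:

```python
import numpy as np
from scipy.signal import find_peaks

def peak_fields(H, chi_prime, prominence=0.02):
    """return the magnetic fields at which one chi'(h) sweep shows a peak."""
    idx, _ = find_peaks(np.asarray(chi_prime), prominence=prominence)
    return np.asarray(H)[idx]

# building the (t, h) points of the phase diagram from a dict of sweeps:
# phase_points = {T: peak_fields(H, chi_T) for T, chi_T in sweeps.items()}
```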
we measured the magnetization using a faraday balance on the same samples and under similar temperature and field conditions as before [16]. the main body of fig. 4 [17] shows our dm/dh as a function of field, compared with curves of ∆χ at t = 100 mk and frequencies spanning two orders of magnitude (from ω ≈ 10 to 1000 hz). for clarity, we have multiplied ∆χ by a factor of twenty. the peak in dm/dh is markedly asymmetric, with an extended tail on the high field side, but no additional feature is seen at high fields, in coincidence with sakakibara's observations. on the other hand, the second peak is clearly seen at low temperature (t < 400 mk) at all measured frequencies in the ac-susceptibility.

figure 4: dm/dh (dotted line) and real part of the ac-susceptibility measured at different excitation frequencies, from top to bottom: 19, 37, 77, 136, 277, 561, and 1117 hz, for t = 100 mk. for ease of comparison, the latter have been multiplied by a factor of 20, and normalized to the amplitude of the second peak (no imaginary part has been measured for this feature). the inset shows both sets of data on the same scale.

while these two experiments seem to be in mutual contradiction, the issue can be easily explained in terms of the resolutions of both techniques. the inset of fig. 4 shows both sets of data on the same scale; we can see that the high field shoulder on the dm/dh peak directly corresponds (in the limit of long measurement times or low frequencies) to the second peak detected with ac-susceptibility. through this analysis, we can see that the body of experimental data concerning this transition appears to be mutually compatible. between ≈ 300 mk and the lowest temperatures (50 mk in ref. [9]), only ≈ 60% of the total change in magnetization occurs when traversing the first order transition line. the remaining 40% is delivered gradually when the field is further increased to values well above 1.5 hc, in a fashion that does not seem to depend much on temperature (see fig. 3 in ref. [9]). this gradual (as opposed to discontinuous) change is behind the asymmetric shape of the magnetization curves, and the second set of peaks in cp and ∆χ. the theoretical prediction for the transition from the kagome-ice to the fully polarized state with field in [111] was of a single transition—the "dimer to monomer" transition of refs. [6, 18]. a small additional perpendicular field –present in the experiments at small angles away from [111]– induces order in the dimers in the kagome-ice state, but does not change the prediction of a single transition into the fully polarised "monomer" state [18]. this might hold true when further interactions are added, such as dipolar or further neighbor exchange interactions. in order to investigate this, we performed a numerical check. we did extensive monte carlo simulations of the dipolar model including ewald summations to account for the dipolar long range interactions. we also added exchange interactions up to third nearest neighbors (taking the exchange constants and other parameters within the constraints given by refs. [13, 14]). we explored a wide range of field angles around [111], but were unable to detect a double feature in cv at low temperatures compatible with the experimental observations.
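the comparison between the two probes boils down to putting dm/dh and χ′ on a common field axis and scale. a minimal sketch of that step is given below; the array names, the interpolation onto the susceptibility grid and the handling of the rigid 0.02 t shift are assumptions for illustration, not the authors' processing chain (the shift is the one quoted in note [16]):

```python
import numpy as np

def compare_dmdh_chi(H_m, M, H_chi, chi_prime, shift=0.02, scale=20.0):
    """numerical dm/dh interpolated onto the chi' field grid, plus the scaled chi'.

    H_m, M           : field (t) and magnetization from the faraday balance
    H_chi, chi_prime : field (t) and real part of the ac-susceptibility
    shift            : rigid field shift applied to the magnetization data (t)
    scale            : factor applied to chi' for ease of comparison
    """
    dMdH = np.gradient(np.asarray(M), np.asarray(H_m))      # numerical derivative of m(h)
    dMdH_on_chi = np.interp(H_chi, np.asarray(H_m) + shift, dMdH)
    return dMdH_on_chi, scale * np.asarray(chi_prime)
```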
it is then worth stressing that the very observation of a second feature –even when taking into account a possible sample misalignment– calls for new ingredients in the hamiltonians that are regularly used to describe spin-ice materials. given these considerations, it is difficult to discuss the nature of this intermediate state. it is tempting to think of some sort of "charge" ordering in the diamond lattice (2-in 2-out tetrahedra within a majority of 3-1 and 1-3), prior to the final zinc-blende arrangement, where only ≈ 40 − 50% of the sites are occupied by single monopoles. note that this does not rule out the possibility of still storing some residual entropy, since there are different spin configurations that generate the same charge within a given tetrahedron. we have not found previous data on the evolution of the entropy as a function of field at temperatures well below tc. however, the very asymmetric shape of the entropy at 350 mk obtained using the magnetocaloric effect shows that at this temperature the system is already experiencing a strong first order metamagnetic transition, as mentioned in ref. [20]. this work shows that a large fraction of the residual entropy stored in the kagome planes remains in the system well above hc [20], suggesting that the intermediate state is indeed a partially disordered one.

iv. conclusions

in conclusion, we observe an intermediate state between the kagome-ice and the fully polarized state when the field is slightly tilted from the [111] direction. the signature of a double step we find in ac-susceptibility and magnetization measurements is also present in earlier calorimetric measurements, and suggested by magnetocaloric effect experiments. this feature cannot be captured by the models regularly used to describe spin-ice systems, a fact that calls for further model refinement. at present, these data stand as a challenge for the development of a realistic theoretical model of spin-ice materials.

acknowledgements

we thank joseph betouras, andrew green and chris hooley for useful discussions. sag would like to acknowledge financial support from the royal society (uk), rab and tsg from conicet, unlp and anpcyt (argentina).

[1] m j harris, s t bramwell, d f mcmorrow, t zeiske, k w godfrey, geometrical frustration in the ferromagnetic pyrochlore ho2ti2o7, phys. rev. lett. 79, 2554 (1997). [2] s t bramwell, m j p gingras, spin ice state in frustrated magnetic pyrochlore materials, science 294, 1495 (2001). [3] z hiroi, k matsuhira, s takagi, t tayama, t sakakibara, specific heat of kagome ice in the pyrochlore oxide dy2ti2o7, j. phys. soc. jpn. 72, 411 (2003). [4] m udagawa, m ogata, z hiroi, exact result of ground-state entropy for ising pyrochlore magnets under a magnetic field along [111] axis, j. phys. soc. jpn. 71, 2365 (2002). [5] m j harris, s t bramwell, p c w holdsworth, j d m champion, liquid-gas critical behavior in a frustrated pyrochlore ferromagnet, phys. rev. lett. 81, 4496 (1998). [6] s v isakov, k s raman, r moessner, s l sondhi, magnetization curve of spin ice in a [111] magnetic field, phys. rev. b 70, 104418 (2004). [7] s t bramwell, m j harris, b c den hertog, m j p gingras, j s gardner, d f mcmorrow, a r wildes, a l cornelius, j d m champion, r g melko, t fennell, spin correlations in ho2ti2o7: a dipolar spin ice system, phys. rev. lett. 87, 047205 (2001). [8] s v isakov, r moessner, s l sondhi, why spin ice obeys the ice rules, phys. rev. lett.
95, 217201 (2005). [9] t sakakibara, t tayama, z hiroi, k matsuhira, s takagi, observation of a liquid-gas-type transition in the pyrochlore spin ice compound dy2ti2o7 in a magnetic field, phys. rev. lett. 90, 207205 (2003). [10] c castelnovo, r moessner, s l sondhi, magnetic monopoles in spin ice, nature 451, 42 (2008). [11] r higashinaka, h fukazawa, k deguchi, y maeno, low temperature specific heat of dy2ti2o7 in the kagome ice state, j. phys. soc. jpn. 73, 2845 (2004). [12] r higashinaka, field orientation control of geometrical frustration in the spin ice dy2ti2o7, ph. d. thesis, kyoto university (2005). [13] j p c ruff, r g melko, m j p gingras, finite-temperature transitions in dipolar spin ice in a large magnetic field, phys. rev. lett. 95, 097202 (2005). [14] t yavors'kii, t fennell, m j p gingras, s t bramwell, dy2ti2o7 spin ice: a test case for emergent clusters in a frustrated magnet, phys. rev. lett. 101, 037204 (2008). [15] d slobinsky, r a borzi, a p mackenzie, s a grigera, fast sweep-rate plastic faraday force magnetometer with simultaneous sample temperature measurement, rev. sci. instrum. 83, 125104 (2012). [16] since we are using two different probes, the exact position and orientation of the sample with respect to the magnet differs slightly between the ac-susceptibility and the magnetization measurement, and a positive shift of 0.02 t in the magnetization measurement was necessary to make the critical field of the first order transition coincide. [17] the jump in magnetization ∆m is essentially independent of temperature at low t. since ∆m is the integral of the susceptibility, one would naively expect the area below the peak in χ′ (fig. 2) to also be independent of t. but this is true only for the dc susceptibility, or, more accurately, for χ′ measured at frequencies lower than the inverse of the longest relaxation time. the fact that we can measure an out of phase response ∆χ′′ reveals that we are actually measuring a dynamic response, i.e., that our frequencies are high and some relaxation processes do not contribute to ∆χ′. since relaxation times grow on lowering the temperature, the area loss observed in these figures is quite natural. [18] r moessner, s l sondhi, theory of the [111] magnetization plateau in spin ice, phys. rev. b 68, 064411 (2003). [19] h sato, k matsuhira, t sakakibara, t tayama, z hiroi, s takagi, field-angle dependence of the ice-rule breaking spin-flip transition in dy2ti2o7, j. phys. condens. matter 19, 145272 (2007). [20] h aoki, t sakakibara, k matsuhira, z hiroi, magnetocaloric effect study on the pyrochlore spin ice compound dy2ti2o7 in a [111] magnetic field, j. phys. soc. jpn. 73, 2851 (2004).

papers in physics, vol. 5, art. 050005 (2013) received: 7 december 2012, accepted: 19 june 2013 edited by: j. j. niemela reviewed by: v. lakshminarayanan, waterloo university, canada licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050005 www.papersinphysics.org issn 1852-4249

enhancement of photoacoustic detection of inhomogeneities in polymers

p. grondona,1 h. o. di rocco,2 d. i. iriarte,2 j. a. pomarico,2 h. f. ranea-sandoval,2∗ g. m. bilmes3

we report a series of experiments on laser pulsed photoacoustic excitation in turbid polymer samples, aimed at evaluating the sound speed in the samples and detecting the presence of inhomogeneities in the bulk.
we describe a system which allows the direct measurement of the speed of the detected waves by engraving the surface of the piece under study with a fiduciary pattern of black lines. we also describe how this pattern helps to enhance the sensitivity for the detection of an inhomogeneity in the bulk. these two facts are useful for studies in soft matter systems including, perhaps, biological samples. we have performed an experimental analysis on grilon® samples in different situations and we show the limitations of the method.

∗e-mail: hranea@exa.unicen.edu.ar 1 universidad nacional de rosario, facultad de ciencias bioquímicas y farmacéuticas, rosario (santa fe), argentina. 2 instituto de física "arroyo seco", universidad nacional del centro de la provincia de buenos aires, calle pinto 399, b7000ghg, tandil (buenos aires), argentina. 3 centro de investigaciones ópticas (conicet-cic) and facultad de ingeniería, universidad nacional de la plata, la plata, argentina.

i. introduction

in highly light-scattering materials, such as certain types of polymers, turbid liquids, glassy structures, and body organs, inspection and monitoring of internal features were made possible by means of x-ray irradiation until the development of ultrasound imaging. the former has the well-known disadvantage that in biological tissues it may trigger degenerative processes in the cells, and in non-biological samples, x-ray inspection is not always simple to perform directly in the production line. ultrasound imaging is very helpful in these situations. on the other hand, visible light optical tomography and optical topography are nowadays reaching the status of a clinical resource in the detection and monitoring of several types of tumors and for noninvasive evaluation of oxygenation of tissues in biological samples. in non-clinical applications they can be used for the detection of abnormal bodies within materials, which is of great importance in quality control in several areas of technology. these techniques were derived from the study of light propagating in turbid media, and applied afterward to biological samples and medical imaging of different parameters, often using polymers as phantoms of biological tissues [1-7]. the photoacoustic effect (pa) provides a method of analysis that has been used in clear fluids and has sufficiently proven its capability for detecting very low concentrations of absorbing species in a mixture or solution; it has also been used for the monitoring of molecular processes in different environments, as shown in references [7-17]. this paper intends to make a contribution to the application of the pa effect in soft matter, namely the detection of inclusions in polymer samples and the direct determination of the speed of sound in the material used for the samples. the pa technique has the advantage that acoustic waves do not scatter as light does over the characteristic lengths of many experimental situations. even if the excitation light undergoes scattering, the location of an inhomogeneity within the bulk of the sample can be achieved by detecting the remnant of the shock wave generated at an absorbing region or at an interface at which the speed of the sound waves changes.
repeating this inspection at other relative positions of the laser and the acoustic detector, and with the aid of a suitable algorithm, a sufficiently precise location of a single inhomogeneity of simple geometry can thus, in principle, be resolved, together with some information about its composition (using at least two wavelengths for the excitation), provided the speed of sound is known (see, for example, ref. [20]). an example of this is presented in ref. [21], in a rear-detection scheme used for detecting subsurface inhomogeneities in metals. the pa detection of bodies included in a turbid medium may provide complementary information to diffuse light propagation studies in that medium. namely, it could bring an independent value for the absorption coefficient, and it thus may help in the solution of the inverse problem in optical tomography of samples. the speed of sound determination relies on the fact that the acoustic signal picked up by the transducer arrives at times proportional to the distance from the laser beam that generates the shock wave to that transducer. a drawback with the photoacoustic method applied to turbid materials is that the light scattered by the bulk generates a pressure pulse on the detector if it is in contact with a free surface of the sample explored. consequently, the time of arrival of the pressure signal at the detector is insensitive to the relative position of the laser and the sensor. hence, the speed of the waves involved in the pa signal is difficult to determine and requires an adequate procedure to evaluate it. this is one of the motivations of this contribution. for this paper, we used a laser pulsed photoacoustic system equipped with a pzt in contact with the sample, made of the polymer grilon®, which is representative of a turbid medium, to show how the presence of controlled fiduciary absorbing regions on the surface of a sample can be used as local wave generators that allow the determination of the speed of waves in materials despite the light scattering described. we have engraved on the surface of the samples a pattern of stripes of absorbing material. in this way, we have a greater signal whose contribution may be discriminated from the signal generated by the light scattered by the sample material. we demonstrate that this fiduciary pattern is also useful to enhance the photoacoustic signal, and that from that signal the presence of inhomogeneities in a medium may be inferred. other successful recent approaches to the problem of detection of tumoral tissues in biological samples can be found in refs. [18, 19].

ii. experimental

a scheme of the pulsed photoacoustic system used in all the experiments is shown in fig. 1, which is essentially the same that can be used to determine the speed of sound in liquids and in clear samples.

figure 1: scheme of the experimental setup. the parts are: the nd3+:yag laser (ndl), the positioning device (x-ts), the pinhole (ph) to clip the laser beam (lb), the oscilloscope (do), and the amplifier with a dc power supply (aps). the do is synchronized with the laser via the laser pulse synchronization (lps) cable.

we used a pulsed nd3+:yag laser emitting at 1.06 µm with a pulse duration of approximately 10 ns, at energies between 0.5 and 50 mj. the laser beam was clipped by means of a pinhole in order to reduce the original laser beam size and to use
a uniform spot, thus reducing the power impinging on the samples. this pinhole was held at the far end of a beam dump for security reasons. in the results we present here, we have used two pinholes of 1 mm and 1.5 mm in diameter, which shall be specified in each experiment. this is the diameter of the laser spot impinging on the sample, as inferred from sensitive photographic paper. for acoustic detection, a ceramic 4 × 4 mm2 pzt transducer was strongly pressed against one of the free surfaces of the sample, namely the one normal to that facing the laser. the photoacoustic signals were amplified and processed by means of a tektronix tds 3032b, 300 mhz digital oscilloscope, averaging at least 64 signals before displaying the photoacoustic signal. the samples used were square-section parallelepipeds, 10 mm wide and 39 mm high, all made from the same polymer grilon® piece. the samples were placed in a c-clamp, with the pzt cage in one of its arms. the fiduciary pattern engraved on one of the faces of some samples consists of five grooves of approximately 1 mm width and 0.2 mm depth, filled with thick black paint, separated by 1 mm stripes of material which retain the natural turbid white color of the polymer (which we call "clear" for short), whose lengths are approximately 70% of the length of the face. a second type of sample, prepared in a similar fashion but with a centered cylindrical hole of 3 mm diameter drilled in it, parallel to the surfaces of the sample in all cases mentioned, was also used in the experiments in order to compare the signals with the former. this cavity was alternatively emptied or filled with deionized water. we call "sample 1" the one drilled with the cylindrical cavity, and "sample 2" the one without the hole. figure 2 is a sketch of sample 1 with a schematic representation of the fiduciary pattern used. the pzt and the laser beam relative positions are displayed, together with the approximate position of the cavity. we obtained two types of signals, those from samples without holes and those from samples with centered holes. each type was subdivided into signals taken with the laser impinging on the blank surface, and those taken with the laser impinging on the patterned surface. besides, there are signals obtained from the samples with cavities, either empty or filled with water.

figure 2: the grilon® sample prepared for the surface absorption experiments. the shadowed region (pzt) is the location of the transducer with respect to the impinging laser beam direction (lbd). the cavity is a 3 mm diameter hole, whose position (hp) is shown for the samples that have drilled cavities. the cavity may be empty or filled with water. the samples are 39 mm high and have a 10 mm square base. the sample has five grooves (fp) in its front face, painted in black to enhance absorption.

in each sample, the laser point of impact was moved from the farthest possible position to the nearest with respect to the pzt. this was accomplished by means of a 1 µm precision stepper-motor stage (zaber model t-la60a), controlled by a pc interface.

iii. results

in order to properly analyze the results, we calibrated the response of the system to increasing laser pulse energy. to this end, we irradiated a blank surface of sample 2 at a point near the center of the face, and plotted the amplitude of the first maximum of the acoustic signal as a function of the laser pulse energy. the result is displayed in fig. 3 and shows linearity in the energy range used.
in the same plot, we display three points (including the origin) which are the maxima of the signal at the same location of a striped sample face, but impinging on a black groove. as can be seen, the maximum of the signal nearly trebles for the same excitation energy. in both experiments, the pinhole used was 1.5 mm in diameter.

figure 3: pas vs. laser pulse energy. the linearity of the pa response (squares) to laser excitation is evident in the plots. the black dots represent the pa signal when the laser impinges on a black stripe nearly at the center of the front face of a striped sample.

after ascertaining the linearity of the response, and the fact that there is an evident dependence of the pas on the absorbance of the surface, we obtained a profile of three of the grooves of sample 2 by plotting the value of the amplitude of the first peak of the pa signal versus the relative distance between the pzt and the excited region, using the same pinhole as before. the result of this is shown in fig. 4. it can be seen that 1) the groove profile is neatly resolved, and 2) there is an improvement of the signal generated in the black stripes which decreases as the distance increases. since the stripes and the laser beam have approximately the same transverse size as the grooves, the resulting profile is somewhat rounded off, but this is not important for what we aim to prove here.

figure 4: the centers of clear bands and black grooves are clearly resolved by scanning the surface with the laser. the signals were taken at 100 µm displacement from each other. the energy of the pas decreases with the distance of the laser beam to the pzt. the increment in signal due to the grooves is more than 6-fold with respect to the signal due to the bulk polymer.

we could determine the speed of sound in the sample from a plot of the time position of the beginning of the first peak of the acoustic signal (arrival time), as a function of the distance between the impinging point on the sample and the pzt detector. but when we try to do that, experiments demonstrate that the time elapsed between the laser pulse triggering the digital oscilloscope and the appearance of the pa signal is the same regardless of the distance between the impinging laser beam and the detector, due mainly to the light scattered by the bulk of the polymer that hits the pzt. this poses a problem in the evaluation of the speed of the waves. to avoid that difficulty, we use the signal produced when the laser hits the black grooves, generated only by the absorption at the grooves. we obtain this by subtracting from the pas signal measured when the laser impinges on a black groove, the pas signal obtained in a "clear" region nearby. to this end, we moved the sample slightly away from the previous black stripe, so the first pas signal was obtained with the beam impinging on a black groove and the second pas signal was obtained with the beam impinging on a white stripe. figure 5 shows the determination of the speed of sound in sample 2 by this method. since the plot uses as input the maxima of the amplitude of the signals, the straight line does not cross the origin. the extrapolated value at the zero-crossing corresponds approximately to the amplitude of the first maximum of the signal in clear samples.
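the key processing step is the subtraction just described: the trace taken with the laser on a black groove minus the trace taken on a nearby "clear" stripe, which leaves only the wave generated at the groove and removes the background produced by bulk-scattered light reaching the pzt. a minimal sketch of this step is shown below; the array names, sampling step and the simple first-peak criterion are assumptions, not the authors' code:

```python
import numpy as np

def groove_only(pas_groove, pas_clear):
    """background-subtracted pa trace for one groove position."""
    return np.asarray(pas_groove, dtype=float) - np.asarray(pas_clear, dtype=float)

def first_peak_time(trace, dt, threshold=0.1):
    """time of the first maximum of the subtracted trace.

    the first excursion above `threshold` times the global maximum is followed
    forward to its local maximum; `dt` is the oscilloscope sampling step (s).
    """
    trace = np.asarray(trace, dtype=float)
    idx = int(np.flatnonzero(trace > threshold * trace.max())[0])
    while idx + 1 < len(trace) and trace[idx + 1] >= trace[idx]:
        idx += 1
    return idx * dt
```

the arrival times obtained this way for grooves at different distances from the pzt are the input of the linear fit discussed next.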
evaluation of the slope of the resulting line allows the calculation of the value of the speed, as v = (2333 ± 133) ms−1, which compares well with calculated values determined by using the properties of the polymer [22].

figure 5: the time of the first maximum of the processed signals vs. the position of the impinging point of the laser presents a linear correlation (r ≈ 0.994) that yields a value of v = (2333 ± 133) ms−1 for the speed of sound. all these signals were taken with a pinhole of 1 mm diameter.

since the vibration of the whole sample has distinctive frequencies, the fft of the pa signal provides another estimation of the speed of the waves, once the frequency sequence is properly found. the fast fourier transform (fft) analysis was used both to estimate the sound speed in the grilon® sample, via the frequencies identified in the spectrum and their spacing, knowing that the piece is a parallelepiped of known dimensions, and to define a scale for the energy of the pulse. the details of this procedure are straightforward calculations [23]. we found it useful to use the power spectrum for defining the energy instead of the integral of the temporal pulse, and that is the parameter we use in the presentation of the results. figures 6 and 7 show an fft treatment of the signals obtained from the following three cases: solid grilon® sample, sample with an empty cavity and sample with the hole filled with deionized water. the pas energy used in these figures is a measure of the energy content of the acoustic pulse, as evaluated from the power spectrum of the signal. in fig. 6, we display the results obtained impinging with the laser on clear faces, and in fig. 7 we show the results impinging with the laser on the patterned faces. it is clear that a distinctive feature arises near the center of the sample in the patterned faces, where a black stripe is located, which is not visible in the clear-face analysis.

figure 6: the acoustic energy (fft power integral of the pas) vs. the relative distance between the laser beam and the pzt in grilon® samples, in a face with no fiduciary pattern (clear sample). the energy diminishes as the distance to the detector increases. references in the inset: triangles represent the empty cavity in a clear sample, circles are for the cavity filled with water in clear samples, and rhombi are for clear samples without the cavity.

the acoustic energy deposited in the patterned faces is more than one order of magnitude higher than that obtained in the clear faces when the inclusion is present (compare fig. 6 with fig. 7). the water-filled hole and the empty hole are also clearly distinguishable from the solid grilon® response.

iv. analysis and conclusions

we have performed an experimental analysis of the photoacoustic signal in grilon® polymer, but it can be extended to other materials such as epoxy resins, also used as biological phantoms. we have shown that such applications are viable for quantitative determinations. the above results can be used to determine the presence and some optical characteristics of an inhomogeneity embedded in this type of material.
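two of the quantities used in the results above reduce to short numerical steps: the speed of sound comes from the slope of a linear fit of arrival time versus groove-to-pzt distance, and the "pas energy" is the integral of the power spectrum of each trace. the sketch below illustrates both; the numbers are synthetic placeholders chosen only to be consistent with v ≈ 2333 ms−1, not measured data:

```python
import numpy as np

# groove-to-pzt distances (m) and arrival times of the first maximum of the
# background-subtracted traces (s) -- illustrative placeholder values
d = np.array([6.5e-3, 7.5e-3, 8.5e-3, 9.5e-3, 10.5e-3])
t = np.array([4.55e-6, 4.97e-6, 5.41e-6, 5.83e-6, 6.26e-6])

slope, intercept = np.polyfit(d, t, 1)   # model: t = d / v + t0
print(f"speed of sound ~ {1.0 / slope:.0f} m/s")

def pa_energy(trace, dt):
    """energy content of a pa pulse as the integral of its power spectrum."""
    spectrum = np.abs(np.fft.rfft(trace)) ** 2
    df = np.fft.rfftfreq(len(trace), dt)[1]   # frequency bin width
    return float(np.sum(spectrum) * df)

# synthetic damped oscillation standing in for a recorded pa trace
dt = 1e-8
n = np.arange(2048)
trace = np.exp(-n * dt / 5e-6) * np.sin(2 * np.pi * 5e5 * n * dt)
print(f"pa pulse energy (a.u.): {pa_energy(trace, dt):.3e}")
```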
all the acoustic signals detected by the pzt in this soft turbid solid material begin at approximately the same time after the laser trigger fires, regardless of the relative distance from the impact point of the excitation to the pzt acoustic detector, due mainly to scattering, making this method useless to evaluate the speed of sound by the standard scanning procedure. the differences in those pa signals are difficult to analyze. therefore, to evaluate the speed of the acoustic waves and to gather information about the presence of a cavity or inhomogeneity in the polymer, we have developed a method that uses a regular pattern on the surface of the sample, consisting of parallel clear stripes of the base material and grooves filled with highly absorbent black paint. in the black grooves, the localized absorption provides a strong shock wave at the surface. by comparing the time of appearance of the signal at different positions on the surface, it is possible to estimate the speed of those waves in the polymer.

figure 7: the energy deposited by the laser on a white stripe and on the black grooves for the three cases analyzed, as indicated in the inset. the circles represent a grooved sample with the cavity filled with water. triangles and rhombi are for grooved samples with an empty cavity and with no cavity, respectively.

for each of the zones in the pattern, the pa curves undergo a change of shape and amplitude from the signals in the white zones to the signals obtained in the black stripes, this being strong evidence of the effect of the inhomogeneity on the signal. a power fft was performed on each signal in order to provide another means to determine whether a black or a white stripe is excited, and the integral of the fft provides a measure of the energy absorbed by the sample in each case. please note that the energy of the laser was fixed to a value that avoids bleaching of the paint, being in all cases below 500 µj per pulse. the signals generated at these inhomogeneities provide a well defined point of absorption and thus a definite path for the sound generated by the light-absorption mechanism, which is very distinct from other mechanisms of excitation of the pzt. it also reveals the presence of a surface inhomogeneity once the contributions of other sources of acoustic waves are identified by making suitable use of reference signals. all the conclusions of this work must be taken under the proviso that the pzt has a limited frequency band. although the above results were obtained for a soft polymer with a fiduciary painted pattern, they can be extended to other types of resins loaded with dyes or other absorbent particles. we are confident that with minor modifications the method can be used for the determination of properties of materials of biological interest as well. the results shown in figs. 6 and 7 confirm that a conveniently patterned surface of the sample, scanned while detecting the ultrasound signals, can be used to determine the presence of an inhomogeneity, although its precise location and size are not well defined by this procedure, and it should be complemented by similar determinations at other relative positions of the laser and the pzt. the increase in the signal with respect to the background material is at least one order of magnitude or better.
when using this technique in phantoms used in medical applications, one should take into account that there are limitations in several aspects, such as keeping the power involved in each pulse low enough to avoid any kind of damage, and the fact that other wavelengths would be better suited for biological tissues, which involve blood. other types of samples are currently being inspected with modifications of the procedure reported here, so as to adapt it to gelled phantoms. the conclusions are, in short, that the system is sensitive to the presence of the inhomogeneity, and that the higher absorbance of the painted stripes on the surface not only allows the evaluation of the speed of sound (which is essential to any tomographic technique) but also improves the detectivity by enhancing the energy released as mechanical waves. this is a non-trivial result since, in the modeling of the propagation of the laser light in the turbid substance, scattering is predominant, yet the method is still sufficient for the detection of inhomogeneities through changes in the absorption. the technique based on the pa effect is simple and has the advantage that it can be adapted to be used in larger samples or in samples of biological interest. the procedure of using a single acoustic detector for the signals produced by the laser scanning of the surface under study has an advantage over multiple detector arrangements in the sense that, with a suitable fiduciary pattern, the method can provide information about the speed of the waves involved in the signal. this is interesting because the data processing would not depend on generic information about its value.

acknowledgements

pg wants to thank the red nacional de laboratorios de óptica for financial help and partial funding during the experiments, and interu system for providing a grant for the completion of the experiments. this work was partially funded by universidad nacional del centro de la provincia de buenos aires, agencia nacional de promoción científica y tecnológica (pict 0570) and conicet (pip 384). hodr, dii, jap and hfrs are members of carrera del investigador científico, consejo nacional de investigaciones científicas y técnicas (argentina). gmb is a member of carrera del investigador científico, comisión de investigaciones científicas de la provincia de buenos aires (argentina). the authors wish to thank nicolás a. carbone for help in the final preparation of the manuscript.

[1] a ishimaru, wave propagation and scattering in random media, academic press, new york (1978). [2] j ripoll lorenzo, light diffusion in turbid media with biomedical applications, ph. d. thesis, universidad autónoma de madrid, spain (2000). [3] a k dunn, h bolay, m a moskowitz, d a boas, dynamic imaging of cerebral blood flow using laser speckle, j. cerebr. blood f. met. 21, 195 (2001). [4] p n den outer, th m nieuwenhuizen, ad lagendijk, location of objects in multiple-scattering media, j. opt. soc. am. a 10, 1209 (1993). [5] d a boas, m a o'leary, b chance, a g yodh, detection and characterization of optical inhomogeneities with diffuse photon density waves: a signal to noise analysis, appl. opt. 36, 75 (1997). [6] d contini, h liszka, a sassaroli, g zaccanti, imaging of highly turbid media by the absorption method, appl. opt. 35, 2315 (1996). [7] a c tam, applications of photoacoustic sensing techniques, rev. mod. phys. 58, 381 (1986). [8] c k n patel, a c tam, pulsed optoacoustic spectroscopy of condensed matter, rev. mod. phys. 53, 517 (1981).
[9] a a oraevsky, a a karabutov, optoacoustic tomography, in biomedical photonics, ed. tuan vo-dinh, crc press, chapter 17 (2002). [10] r o esenaliev, a a karabutov, a a oraevsky, ieee j. sel. topics in quantum electron. 5, 981 (1999). [11] l nicolaides, a mandelis, m munidasa, experimental and image-inversion optimization aspects of thermal wave diffraction tomography microscopy, aip conf. proc. 463, 8 (1998). [12] p c beard, photoacoustic imaging of blood vessel equivalent phantoms, proc. spie 4618, 54 (2002). [13] e zhang, j laufer, p beard, backward-mode multiwavelength photoacoustic scanner using a planar fabry-perot polymer film ultrasound sensor for high-resolution three-dimensional imaging of biological tissues, appl. opt. 47, 561 (2008). [14] s fantini, m a franceschini, e gratton, quantitative determination of the absorption spectra of chromophores in strongly scattered media: a light-emitting-diode based technique, appl. opt. 33, 5204 (1994). [15] g m bilmes, o e martínez, p seré, d j orzi, a pignotti, on line photoacoustic measurement of residual dirt on steel plates, aip conf. proc. 557, 1944 (2001). [16] k h song, e w stein, j a margenthaler, l v wang, noninvasive photoacoustic identification of sentinel lymph nodes containing methylene blue in vivo in a rat model, j. biomed. opt. 13, 054033 (2008). [17] z xu, ch li, l v wang, photoacoustic tomography of water in phantoms and tissue, j. biomed. opt. 15, 036019 (2010). [18] l xi, x li, l yao, design and evaluation of a hybrid photoacoustic tomography and diffuse optical tomography system for breast cancer detection, med. phys. 39, 2584 (2012). [19] b wang, q zhao, photoacoustic tomography and fluorescence molecular tomography: a comparative study based on indocyanine green, med. phys. 39, 2512 (2012). [20] m xu, l v wang, photoacoustic imaging in biomedicine, rev. sci. instrum. 77, 041101 (2006). [21] r takaue, h tobimatsu, m matsunaga, k hosokawa, detection of surface grooves and subsurface inhomogeneities in metals by transmission correlation photoacoustics, j. appl. phys. 59, 3975 (1986). [22] data on mechanical properties of grilon can be found in http://engr.bd.psu.edu/rxm61/metbd470/lectures/polymerproperties%20from%20ces.pdf and in http://www.inoxidable.com/propiedades1.htm. [23] p grondona, caracterización de un sistema fotoacústico en el ir cercano para estudios en medios turbios. algunas aplicaciones al estudio en fantomas de polímeros con inclusiones, master's thesis, universidad nacional de rosario, argentina (2009).

papers in physics, vol. 7, art. 070002 (2015) received: 20 november 2014, accepted: 10 march 2015 edited by: c. a. condat, g. j. sibona licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070002 www.papersinphysics.org issn 1852-4249

two distinct desynchronization processes caused by lesions in globally coupled neurons

fabiano a. s. ferrari,1∗ ricardo l. viana1

to accomplish a task, the brain works like a synchronized neuronal network where all the involved neurons work together. when a lesion spreads in the brain, depending on its evolution, it can reach a significant portion of a relevant area. as a consequence, a phase transition might occur: the neurons desynchronize and cannot perform a certain task anymore. lesions are responsible for either disrupting the neuronal connections or, in some cases, for killing the neurons.
in this work, we will use a simplified neuronal network model to show that these two types of lesions cause different types of desynchronization.

∗email: fabianosferrari@gmail.com 1 physics department, universidade federal do paraná, curitiba, brazil.

i. introduction

the neuronal dynamics can be represented as a dynamical system and a population of neurons as a neuronal network. the mean electrical field amplitude of a population of neurons has negligible values when the neurons are uncoupled or weakly coupled. this amplitude is enhanced when the coupling between them is high enough to make them synchronized among themselves [1]. in the synchronized state, it is possible to measure the mean electrical activity of a large number of nearby neurons using eeg [2, 3]. abnormalities or absence of synchronization have been reported as a consequence of neurodegenerative diseases [4, 5]. this dynamical effect is a consequence of topological changes caused by lesions spreading in the brain. however, every disease has its own features and, here, we propose to study the different dynamical effects caused by different types of lesions. measures of neuronal functional activity using eeg [6] and fmri [7] have shown spatiotemporal pattern formation. an explanation for this behavior is the emergence of a critical state in the neuronal dynamics, providing conditions for the formation of distinct clusters at the functional level [8]. when the neuronal population is considered as a complex network structure with a hierarchical-modular architecture, this high heterogeneity is related to a stretching of criticality and consequently increased functionality [9]. recent papers have shown functional differences between healthy and unhealthy patients with different neuropathologies [4, 10, 11]. schizophrenia, for example, has been related to neuronal decoupling [12]. unfortunately, many papers are constrained to the study of functional connections and the comparison between healthy and unhealthy patients. efforts have been made to explain the dynamical changes caused by lesions, and different models have been proposed to connect what happens at the neuronal level to what happens at the macroscopic level [8, 13]. nevertheless, a complete understanding of the dynamical effect of lesions in the brain is still missing. from the point of view of electrical activity, when the brain needs to execute a specific task, there is a group of neurons that synchronize and work together to perform it. when a lesion spreads in the brain, depending on its size, it can disrupt important connections and certain tasks cannot be done anymore. based on this hypothesis, we present here a simplified neuronal network model of globally coupled neurons to study the effects of the desynchronization induced by a lesion spreading randomly in the brain. we focus on two main cases: one in which the connections among neurons are disrupted and a second one in which the lesion kills the neurons. despite the simplicity of the model, the observed phase transitions from the synchronized to the desynchronized state show different properties for these two types of lesions.

ii. model

in this work, we consider a network of globally coupled (mean field) rulkov neurons. however, other neuronal models could be used and provide similar results, for example: kuramoto [14], hindmarsh-rose [15] and morris-lecar [16]. rulkov neurons are described by a fast variable x and a slow variable y.
the dynamics associated with each neuron in the network can be described as

$$x^{(j)}_{n+1} = \frac{\alpha^{(j)}}{1+\big(x^{(j)}_{n}\big)^{2}} + y^{(j)}_{n} + \frac{\varepsilon}{N}\sum_{i=1}^{N} x^{(i)}_{n}, \qquad (1)$$

$$y^{(j)}_{n+1} = y^{(j)}_{n} - \sigma x^{(j)}_{n} - \beta, \qquad (2)$$

where σ = β = 0.001, α(j) is a bifurcation parameter randomly chosen in the interval [4.1, 4.3], for which the neurons exhibit bursts, N is the network size and ε is the coupling strength [17]. the first step in our analysis is to choose an appropriate coupling strength such that the network becomes synchronized. to characterize phase synchronization, we will define a geometric phase for each neuron. considering one period of oscillation as the distance between two successive bursts, the phase of each neuron j is given as [1]

$$\phi^{(j)}_{n} = 2\pi k + 2\pi\,\frac{n - n^{(j)}_{k}}{n^{(j)}_{k+1} - n^{(j)}_{k}}, \qquad (3)$$

where k labels the k-th burst and n^{(j)}_k is the time at which the k-th burst of neuron j started. the phase synchronization can be quantified through kuramoto's order parameter,

$$R_{n} = \frac{1}{N}\left|\sum_{j=1}^{N} e^{\,i\phi^{(j)}_{n}}\right|, \qquad (4)$$

when this value is one, it means the network is fully synchronized in phase, and when this value is zero, it means the network is fully desynchronized [14]. it is known that neuronal networks described by eqs. (1) and (2) exhibit phase synchronization when the coupling strength is increased beyond a certain value [1], as we will show in the next section. the second step in our analysis is to study how lesions spreading in the network cause desynchronization. to study this, we will assume two different types of lesions:

type 1. lesions that disrupt the connections between the neurons.
type 2. lesions that kill neurons.

in fig. 1 (a), we show a representation of a network of globally coupled neurons; the effect of lesions of type 1 is represented in fig. 1 (b), while the effect of lesions of type 2 is shown in fig. 1 (c).

figure 1: network representation. panel (a) shows a fully connected network, panels (b) and (c) show the effect of lesions of type 1 and type 2, respectively. the dashed lines indicate where the damage caused by each lesion type is.

we also consider that, for each type of lesion, the coupling strength can be affected in three different ways:

reinforced coupling. for every new damaged neuron, the coupling strength is increased: ε(t) = ε0/(N − Nd).
invariant coupling. the coupling strength does not change with the lesion size, so ε(t) = ε0/N.
reduced coupling. for every new damaged neuron, the coupling strength is decreased: ε(t) = ε0/(N + Nd).

here, Nd is the number of disconnected neurons and ε0 is the initial coupling strength.

figure 2: the mean order parameter as a function of the coupling strength. the different colors represent different network sizes. here, 〈R〉 is the order parameter averaged over the whole network for a time series of 10000 discrete steps after 80000 transient steps.

iii. results and discussion

the first step is to find the values of the coupling strength for which the network shows phase synchronization. in fig. 2, we show that when the coupling strength is below εc = 0.02, the system is completely desynchronized (disregarding fluctuations ∼ 1/√N). above this critical value, the order parameter increases, and close to ε = 0.04, the network can be considered fully synchronized. based on that, for our results we will use ε0 = 0.04 as the initial coupling strength. from fig. 2, we can also see that the transition toward synchronization is invariant with respect to the network size.
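as an illustration of eqs. (1)-(4), the sketch below iterates a globally coupled rulkov network and evaluates the order parameter; it is a minimal reimplementation for clarity, not the authors' code, and the burst-onset criterion (a threshold crossing after a long quiescent stretch) is an assumption of this sketch. the network size and run length are reduced with respect to the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, steps, eps = 500, 20000, 0.04
alpha = rng.uniform(4.1, 4.3, N)          # bifurcation parameters alpha^(j)
sigma = beta = 0.001

x = rng.uniform(-1.0, 1.0, N)
y = -3.0 * np.ones(N)
burst_starts = [[] for _ in range(N)]     # times n_k at which each burst begins
quiet_count = np.zeros(N, dtype=int)

for n in range(steps):
    mean_field = eps * x.mean()                       # (eps/N) * sum_i x_i
    x, y = alpha / (1.0 + x**2) + y + mean_field, y - sigma * x - beta
    above = x > -1.0                                  # spiking part of a burst
    onset = above & (quiet_count > 30)                # long quiescence just ended
    for j in np.where(onset)[0]:
        burst_starts[j].append(n)
    quiet_count = np.where(above, 0, quiet_count + 1)

def phase(j, n):
    """geometric phase of eq. (3), defined between two recorded burst onsets."""
    t = np.asarray(burst_starts[j])
    k = np.searchsorted(t, n, side="right") - 1
    if k < 0 or k + 1 >= len(t):
        return None
    return 2*np.pi*k + 2*np.pi*(n - t[k]) / (t[k+1] - t[k])

def order_parameter(n):
    """kuramoto order parameter of eq. (4) over neurons with a defined phase."""
    ph = np.array([p for p in (phase(j, n) for j in range(N)) if p is not None])
    return float(np.abs(np.exp(1j * ph).mean())) if ph.size else 0.0

print(order_parameter(steps - 2000))      # expected to be close to 1 at eps = 0.04 (cf. fig. 2)
```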
for lesions of type 1, when ε(t) is reinforced after each new damaged neuron, the synchronization (characterized by the mean order parameter) decays linearly with the number of disconnected neurons (Nd), but the network only becomes completely desynchronized when all the neurons are lesioned, see fig. 3 (a). for the cases where ε(t) is invariant or reduced, we observe a roughly first order phase transition where the order parameter decreases linearly up to a critical number of lesioned neurons, Nd,critical, and the whole network desynchronizes. this is absent for the reinforced case because increasing the coupling strength increases Nd,critical, such that the first order phase transition is never observed. the three colored lines in fig. 3 (a) indicate that the smaller the coupling strength becomes, the faster the complete phase desynchronization happens. an interesting effect occurs for lesions of type 2, shown in fig. 3 (b). when the coupling strength is reinforced after each lesion, we do not observe desynchronization. this phenomenon is caused by the fact that when neurons die they do not contribute to the global effect in the network, and the remaining neurons, being more strongly connected, remain synchronized. this is not observed for the cases in which the coupling strength remains the same (green line) or is reduced (blue line). for lesions of type 2, in these cases the observed decay of the order parameter toward Nd,critical follows a roughly second order phase transition, see fig. 3 (b).

figure 3: the desynchronization process induced by lesions. panel (a): lesions of type 1; panel (b): lesions of type 2. the different colors indicate the three different coupling effects: reinforced, invariant and reduced (black, green and blue, respectively). here, Nd is the number of affected neurons (disrupted for (a) and killed for (b)) and N = 10000.

iv. conclusions

here, we have investigated two types of lesions and their effects. the presence of a phase transition from the synchronized to the desynchronized state was observed in all cases except for lesions of type 2 when the coupling is reinforced. we have observed that lesions of type 1 obey a roughly first order phase transition while lesions of type 2 obey a roughly second order phase transition. for both types of lesions, when the coupling strength is continuously reduced after each new damage, the network desynchronization is faster. based on that, increasing the coupling strength can be a strategy to compensate for the desynchronization effect induced by lesions, but this strategy is more effective for lesions of type 2. the two distinct phase transitions allow us to define a characterization scheme: if the synchronization decays linearly, we could say that we are dealing with a lesion of type 1, while if the synchronization decays non-linearly, then a lesion of type 2 could be the case. however, a mixture of events could also be observed, and then a characterization would become difficult to achieve.
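the lesion protocols compared in fig. 3 amount to simple bookkeeping on top of the network update: which neurons still send and receive the mean field, which neurons still evolve, and which denominator enters the coupling prefactor. the self-contained sketch below illustrates that bookkeeping (it is not the authors' code); as a cheap stand-in for the phase order parameter it scores synchrony by the temporal fluctuation of the mean field, which is large when bursts are aligned (an assumption of this sketch, not the measure used in the paper).

```python
import numpy as np

def run(N=500, Nd=0, lesion="type1", rule="invariant", eps0=0.04,
        steps=8000, transient=5000, seed=1):
    rng = np.random.default_rng(seed)
    alpha = rng.uniform(4.1, 4.3, N)
    sigma = beta = 0.001
    x, y = rng.uniform(-1.0, 1.0, N), -3.0 * np.ones(N)

    coupled = np.ones(N, dtype=bool)        # sends and receives the mean field
    alive = np.ones(N, dtype=bool)          # still iterated
    damaged = rng.choice(N, Nd, replace=False)
    coupled[damaged] = False                # type 1: connections disrupted
    if lesion == "type2":
        alive[damaged] = False              # type 2: killed neurons are frozen

    # reinforced / invariant / reduced coupling rules quoted in the text
    denom = {"reinforced": N - Nd, "invariant": N, "reduced": N + Nd}[rule]
    eps = eps0 / max(denom, 1)

    trace = []
    for n in range(steps):
        field = eps * x[alive & coupled].sum()
        drive = np.where(alive & coupled, field, 0.0)
        x_new = alpha / (1.0 + x**2) + y + drive
        y_new = y - sigma * x - beta
        x = np.where(alive, x_new, x)
        y = np.where(alive, y_new, y)
        if n >= transient:
            trace.append(x[alive].mean())
    return float(np.std(trace))             # rough synchrony score

for Nd in (0, 100, 250, 400):
    print(Nd, round(run(Nd=Nd, lesion="type1", rule="reduced"), 3))
```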
despite this fact, our results show that even simplified models are useful to understand and classify types of lesions, and advances in this direction could be helpful to understand the progress of neurodegenerative diseases.

acknowledgements

this work has financial support from the brazilian research agencies cnpq and capes. we acknowledge carlos a. s. batista for relevant discussions.

[1] c a s batista, e l lameu, a m batista, s r lopes, t pereira, g zamora-lópez, j kurths, r l viana, phase synchronization of bursting neurons in clustered small-world networks, phys. rev. e 86, 016211 (2012). [2] w o tatum, a m husain, s r benbadis, p w kaplan, handbook of eeg interpretation, demos medical publishing, new york (2008). [3] p l nunez, r srinivasan, electric fields of the brain: the neurophysics of eeg, oxford university press, new york (2006). [4] c j stam, b f jones, g nolte, m breakspear, ph scheltens, small-world networks and functional connectivity in alzheimer's disease, cereb. cortex 17, 92 (2006). [5] c y lo, p n wang, k h chou, j wang, y he, c p lin, diffusion tensor tractography reveals abnormal topological organization in structural cortical networks in alzheimer's disease, neurobiol. dis. 30, 16876 (2010). [6] c j stam, b w van dijk, synchronization likelihood: an unbiased measure of generalized synchronization in multivariate data sets, physica d 163, 236 (2002). [7] v m eguiluz, d r chialvo, g a cecchi, m baliki, a v apkarian, scale-free brain functional networks, phys. rev. lett. 94, 018102 (2005). [8] a haimovici, e tagliazucchi, p balenzuela, d r chialvo, brain organization into resting state networks emerges at criticality on a model of the human connectome, phys. rev. lett. 110, 178101 (2013). [9] p moretti, m a muñoz, griffiths phases and the stretching of criticality in brain networks, nat. comm. 4, 2521 (2013). [10] s c ponten, p tewarie, a j c slooter, c j stam, e van dellen, neural network modeling of eeg patterns in encephalopathy, j. clin. neurophysiol. 30, 545 (2013). [11] m p heuvel, o sporns, g collin, t scheewe, r c w mandl, w cahn, j goñi, h e hulshoff, r s kahn, abnormal rich club organization and functional brain dynamics in schizophrenia, jama psy. 70, 783 (2013). [12] j cabral, m l kringelbach, functional graph alterations in schizophrenia: a result from a global anatomic decoupling?, neuroimage 62, 1342 (2012). [13] j cabral, e huges, m l kringelbach, g deco, modeling the outcome of structural disconnection on resting-state functional connectivity, neuroimage 62, 1342 (2012). [14] y kuramoto, self-entrainment of a population of coupled non-linear oscillators, lect. notes phys. 49, 420 (1975). [15] s xia, l qi-shao, firing patterns and complete synchronization of coupled hindmarsh-rose neurons, chinese phys. 14, 77 (2005). [16] h wang, q lu, q wang, generation of firing rhythm patterns and synchronization in the morris-lecar neuron model, int. j. nonlinear sci. num. 6, 7 (2005). [17] n f rulkov, regularization of synchronized chaotic bursts, phys. rev. lett. 86, 183 (2001).

papers in physics, vol. 1, art. 010006 (2009) received: 10 november 2009, accepted: 26 november 2009 edited by: a. goñi licence: creative commons attribution 3.0 doi: 10.4279/pip.010006 www.papersinphysics.org issn 1852-4249

light effect in photoionization of traps in gan mesfets

h. arabshahi,1∗ a.
binesh2 ∗e-mail: arabshahi@um.ac.ir 1 department of physics, ferdowsi university of mashhad, p.o. box 91775-1436, mashhad, iran. 2 department of physics, payam-e-nour university, fariman, iran. the trapping of hot electrons by trap centers located in the buffer layer of a wurtzite-phase gan mesfet has been simulated using an ensemble monte carlo simulation. the results of the simulation show that the trap centers are responsible for current collapse in the gan mesfet at low temperatures. these electrical traps degrade the performance of the device at low temperature. in contrast, a light-induced increase in the trap-limited drain current results from the photoionization of trapped carriers and their return to the channel under the influence of the built-in electric field associated with the trapped charge distribution. the simulated device geometries and doping are matched to the nominal parameters described for the experimental structures as closely as possible, and the predicted drain current and other electrical characteristics for the simulated device including trapping center effects show close agreement with the available experimental data. i. introduction gan has become an attractive material for power transistors [1-3] due to its wide band gap, high breakdown electric field strength, and high thermal conductivity. it also has a relatively high electron saturation drift velocity and low relative permittivity, implying potential for high frequency performance. however, set against the virtues of the material are the disadvantages associated with material quality. gan substrates are not readily available, and the lattice mismatch of gan to the different substrate materials commonly used means that layers typically contain between 10^8 and 10^10 threading dislocations per cm^2. further, several types of electron traps occur in the device layers and have a significant effect on gan devices. in the search for greater power and speed, the different aspects that severely limit the output power of gan fets must be accounted for. it is found that the presence of trapping centers in the gan material is the most important phenomenon affecting current collapse in the output drain current of a gan mesfet. this effect was recently investigated experimentally in gan mesfets, and it was observed that the excess charge associated with the trapped electrons produces a depletion region in the conducting channel, which results in a severe reduction in drain current [4]. the effect can be reversed by liberating trapped electrons, either thermally by emission at elevated temperatures or optically by photoionization. there have been several experimental studies of the effect of trapping levels on current collapse in gan mesfets. for example, klein et al. [5-6] performed photoionization spectroscopy of traps in gan mesfet transistors and concluded that the current collapse resulted from charge trapping in the buffer layer. binari et al. [7] observed decreases in the drain current of a gan fet corresponding to deep trap centers located at 1.8 and 2.85 ev. in this work, we report a monte carlo simulation which is used to model electron transport in a wurtzite gan mesfet including trapping center effects.
this model is based upon the fact that, since light can release the trapped electrons that are responsible for current collapse, the dependence of this effect on the incident light wavelength should reflect the influence of trap centers on the hot electron transport properties of the device. this article is organized as follows. details of the device and of the trapping model used in the simulation are presented in section 2, and the results from the simulations carried out on the device are interpreted in section 3. ii. model, device and simulations an ensemble monte carlo simulation has been carried out to simulate the electron transport properties in a gan mesfet. the method simulates the motion of charge carriers through the device by following the progress of 10^4 superparticles. these particles are propagated classically between collisions according to their velocity, effective mass and the prevailing field. the selection of the propagation time, scattering mechanism and other related quantities is achieved by generating random numbers and using them to select, for example, a scattering mechanism. our self-consistent monte carlo simulation was performed using an analytical band structure model consisting of five nonparabolic ellipsoidal valleys. the scattering mechanisms considered for the model are acoustic and polar optical phonon, ionized impurity, piezoelectric and nonequivalent intervalley scattering. the nonequivalent intervalley scattering is between the γ1, γ3, u, m and k points. the parameters used for the present monte carlo simulations for wurtzite gan are the same as those used by arabshahi for mesfet transistors [8-9]. the device structure illustrated in figure 1.a is used in all simulations. the overall device length is 3.3 µm in the x-direction, and the device has a 0.3 µm gate length and 0.5 µm source and drain lengths. figure 1: (a) cross section of the wurtzite gan mesfet structure chosen in our simulation. the source and drain have low-resistance ohmic contacts, while the gate contact forms a schottky barrier between the metal and the semiconductor epilayer. (b) the instantaneous distribution of 10^4 particles at steady forward bias (drain voltage 50 v, gate voltage −1 v), superimposed on the mesh. note that in the simulation there are two types of superparticles: the mobile particles, which describe unbound electron flow through the device, and trapping center particles, which are fixed at the center of each electric field cell (in this case in the buffer layer only). the ellipse represents a trap center which is fixed at the center of an electric field cell and occupied by some mobile charges. the source and drain have ohmic contacts, and the gate forms a schottky contact with a barrier height of 1 ev to represent the contact potential at the au/pt gate. the source and drain regions are doped to 5 × 10^23 m^-3, and the top and bottom buffer layers are doped to 2 × 10^23 m^-3 and 1 × 10^22 m^-3, respectively. the effective source-to-gate and gate-to-drain separations are 0.8 µm and 1.2 µm, respectively. the large dimensions of the device require a long simulation time to ensure convergence of the simulator. the device is simulated at room temperature and at 420 k. in the interests of simplicity it is assumed that there is just a single trap with associated energy level e_t in all or just part of the device.
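before describing the trapping model further, the free-flight/scattering-selection cycle described above can be sketched as follows. this is only a minimal illustration under simplified assumptions (a single particle, a constant total scattering rate and placeholder mechanism rates); it is not the authors' code, and the helper names free_flight_time and choose_mechanism are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def free_flight_time(gamma_total):
    # with a constant total (self-scattering) rate, the free-flight time is t = -ln(r)/gamma
    return -np.log(rng.random()) / gamma_total

def choose_mechanism(rates):
    # pick a scattering mechanism with probability proportional to its rate
    cum = np.cumsum(rates)
    return int(np.searchsorted(cum, rng.random() * cum[-1]))

# illustrative rates (1/s) standing in for acoustic phonon, polar optical phonon,
# ionized impurity, piezoelectric and intervalley scattering (placeholder values)
rates = np.array([1e12, 5e12, 2e12, 0.5e12, 1e12])
gamma_total = rates.sum()

t = free_flight_time(gamma_total)   # drift classically for time t in the prevailing field
mech = choose_mechanism(rates)      # then scatter via the selected mechanism
print(t, mech)
```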
further, it is assumed that only electrons may be captured from the conduction band by the trap centers, which have a capture cross-section σn and are neutral when unoccupied, and may only be emitted from an occupied center to the conduction band. we use the standard model of carrier trapping and emission [9-10]. for including trapping center effects, the following assumption has been considered. the superparticles in the ensemble monte carlo simulation are assumed to be of two types. there are mobile particles that represent unbound electrons throughout the device. however, the particles may also undergo spontaneous capture by the trap centers distributed in the device. the other type of superparticles are trapping centers that are fixed at the center of each mesh cell. as illustrated in figure 1.b, each trap center has the capacity to trap a finite amount of mobile electronic charge from particles that are in its vicinity and reside in the lowest conduction band valley. the vicinity is defined as exactly the area covered by the electric field mesh cell. the finite capacity of the trapping center in each cell of a specific region in the device is set by a density parameter in the simulation programme. the simulation itself is carried out by the following sequence of events. first, the device is initialized with a specific trap which is characterized by its density as a function of position, a trap energy level and a capture cross-section. then at a specific gate bias the source-drain voltage is applied. some of the mobile charges passing from the source to the drain in each timestep can be trapped by the centers with a probability which is dependent on the trap cross-section and particle velocity in the cell occupied at the relevant time t. the quantity of charge that is captured from a passing mobile particle is the product of this probability and the charge on it. this charge is deducted from the charge of the mobile particle and added to the fixed charge of the trap center. the emission of charge is simulated using the emission probability. any charge emitted from a trap center is evenly distributed to all mobile particles in the same field cell. such capture and emission simulations are performed for the entire mesh in the device and information on the ensemble of particles is recorded in the usual way. iii. results the application of a high drain-source voltage causes hot electrons to be injected into the buffer layer where they are trapped by trap centers. the trapped electrons produce a depletion region in the channel of the gan mesfet which tends to pinch off the device and reduce the drain current. this effect can be reversed by any factor which substantially increases the electron emission rate from the trapped centers, such as the elevated temperatures considered previously. here we consider the effect of exposure to light [11-13]. there have been several experimental investigations of the influence of light on the device characteristics. binari et al. [6] were the first to experimentally study the current collapse in gan mesfets as a function of temperature and illumination. they showed that the photoionization of trapped electrons in the high-resistivity gan layers and the subsequent return of these electrons to the conduction band could reverse the drain current collapse. 
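as an aside, the per-cell capture and emission bookkeeping of section 2 can be sketched as follows. this is a schematic illustration only: the expressions used for the capture probability and for the emitted fraction are placeholders standing in for the standard trapping model of refs. [9-10], and the parameter values are merely illustrative.

```python
import numpy as np

def capture_step(charges, speeds, trapped, capacity, sigma_n, n_t, dt):
    # each passing mobile superparticle loses charge with probability ~ sigma_n*v*n_t*dt
    # (schematic form, capped at 1); a cell cannot trap beyond its preset capacity
    for i in range(len(charges)):
        p = min(sigma_n * speeds[i] * n_t * dt, 1.0)
        dq = min(p * charges[i], capacity - trapped)
        charges[i] -= dq
        trapped += dq
    return trapped

def emission_step(charges, trapped, e_n, dt):
    # charge emitted at rate e_n (thermal plus optical) is shared evenly among
    # the mobile superparticles occupying the same field cell
    dq = trapped * min(e_n * dt, 1.0)
    if len(charges) > 0 and dq > 0:
        charges += dq / len(charges)
        trapped -= dq
    return trapped

# toy example for a single cell and a single time step (illustrative numbers)
charges = np.array([1.0, 0.8, 1.2])      # superparticle charges (arbitrary units)
speeds = np.array([1e5, 2e5, 1.5e5])     # m/s
trapped = capture_step(charges, speeds, 0.0, capacity=2.0,
                       sigma_n=6e-21, n_t=1e22, dt=1e-15)
trapped = emission_step(charges, trapped, e_n=1e9, dt=1e-15)
print(charges, trapped)
```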
their measurements were carried out as a function of incident light wavelength, with values in the range of 380 nm to 720 nm, corresponding to photon energies up to 3.25 ev, which is close to the gan band gap. their results show that when the photon energy exceeds the trap energy, the electrons are quickly emitted and a normal set of drain characteristics is observed. to examine the photoionization effect in our simulations, the thermal emission rate e_n^t was changed to e_n^t + e_n^o, where e_n^o ∼ σ_n^o φ is the optical emission rate, with σ_n^o the optical capture cross-section and φ the photon flux density given by φ = i/(hν) = iλ/(hc), (1) where i is the light intensity, ν is the radiation frequency and λ is the incident light wavelength. our modeling of photoionization effects in gan mesfets is based on parameters used by binari and klein [5-7]. figure 2: i-v characteristics of a gan mesfet under optical and thermal emission of trapped electrons (solid curve) and thermal emission of trapped electrons only (dashed curve) at two different temperatures. (a) at t = 300 k with trap centers at 1.8 ev and illuminated with a photon energy of 2.07 ev. (b) at t = 420 k with trap centers at 2.85 ev and illuminated with a photon energy of 3.1 ev. the simulations were all carried out for two different deep trap centers, both with a concentration of 10^22 m^-3, with photoionization threshold energies at 1.8 and 2.85 ev and capture cross-sections of 6 × 10^-21 m^2 and 2.8 × 10^-19 m^2, respectively. a fixed incident light intensity of 5 w m^-2 at photon energies of 2.07 ev and 3.1 ev is used. the simulations have been performed at a sufficiently high temperature (420 k) for both thermal and optical emission to be significant, as well as at room temperature. figure 2a illustrates the effect on the drain current characteristics of exposure of the device to light at room temperature. the gan mesfet has a deep trap center at 1.8 ev and is illuminated at a photon energy of 2.07 ev. it can be seen that under illumination the i-v curves generally exhibit a larger drain current, especially at higher drain voltages, reflecting the fact that the density of trapped electrons is much lower. simulations have also been performed at 420 k for a device with deep level traps at 2.85 ev. the simulation results in figure 2b, for illumination at a photon energy of 3.1 ev, are compared with the collapsed i-v curves in the absence of light. comparison of figures 2a and 2b shows that the currents are generally higher at 420 k and that the light has less effect at the higher temperature. iv. conclusions the dependence upon light intensity (exposure) of the reversal of current collapse was simulated in a gan mesfet for a single trapping center. traps in the simulated device produce a serious reduction in the drain current and consequently in the output power of the gan mesfet. the drain current behavior as a function of illumination photon energy was also studied. our results show that as the temperature and photon energy are increased, the collapsed drain current curve moves up toward the non-collapsed curve due to increased emission of trapped electrons. acknowledgements the authors wish to thank m. g. paeezi for helpful comments and a critical reading of the manuscript. [1] b gil, group-iii nitride semiconductor compounds, oxford science pub. (1998). [2] m a khan, m s shur, algan/gan metal oxide semiconductor heterostructure field effect transistor, mater. sci. eng.
b 42, 69 (1997). [3] p b klein, s c binari, j a freitas, a e wickenden, photoionization spectroscopy of traps in gan metal-semiconductor field-effect transistors, j. appl. phys. 88, 2843 (2000). [4] m a khan, m s shur, q c chen, j n kuznia, low frequency noise in gan metal semiconductor and metal oxide semiconductor field effect transistors, electron. lett. 30, 2175 (1994). [5] p b klein, s c binari, j a freitas, a e wickenden, observation of deep traps responsible for current collapse in gan metal-semiconductor field-effect transistors, j. appl. phys. 88, 2843 (2000). [6] p b klein, j a freitas, s c binari, a e wickenden, algan/gan heterostructure field-effect transistor model including thermal effects, appl. phys. lett. 75, 4016 (1999). [7] s c binari, w kruppa, h b dietrich, g kelner, a e wickenden, j a freitas, trapping effects and microwave power performance in algan/gan hemts, solid state electron. 41, 1549 (1997). [8] h arabshahi, monte carlo simulations of electron transport in wurtzite phase gan mesfet including trapping effect, modern phys. lett. b 20, 787 (2006). [9] h arabshahi, the frequency response and effect of trap parameters on the characteristic of gan mesfets, the journal of damghan university of basic sciences 1, 45 (2007). [10] s trassaert, b boudart, c gaquiere, investigation of trap-induced current collapse in gan devices, a1404 orsay france, 127 (1999). [11] a kastalsky, s luryi, a c gossard, w k chan, switching in nerfet circuits, ieee electron device lett. 6, 347 (1985). [12] j c inkson, deep impurities in semiconductors. ii. the optical cross section, j. phys. c: solid state phys. 14, 1093 (1981). [13] d v lang, r a logan, m jaros, monte carlo evaluations of degeneracy and interface roughness effects, phys. rev. b 19, 1015 (1979). papers in physics, vol. 6, art. 060009 (2014) received: 11 september 2014, accepted: 10 october 2014 edited by: l. a. pugnaloni reviewed by: l. staron, cnrs, université pierre et marie curie, institut jean le rond d'alembert, paris, france. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060009 www.papersinphysics.org issn 1852-4249 granular discharge rate for submerged hoppers t. j. wilson,1, 2 c. r. pfeifer,1, 3 n. meysingier,1, 4 d. j. durian1∗ ∗e-mail: djdurian@physics.upenn.edu 1 department of physics & astronomy, university of pennsylvania, philadelphia, pa 19104-6396, usa. 2 department of physics, illinois wesleyan university, bloomington, il 61702-2900, usa. 3 department of physics & astronomy, carleton college, northfield, mn 55057, usa. 4 strath haven high school, wallingford, pa 19086, usa. the discharge of spherical grains from a hole in the bottom of a right circular cylinder is measured with the entire system underwater. we find that the discharge rate depends on filling height, in contrast to the well-known case of dry non-cohesive grains. it is further surprising that the rate increases, by up to about twenty-five percent, as the hopper empties and the granular pressure head decreases. for deep filling, where the discharge rate is constant, we measure the behavior as a function of both grain and hole diameters. the discharge rate scale is set by the product of hole area and the terminal falling speed of isolated grains, but there is a small-hole cutoff of about two and a half grain diameters, which is larger than the analogous cutoff in the beverloo equation for dry grains. i. introduction the flow of granular materials is of widespread practical [1, 2] and fundamental [3, 4] interest. an important example is the gravity-driven flow of grains from an hourglass or flat-bottomed hopper. for ordinary fluids, the discharge rate is proportional to the pressure head, which is set by filling height, and decreases continuously to zero with vanishing hole size.
the discharge of grains is strikingly different. in particular, for dry non-cohesive grains, the mass per unit time discharged from a hole in the bottom of a hopper is accurately described by the empirical beverloo equation [5]: w = c ρ_b g^{1/2} (D − k d)^{5/2}. (1) here ρ_b is the mass density of the bulk granular medium, g = 980 cm/s^2, D is the hole diameter, d is the grain diameter, and c and k are dimensionless fitting parameters [5]. the filling height plays no role. the container diameter, and hence the grain-grain contact pressure at the bottom of the container, as given by the janssen argument [6], also plays no role. intuitively, “transient arches” intermittently form and break over the hole, shielding the exiting grains from any sort of pressure head. then the rough scale for discharge is the free-fall speed sqrt(gD) times the hole area, i.e., w ∼ ρ_b g^{1/2} D^{5/2}. beverloo et al. plotted their data as w^{2/5} versus D and found a straight line, but such that w vanishes at a nonzero small-hole cutoff diameter of kd [7]. this has been rationalized by an “empty annulus” around the perimeter of the hole, through which grain centers cannot pass. nevertheless, fundamental justification of eq. (1) remains a topic of on-going interest [8, 9]. the beverloo eq. (1) is supported by a large number of experiments, as reviewed by nedderman et al. [7], but discrepancies of up to forty percent have been reported when the hole size is increased more widely than usual [10]. recently, we have found excellent agreement with the beverloo equation for up to three [11] and four [12] decades in discharge rate for spherical grains. typical ranges for the numerical coefficients are 0.5 < c < 0.7 and 1.2 < k < 3, with values near the low end for spherical grains. discharge rates can be increased, paradoxically, by placing an obstacle over the hole [17–19], and for small enough holes, stable arches can form and cause a clog [20]. related behavior has been reported for the upward discharge of bubbles in an underwater silo [13], for particles on a conveyor belt [14], and for disks floating on a fluid that flows through an orifice [15, 16]. an important challenge is to relate all such phenomena to the velocity and density fields, which can be measured in quasi-2d [9, 21–25] and index-matched [26] systems. the physical intuition behind the transient arch and empty annulus concepts is appealing, and helps explain the contrast between granular and newtonian fluids. but it has proven difficult to translate into a first-principles theory of hopper discharge and clogging. this could be because the jamming and unjamming of grains in the converging flow near the boundary is not only collective, but is also even more difficult to model than jamming in uniform systems [27]. as an experimental approach to alter transient arching and grain free-fall, we previously explored the effects of tilting the hopper and, hence, the plane of the hole away from horizontal [11, 12]. now in this paper, we perturb transient arching in a totally different way by submerging the entire system underwater.
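before turning to the submerged case, eq. (1) for dry grains is easy to evaluate numerically. the sketch below is illustrative only; the coefficient values c = 0.58 and k = 1.5 are hypothetical choices within the typical ranges quoted above, not values fitted by the authors.

```python
def beverloo_rate(D, d, rho_b, c=0.58, k=1.5, g=980.0):
    """dry-grain discharge rate w = c * rho_b * sqrt(g) * (D - k*d)**2.5, cgs units (g/s).
    c and k are illustrative values within the typical ranges quoted above."""
    if D <= k * d:
        return 0.0  # below the small-hole cutoff
    return c * rho_b * g ** 0.5 * (D - k * d) ** 2.5

# example: 1 mm glass beads (bulk density ~1.5 g/ml) discharging through a 1 cm hole
print(beverloo_rate(D=1.0, d=0.1, rho_b=1.5))   # roughly 18 g/s
```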
here the grain dynamics become overdamped rather than inertial, and the characteristic grain exit speed becomes set by the fluid and the grain size, rather than by free-fall and the hole size. we find not only a change in the scale of eq. (1), but also in the small-hole cutoff. this is our main focus. in addition, we find an unexpected dependence of discharge rate on filling height, opposite in sign to that for ordinary liquids, which will be the subject of further experiments [28]. ii. materials and methods the granular media consist of monodisperse spherical beads, primarily of glass (potter industries), but also of lead (mcmaster-carr). the average and standard deviation of the grain diameters d, the grain material density ρ_g, and the resulting terminal falling speed in water v_t, are collected in table 1. the latter are computed using an accurate empirical formula [29] for the dimensionless drag coefficient c_d versus reynolds number re = ρ v_t d/η, where ρ and η are respectively the density and viscosity of water. this is done by equating gravity and drag forces, ∆m g = c_d ρ v_t^2 a/2, where ∆m = (4/3)π (ρ_g − ρ)(d/2)^3 and a = π (d/2)^2, and solving numerically for v_t. the packings are further characterized by the draining angle of repose, θ_r, as measured in air and under water. the results in table 1 show that θ_r is larger for glass beads when dry, but larger for lead beads when wet. also, θ_r is noticeably larger for the d = 0.11 mm glass beads. finally, note that the polydispersity is about 20% for the d = 0.11 mm glass beads and the lead beads, but is 1−6% for the other three glass bead samples. two different hopper types are used. the first is constructed from 93 mm diameter transparent cylindrical plastic (polyethylene terephthalate) jars with screw-on plastic lids of equal diameter. for these, an outlet hole of desired diameter D is drilled into a plastic lid. nearly two dozen different hole diameters are used, each drilled into a different lid. the hole diameters are measured by calipers with an uncertainty of ±0.02 cm. an inlet for water is opened on the other end of the jar in order to prevent a back-flow of water up into the hopper to replace the lost volume of discharged grains. the other hopper type consists of plastic graduated cylinders, of diameter 50 mm or 38 mm, with a single hole drilled into the bottom. all discharge measurements are conducted with the hopper fixed to a sturdy aluminum stand, all completely underwater in a large aquarium. prior to use, the glass beads are submerged and repeatedly poured back and forth between two containers in the same aquarium in order to allow all air to escape. the grains are then poured slowly into the hopper, without exposure to air, and allowed to settle with the outlet hole blocked. as such, the packings have a solids volume fraction φ near random-close packing (table 1). for large holes, flow commences immediately after the hole is unblocked. for small holes, gentle tapping is required to start the discharge. in either case, the discharge rate is measured only after the flow has proceeded long enough that a conical depression fully develops at the top of the packing.
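the drag-balance calculation of v_t quoted in table 1 below can be reproduced along the following lines. this is a sketch under stated assumptions: the schiller-naumann correlation is used here as a convenient stand-in for the drag correlation of ref. [29], so the numbers are only approximate, and terminal_speed is a hypothetical helper name.

```python
import numpy as np
from scipy.optimize import brentq

def terminal_speed(d, rho_g, rho=1.0, eta=0.01, g=980.0):
    """solve (4/3)*pi*(rho_g - rho)*(d/2)**3*g = cd(re)*rho*vt**2*pi*(d/2)**2/2 for vt.
    cgs units; cd(re) is the schiller-naumann correlation, a stand-in for ref. [29],
    reasonable up to re of order 10^3."""
    weight = (4.0 / 3.0) * np.pi * (rho_g - rho) * (d / 2) ** 3 * g  # buoyancy-corrected weight
    area = np.pi * (d / 2) ** 2
    def residual(vt):
        re = rho * vt * d / eta
        cd = 24.0 / re * (1.0 + 0.15 * re ** 0.687)
        return 0.5 * cd * rho * vt ** 2 * area - weight
    return brentq(residual, 1e-6, 1e4)

# example: the 0.11 mm glass beads of table 1 (rho_g = 2.50 g/ml) give vt of about 0.9 cm/s
print(terminal_speed(d=0.011, rho_g=2.50))
```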
grains | d (mm)      | ρ_g (g/ml)    | v_t (cm/s) | re   | φ           | θ_r dry (deg) | θ_r wet (deg)
glass  | 0.11 ± 0.02 | 2.499 ± 0.007 | 0.94       | 1    | 0.57 ± 0.01 | 29.8 ± 0.2    | 26.5 ± 0.3
glass  | 0.31 ± 0.02 | 2.504 ± 0.003 | 4.0        | 12   | 0.59 ± 0.01 | 24.8 ± 0.3    | 22.7 ± 0.3
glass  | 0.96 ± 0.05 | 2.519 ± 0.007 | 15         | 150  | 0.62 ± 0.01 | 24.6 ± 0.3    | 21.5 ± 0.3
glass  | 3.00 ± 0.04 | 2.590 ± 0.010 | 36         | 1100 | 0.57 ± 0.01 | 24.2 ± 0.3    | 19.8 ± 0.8
lead   | 1.0 ± 0.2   | 10.9 ± 0.2    | 49         | 520  | 0.62 ± 0.01 | 17 ± 1        | 22 ± 3
table 1: granular materials properties: d is the average grain diameter, ρ_g is the density of individual grains, v_t is the terminal falling speed of individual grains in water, re is the corresponding reynolds number, φ is the volume fraction of grains in the packing, and θ_r is the draining angle of repose as measured for both dry and submerged packings. measurement uncertainties are also given, except for the grain diameter, where the standard deviation of the size distribution is reported. the height H of the packing is then measured as the level of the grains at the container wall. the height of the packing over the hole is then estimated as h = H − (D/2) tan θ_r. the mass per unit time discharge rate, w, is found by starting/stopping a timer while inserting/removing a cup from the discharge stream. the collection cup is then removed from the aquarium and weighed, carefully topped off with water. the mass of the same cup, filled with water but no grains, is also determined. the difference between the two mass measurements is ∆m = v_g (ρ_g − ρ), where v_g is the volume of grains. this gives the mass of the collected grains as m_g = v_g ρ_g = ∆m ρ_g/(ρ_g − ρ). the discharge rate, w, is then computed as m_g divided by the duration of the collection time. the collection times range from a few seconds to several hours, and m_g ranges from about 10 g to 250 g for the glass beads. collection of grains is begun only after a steady-state depression is formed at the top surface of the packing; thus the possibility of initial transients, which could depend on initial packing fraction [30], is not investigated. iii. w versus packing height a prudent preliminary task is to measure whether or not the discharge rate w for submerged grains is independent of the packing height h, as in the case of dry grains. so in fig. 1 we show w versus h data for d = 1 mm beads, of glass and lead, for holes of various diameters. the rates all appear to approach a constant for very large packing heights. therefore, the driving pressure on the grains at the outlet must be shielded from the weight of the packing due to transient arch formation, as for dry grains. for smaller packing heights, however, the discharge rate data in fig. 1 are not constant. rather, the rates all increase as the hopper empties. this is surprising, because the sign of the effect is opposite to that for ordinary fluids, where the rate decreases with the diminishing pressure head as the container empties. similar behavior has been noticed for submerged hoppers [31], and also for dry grains in a cylindrical tube with a conical orifice [32]. preliminary data, with a new apparatus, show that the discharge increase upon emptying is much smaller for dry grains, on the order of one percent [28]. we are unaware of any theoretical explanation. in fact, both the µ(i) flow law and discrete contact dynamics simulations for two-dimensional dry grains predict a decrease in discharge rate as the hopper empties [33, 34]. the former approach has been extended to suspensions [35], but not yet applied to hopper flows. the variation of discharge rate with packing height is roughly exponential.
figure 1: discharge rate w (g/s) versus packing height h (cm) for (a-d) d = 1 mm diameter glass beads and (e) d = 1 mm diameter lead beads, and different hole diameters D as labeled: (a) D = 16.4 mm, (b) D = 7.9 mm, (c) D = 5.1 mm for hopper diameters of 93, 50 and 38 mm, (d) D = 3.7 mm, and (e) D = 7.9 mm for the lead beads. note that time proceeds right to left. the hopper diameter is 93 mm, except as noted in (c). in (d), data are also included where the container is sealed and a fixed flow of water, w_f/(ρ_g φ) = 17 ml/min, is pumped into the top at the same volumetric rate at which grains exit the hole at bottom. the horizontal error bars indicate the range of packing heights over which the discharge rate is measured. the vertical error bars indicate the standard deviation of 3-5 repetitions. the solid curves are fits to w(h) = w_o {1 + 0.25 exp[−h/(10 cm)]}, where w_o is the only adjustable parameter, to show that the height-dependence is similar for different discharge geometries. fits to the form w(h) = w_o {1 + 0.25 exp[−h/(10 cm)]} are shown by solid curves in fig. 1, where the asymptotic discharge rate w_o is the only fitting parameter. this function has no theoretical basis, as yet, but is a simple form that allows us both to estimate w_o and to compare the displayed data sets. the fits are quite good, and thus illustrate (i) that the size of the surge is about 25% at most, and (ii) that the characteristic height is about 10 cm. these two numbers represent the average of individual fitting results for all hole sizes for the 93 mm hoppers, when all three parameters are allowed to float. for the 93, 50 and 38 mm diameter hoppers, shown in fig. 1(c), the individually-fitted decay lengths are 14.6 ± 3, 8.7 ± 0.4 and 4.9 ± 0.8 cm, respectively. thus, the decay length appears to scale with, and be somewhat larger than, the hopper diameter. a factor of order one might have been expected if janssen-type wall effects were primarily responsible. nonetheless, one hypothesis for the surge would be that the grain-grain contact pressure over the outlet decreases as the hopper empties, and a resulting subtle decrease in packing fraction allows for greater fluidity and hence greater discharge rate. however, this seems ruled out by ref. [36], where data show a decrease of pressure but no change in discharge rate. whatever the cause of the packing height dependence of discharge rate, the fluid clearly plays a role. so it is logical to consider interstitial flow between the grains. certainly, such flow is generated as the grains move apart and out of the hole. a second hypothesis would be that some liquid is drawn down through the whole packing, even more easily and rapidly as the hopper empties, which would cause the grain discharge rate to increase. for large packing heights, such fluid flow would vanish and the grain discharge rate would become constant. as a test, we sealed off the top of a 93 mm diameter hopper and connected it to a gear pump for enforcing a constant interstitial liquid flow rate. the liquid pump rate was set to w_o/(ρ_g φ), so that grains and fluid can flow inside the hopper together, in unison, with the same macroscopic velocity field. this would be the condition for very tall packings, but is now enforced for all packing heights.
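as a quick back-of-the-envelope check on this matched pump rate (a sketch only; the value of w_o below is merely read off the plotted range in fig. 1(d)):

```python
# grains leave at w_o (g/s); dividing by rho_g*phi converts this to the volumetric
# rate at which the packing (grains plus interstitial water) moves down the hopper
w_o = 0.47       # g/s, roughly the level of the D = 3.7 mm data in fig. 1(d)
rho_g = 2.519    # g/ml, 1 mm glass beads (table 1)
phi = 0.62       # packing fraction (table 1)
q = w_o / (rho_g * phi)   # ml/s
print(q * 60)             # about 18 ml/min, consistent with the 17 ml/min quoted in fig. 1
```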
grain discharge rate data are plotted in fig. 1(d), on top of prior data for an open-topped hopper with no liquid pumping. except for one perhaps-spurious point, the two data sets are indistinguishable in showing the same rate increase as the hopper empties. therefore, interstitial fluid flow down through the packing is not responsible for the surprising variation of discharge rate with packing height. the fluid and the walls probably play a role, but the mechanism is not known. further experiments are now in progress to address this issue [28]. iv. w_o versus hole diameter as our main task, we now turn to the variation of discharge rate with the hole diameter D, in the limit of large packing heights where the rate has a constant value w_o. for this, we collect data for 93 mm diameter hoppers with packing heights over the range 25 cm < h < 28 cm; this procedure agrees well with results from the exponential fits illustrated in fig. 1. the simplest scale for discharge rate is w_o ∝ ρ_g v_t D^2, where ρ_g is the density of the grain material and v_t is the terminal falling speed for an individual grain in water. note that v_t depends on fluid and grain properties, and scales with grain diameter as d^2 at low re but as d^{1/2} at high re (see section ii). this contrasts with dry grains, where the discharge speed depends on grain size only for very small holes. imposing a beverloo-like cutoff by the substitution D → (D − kd), we arrive at the expectation w_o = c ρ_g v_t (D − kd)^2, (2) = c (ρ_g v_t d^2) [(D/d) − k]^2. (3) similar beverloo-type forms with cutoff have been found for the upward discharge of bubbles in a quasi-2d silo [13] and for particles on a conveyor belt [14]. consideration of v_t based on a single grain warrants caution, since there must be a more complex flow of the interstitial fluid in response to granular shear and dilation near the aperture. for comparison of data with eq. (3), fig. 2 shows a dimensionless plot of discharge rate measurements as [w_o/(ρ_g v_t d^2)]^{1/2} versus D/d. indeed, for glass beads of diameter 0.3, 1, and 3 mm, we find excellent collapse of the data onto a single line that vanishes at non-zero D/d. this analysis is analogous to the famous beverloo plot of w^{2/5} versus D/d for dry grains [7]. here the main plot is double-logarithmic to illustrate a range of roughly two decades in dimensionless hole diameter and five decades in dimensionless discharge rate; the inset has linear axes with a smaller range so that the functional form versus hole diameter is evident. data for the lead and the 0.1 mm glass beads show similar linear behavior, but do not collapse onto the results for the larger glass beads. perhaps this is because their polydispersities are significantly larger, as seen in table 1. two lines are shown in fig. 2 that correspond to eq. (3) and match the two groupings of data. the proportionality constants, c, are close to 2/3 and 1/2. it is reassuring that these values are of order one, but it is more interesting that both lines vanish at the same ratio of hole to grain diameter. figure 2: mass discharge rate w_o versus hole diameter D, in the large packing-height limit, for several grain types as labelled (glass beads with d = 0.1, 0.3, 1 and 3 mm, and lead beads with d = 1 mm). the axes are made dimensionless by the grain diameter d, the density ρ_g of individual grains, and the terminal falling speed v_t of individual grains in water: [w_o/(ρ_g v_t d^2)]^{1/2} is plotted versus D/d.
the packing height range is 25 cm < h < 28 cm, and the hopper diameter is 93 mm. vertical error bars, given by the standard deviation of 3-5 repetitions, are smaller than the symbol size except for the d = 0.1 mm glass beads, as shown. the solid curves represent the lines 0.67(D/d − 2.4) and 0.47(D/d − 2.4), which vanish at D/d = 2.4, as specified in the legend. fits to w_o^{1/2} ∝ (D/d − k) give the average cutoff as k = 2.4 ± 0.1 grain diameters. based on individual linear fits to the separate data sets, this small-hole cutoff is k = 2.4 ± 0.1. by contrast, for dry spherical grains, the beverloo cutoff for eq. (1) is k = 1.5, and for air bubbles in water it is k = 0.66. therefore, the cutoff involves dynamics and cannot be explained by purely geometrical concepts like the “empty annulus”. alternatively, our data could also be taken as further evidence for the point of view that the discharge rate should be described by a functional form in which k = 1 is enforced [9, 10]. v. conclusion in summary, we have observed a striking dependence of the discharge rate on packing height, such that the rate surges as the hopper empties. the mechanism is not yet understood, but may involve wall effects and the flow of liquid in between grains as they come apart near the exit. the latter is reminiscent of an increase of flux due to dilation of grains under a fixed obstacle placed above the hole [17–19]. but here the relevant length scale appears to be the hopper diameter, rather than the hole diameter, and the effect vanishes for very tall packing heights where the discharge rate is constant. in this regime, for two spherical grain types and a wide range of grain diameters, we find excellent agreement between discharge rate data and a modified beverloo-like equation over five decades in dimensionless discharge rate. this empirical result could be of practical use, in the same vein as the beverloo equation. while the basic scale is set by dimensional analysis, we find the existence of a small-hole cutoff such that the discharge rate extrapolates to zero as the hole diameter is decreased to 2.4 times the grain diameter. for dry grains, the cutoff is significantly smaller; this has been known for over fifty years but remains unexplained based on the underlying microscopic physics. the contrast with our findings suggests a missing ingredient: grain dynamics, rather than geometry alone, play a crucial role. acknowledgements we thank ted brzinski, charles thomas, and adam roth for helpful discussions and assistance. this work was supported by the nsf through grants dmr-1305199 and mrsec/dmr-112090 (reu program tjw and crp). [1] r m nedderman, statics and kinematics of granular materials, cambridge university press, ny (1992). [2] f j muzzio, t shinbrot, b j glasser, powder technology in the pharmaceutical industry: the need to catch up fast, powder technol. 124, 1 (2002). [3] h m jaeger, s r nagel, r p behringer, granular solids, liquids, and gases, rev. mod. phys. 68, 1259 (1996). [4] j duran, sands, powders, and grains: an introduction to the physics of granular materials, springer, ny (2000). [5] w a beverloo, h a leniger, j van de velde, the flow of granular solids through orifices, chem. eng. sci. 15, 260 (1961). [6] l vanel, e clément, pressure screening and fluctuations at the bottom of a granular column, eur. phys. j. b 11, 525 (1999).
[7] r m nedderman, u tuzun, s b savage, g t houlsby, the flow of granular materials-1: discharge rates from hoppers, chem. eng. sci. 37, 1597 (1982). [8] j e hilton, p w cleary, granular flow during hopper discharge, phys. rev. e 84, 011307 (2011). [9] a janda, i zuriguel, d maza, flow rate of particles through apertures obtained from self-similar density and velocity profiles, phys. rev. lett. 108, 248001 (2012). [10] c mankoc, a janda, r arévalo, j m pastor, i zuriguel, a garcimartín, d maza, the flow rate of granular materials through an orifice, granul. matter 9, 407 (2007). [11] h g sheldon, d j durian, granular discharge and clogging for tilted hoppers, granul. matter 12, 579 (2010). [12] c c thomas, d j durian, geometry dependence of the clogging transition in tilted hoppers, phys. rev. e 87, 052201 (2013). [13] y bertho, c becco, n vandewalle, dense bubble flow in a silo: an unusual flow of a dispersed medium, phys. rev. e 73, 056309 (2006). [14] m a aguirre, j g grande, a calvo, l a pugnaloni, j-c géminard, pressure independence of granular flow through an aperture, phys. rev. lett. 104, 238002 (2010). [15] a guariguata, m a pascall, m w glimer, a k sum, e d sloan, c a koh, d t wu, jamming of particles in a two-dimensional fluid-driven flow, phys. rev. e 86, 061311 (2012). [16] p g lafond, m w glimer, c a koh, e d sloan, d t wu, a k sum, orifice jamming of fluid-driven granular flow, phys. rev. e 87, 042204 (2013). [17] i zuriguel, a janda, a garcimartín, c lozano, r arévalo, d maza, silo clogging reduction by the presence of an obstacle, phys. rev. lett. 107, 278001 (2011). [18] c lozano, a janda, a garcimartín, d maza, i zuriguel, flow and clogging in a silo with an obstacle above the orifice, phys. rev. e 86, 031306 (2012). [19] f alonso-marroquin, s i azeezullah, s a galindo-torres, l m olsen-kettle, bottlenecks in granular flow: when does an obstacle increase the flow rate in an hourglass?, phys. rev. e 85, 020301 (2012). [20] k to, p y lai, h k pak, jamming of granular flow in a two-dimensional hopper, phys. rev. lett. 86, 71 (2001). [21] j choi, a kudrolli, r r rosales, m z bazant, diffusion and mixing in gravity-driven dense granular flows, phys. rev. lett. 92, 174301 (2004). [22] j choi, a kudrolli, m z bazant, velocity profile of granular flows inside silos and hoppers, j. phys.: cond. matter 17, s2533 (2005). [23] j f wambaugh, r r hartley, r p behringer, force networks and elasticity in granular silos, eur. phys. j. e 32, 135 (2010). [24] d d chen, k w desmond, e r weeks, topological rearrangements and stress fluctuations in quasi-two-dimensional hopper flow of emulsions, soft matter 8, 10486 (2012). [25] c c kuo, m dennin, buckling-induced jamming in channel flow of particle rafts, phys. rev. e 87, 030201 (2013). [26] a v orpe, a kudrolli, velocity correlations in dense granular flows observed with internal imaging, phys. rev. lett. 98, 238001 (2007). [27] a j liu, s r nagel, the jamming transition and the marginally jammed solid, ann. rev. cond. matt. phys. 1, 347 (2010). [28] j koivisto, d j durian, unpublished (2014). [29] f a morrison, data correlation for drag coefficient for sphere, department of chemical engineering, michigan technological university, houghton, mi, www.chem.mtu.edu/~fmorriso/datacorrelationforspheredrag2010.pdf (accessed june 26, 2012).
this work gives a convenient empirical function that accurately matches experimental data between the limits of c_d = 24/re at small re and c_d ≈ 1/2 at large re. [30] l rondon, o pouliquen, p aussillous, granular collapse in a fluid: role of the initial volume fraction, phys. fluids 23, 073301 (2011). [31] a kudrolli, private communication regarding experiments done with a orpe (2012). [32] t le pennec, k j maloy, e g flekkoy, j c messager, m ammi, silo hiccups: dynamic effects of dilatancy in granular flow, phys. fluids 10, 3072 (1998). [33] l staron, p y lagrée, s popinet, the granular silo as a continuum plastic flow: the hourglass vs the clepsydra, phys. fluids 24, 103301 (2012). [34] l staron, p y lagrée, s popinet, continuum simulation of the discharge of the granular silo. a validation test for the µ(i) visco-plastic flow law, eur. phys. j. e 37, 5 (2014). [35] f boyer, e guazzelli, o pouliquen, unifying suspension and granular rheology, phys. rev. lett. 107, 188301 (2011). [36] c perge, m a aguirre, p gago, l a pugnaloni, d le tourneau, j-c géminard, evolution of pressure profiles during the discharge of a silo, phys. rev. e 85, 021303 (2012). papers in physics, vol. 1, art. 010008 (2009) received: 9 november 2009, accepted: 29 december 2009 edited by: j. j. niemela reviewed by: h. cabrera morales (inst. venezolano invest. científicas, venezuela) licence: creative commons attribution 3.0 doi: 10.4279/pip.010008 www.papersinphysics.org issn 1852-4249 surface percolation and growth. an alternative scheme for breaking the diffraction limit in optical patterning d. kunik,1, 2∗ l. i. pietrasanta,2, 3 o. e. martínez1, 2 ∗e-mail: dkunik@df.uba.ar 1 departamento de física, universidad de buenos aires, buenos aires, argentina. 2 consejo nacional de investigaciones científicas y técnicas, argentina. 3 centro de microscopías avanzadas, facultad de ciencias exactas, universidad de buenos aires, buenos aires, argentina. a nanopatterning scheme is presented by which the structure height can be controlled in the tens of nanometers range and the lateral resolution is at least a factor of three better than the point spread function of the writing beam. the method relies on the initiation of the polymerization mediated by a very inefficient energy transfer from a fluorescent dye molecule after single photon absorption. the mechanism has the following distinctive steps: the dye adsorbs on the substrate surface with a higher concentration than in the bulk; upon illumination it triggers the polymerization; then isolated islands develop and merge into a uniform structure (percolation), which subsequently grows until the illumination is interrupted. this percolation mechanism has a threshold that introduces the nonlinearity needed for the fabrication of structures beyond the diffraction limit. i. introduction the development of new techniques for the fabrication of smaller and smaller structures has become an objective of great relevance for many fields in science and technology [1, 2]. this includes the semiconductor industry [2], mems [3, 4], biology [4–7], microfluidics [8], and material science and technology, among the most demanding [2]. techniques include optical lithography, scanning electron lithography [1, 2], dip-pen patterning [9], magnetolithography [10, 11], ion milling [12], and many others. among these techniques, optical lithography is the most widely used due to its inherent simplicity, mature
development, fast production speed and less stringent ambient requirements [1, 2]. far-field optical lithography, both in projection and in scanning, has a first-principles limitation in size reduction, the diffraction limit, as light cannot be focused below a size of about half the wavelength used. the main approach to circumvent this problem is to reduce the light wavelength, and this is the roadmap traced by the semiconductor industry [13], now using sources below 100 nm, mostly from synchrotron radiation but also from other new developments such as short wavelength lasers [13, 14]. another approach is to avoid the far-field limit by near-field approaches [15], but these techniques suffer from the same drawbacks as other sophisticated scanning methods, slow throughput and small scanning areas, unless very sophisticated tricks are developed [16]. it is now well recognized that the diffraction limit is derived assuming a linear response of the medium to the light, and exploiting a nonlinear response can be a way to circumvent the limitation imposed by diffraction [17]. this approach has been used in many microscopy techniques and also in optical lithography, with the main advantage of allowing the construction of three-dimensional structures, but the increase in resolution was marginal due to the need for longer optical wavelengths to target the same material transitions with more photons [17–19]. in the last decade, new microscopic techniques with super-resolution have been developed that rely on some complex nonlinearities, such as stimulated emission depletion (sted) [20]. recently, several researchers have reported ways to use such nonlinear methods and switching strategies for photolithography [21–23]. in this work we present a new concept on how to circumvent the diffraction limit in optical lithography by placing the nonlinearity not in the light-matter interaction but in the material growth mechanism itself. the technique uses a dye molecule that initiates the polymerization reaction from an excited state with a very low efficiency. the mechanism relies on the following distinctive steps: (1) mixture of the dye molecules and polymer in adequate proportions. (2) deposition of a drop of the mixture on a transparent substrate. (3) adsorption of a fraction of the dye molecules on the substrate material. (4) illumination of the mixture with a focused beam through the substrate at the absorption band of the dye. (5) initiation of the polymerization by a very low quantum yield energy transfer from the dye to the polymer. (6) percolation of the structure growing at the substrate surface. (7) deposition of fresh dye molecules from the bulk onto the growing surface. (8) photoinitiation of the fresh molecules as the structure grows. we will show how these steps allow the controlled growth of the structures with 10 nm resolution in the vertical direction (the direction of propagation of the light), and that the lateral resolution can be increased by a factor of at least three, as compared to the diffraction limit. figure 1: experimental setup. ii. materials and methods the experimental setup is shown in fig. 1 and is similar to that used in previous works [7, 24, 25].
the main difference is that the photocurable resin was not triggered by exciting the uv initiator by two-photon absorption directly; instead, the energy transfer that activates the polymerization was mediated by a dye molecule excited by one-photon absorption, as described in [24]. different blends of dyes and resins were prepared, and a drop of the blend was deposited onto a cover slip positioned in an inverted microscope setup equipped with a motorized stage. the laser beam was focused onto the sample by means of a high numerical aperture air objective (uplansapo 40x, na = 0.9, olympus, tokyo, japan). control of the focus was made by means of a ccd camera imaging the back reflection of the laser beam, and the sample focus was adjusted in order to minimize the image size in the ccd camera [25]. because the resin had a refractive index matching that of the glass cover slip used as substrate, the surface had to be focused before the blend drop was placed. the choice of the laser used to excite the dye that triggers the polymerization process depends on the absorption spectrum of the dye. the results shown in this work correspond to cyanine (ads675mt, american dye source, quebec, canada) and oxazine dyes (nile blue 690 perchlorate, exciton, ohio, usa) excited with a he-ne laser emitting at 632.8 nm, but similar results were obtained with different infrared cyanine dyes (ads740, ads760 and ads775pi, american dye source, quebec, canada, and hitc iodide and lds821, exciton, ohio, usa) excited with a ti:sapphire laser in cw mode tuned to match the absorption spectrum of these dyes. the laser power was adjusted by inserting neutral density filters in the beam path, and the exposure time was controlled with the scanning speed. in order to draw the desired pattern, a shutter was used to turn the illumination on and off while the scanning was being made. tapping-mode afm was performed in dry nitrogen using a nanoscope iiia multimode afm (digital instruments-veeco metrology, santa barbara, ca, usa), and images were acquired simultaneously with the height and phase signals. images were processed by flattening, using the nanoscope software, to remove the background slope. different polymer-dye blends were prepared, changing both the uv curing adhesive (noa 60, noa 63 or noa 65, norland products, cranbury, nj, usa) and the dyes used. to ensure that the uv photo-sensitizer played no role in the process, a special batch of noa60 resin without the uv sensitive additive was provided by the manufacturer (norland products), which yielded similar results. the following steps were taken: (1) select an adequate cosolvent for both the dye and the adhesive; in most cases ethanol, methanol and acetone worked well, and we used methanol for the assays reported here. (2) make a concentrated solution of the dye in the solvent, typically 10 mm in methanol. (3) add the dye solution to the polymer resin up to the desired concentration. (4) place a coverslip on the microscope and set the focus at the surface. (5) place a drop of blend on the coverslip. (6) scan the laser to draw the desired structure. (7) rinse the coverslip by immersing it in ethanol and acetone; a final rinse of a few seconds in methylene chloride is helpful to efficiently remove the remaining unpolymerized resin. iii. results in order to study the different stages of the process, we fabricated different samples under specially designed conditions.
after selecting the polymer and the dye, three parameters remain for the control of the size of the structure: (a) the dye concentration, (b) the beam intensity and (c) the light exposure time (or scan velocity). figure 2: different stages of the surface percolation process. the upper inset is an afm image of a wide line showing the location of the subsequent detailed images. afm height images taken from the periphery (a) toward the centre of the structure (f). at the periphery, small clusters are observed. towards the center of the structure (higher laser intensity), the size of the clusters grows until percolation takes place. the size of each image was 500 nm × 500 nm and the displacement between adjacent images was approximately 400 nm. figure 2 shows afm images of a wide structure fabricated with a mixture of the optical adhesive noa 63 and the laser dye ads675mt. the dye concentration was 1 mm. the sample was illuminated with a he-ne laser beam and the power at the sample was 1 mw. the laser beam was defocused in order to produce a broad line about 5 µm wide, in such a way that in a cross section the different stages of the polymerization process can be visualized. the scan speed was 10 µm s−1, and the different degree of coverage can be detected as one moves towards the line center (maximum intensity). the structure was scanned in detail in six different regions going from the periphery of the structure (a) towards the center of the line (f). the different scans are 500 nm × 500 nm in size and located at 400 nm steps from each other. the first scan shows very isolated polymer islands of different sizes with heights from a few nm to about 7 nm. as the center is approached, the size of the islands grows, but the height only grows marginally, showing a percolation process as more dye molecules located at the surface start the polymerization process at different locations. towards the center, the percolation phenomenon becomes evident, with islands merging towards a larger and larger coverage, but with only a minor increase in the height. with an increase in the intensity, by an improvement in the focusing of the beam, a total coverage is obtained, as shown in fig. 3, still maintaining 13 nm in height. figure 3: structure at the percolation threshold showing full coverage. (a) 3d reconstruction of an afm topographic image. (b) average cross section. the laser power was 35 µw, the scanning velocity was 16 µm s−1 and the blend used was a mix of the optical adhesive noa60 and the dye ads675mt at a concentration of 1 mm. once the percolation threshold is reached, the structure grows slowly, as more dye molecules are adsorbed on the fresh polymer surface. this continues until the illumination is turned off or the beam walks away from that portion of the surface. different concentrations and speeds were used in order to fabricate lines of different heights and widths. the concentrations used were 1 mm, 500 µm, 100 µm, 50 µm, and 10 µm of a noa60-ads675mt blend. for each dye concentration, afm topographies in tapping mode and phase images were obtained for lines manufactured at different speeds with the same beam intensity, in order to cover a wide range of stages of the percolation and growth process. the phase images (not shown here) helped to easily distinguish the polymer from the substrate due to their different hardness.
when the volume dye concentration is very large (1 mm and 500 µm), the surface percolation phenomenon can be observed (as shown already for 1 mm), but the growth was hindered by the fact that at these high concentrations a volume reaction takes place. this is due to the existence of a second threshold fluence that gives rise to a volume percolation, as will be discussed in the next section. figure 4: height of the lines as a function of the exposure time for samples with dye concentrations of 50 µm (squares) and 100 µm (circles). examples of structures (topographic afm images) below, at and above the percolation threshold are also shown. in fig. 4, a plot of the maximum height of the lines as a function of the exposure time (defined as the ratio between the beam size and the scan speed) is presented for the samples with dye concentrations 100 µm (squares) and 50 µm (circles). the beam power at the sample was 290 µw (100 µm) and 460 µw (50 µm), and the beam diameter, full width at half maximum (fwhm), was 800 nm. for the sample made with 100 µm, the speed was varied from line to line in steps of 0.2 µm s−1, from 2.2 µm s−1 for the slowest scan (higher line) to 3.8 µm s−1 for the fastest one (lower line), while for the sample made with 50 µm the range of the speed was 1.4 µm s−1 to 3 µm s−1. from the afm topographies, the surface percolation threshold time t_thi (threshold time obtained from the image) was determined and is indicated in the plot. we also show three characteristic lines at both sides of the percolation threshold of the 100 µm sample. as the scanning speed is reduced, the exposure time increases and the structures generated show a transition from isolated islands to full coverage and, afterwards, a gradual growth in height. a similar experiment was performed for the five concentrations indicated before. however, for the 10 µm sample, the percolation threshold was not reached. for the samples with concentrations of 1 mm and 500 µm, after the surface percolation was reached, the structure did not grow gradually but instead grew suddenly from less than 20 nm to 1 µm or more. the experimental data presented in fig. 4 show a clear nonlinear growth mechanism with an apparent linear asymptotic behavior for large exposure times. it can also be observed that the linear asymptote crosses the time axis at approximately the threshold exposure time. figure 5: subdiffraction structures made with the blend noa63-nile blue 1 mm. normalized psf of the laser beam (red) and afm profile of the structure (blue). inset: 3d reconstruction of the subdiffraction structures topography. figure 5 shows the lateral reduction below the diffraction limit of structures fabricated with nile blue 690 perchlorate and noa63. the point spread function (psf) is plotted (red) together with a line profile of the structures (blue). a 3d topographic reconstruction of the structures is also shown as an inset. the height of the structures was 67 nm and the width (fwhm) was 265 nm. the psf of the beam was determined by measuring the scattered light from a gold nanoparticle 80 nm in diameter, and the width was 800 nm (fwhm). the lateral resolution is at least three times better than the beam size, considering that the tip shape was not deconvolved in the measurement. this reduction below the diffraction limit is clear evidence of the nonlinear growth mechanism also shown in fig. 4.
similar results were obtained for many combinations of dye molecules and uv curing resins. the common denominator of all the dyes used, which included cyanines and oxazines, is that they are all good laser dyes (meaning high fluorescence efficiency, low triplet formation, low excited state absorption). experiments with methylene blue were not successful (no polymerization was observed) even when the dye was bleached. another common aspect is that all the dyes used have the chromophore positively charged and they were all very poorly soluble in the polymer blend. iv. discussion as shown in the previous section, the process has two very distinctive stages. a first stage in which the polymerization process takes place mainly at the substrate surface with the formation of isolated islands that gradually merge, and a second stage after the surface percolation in which the structure grows gradually until the illumination is turned off. as only marginal growth is detected during the percolation process. we will discuss two models in order to explain the two stages separately. i. percolation model as the smallest islands at the very beginning of the process are too small to be measured with our afm microscope (24 nm radius tip) we cannot accurately determine the island size distribution. hence, we will only model this stage semi-quantitatively and we will show that the surface percolation process requires a large adsorption of dye molecules at the substrate surface. if this is not the case, or else if the volume concentration is too large, a volume percolation precedes the surface percolation and a thick structure of the size of the illuminated volume is generated. in order to compare these two situations, we modeled the percolation process as follows: (a) we assumed that the dye molecules were distributed in an ordered lattice with a distance between neighbors given by the concentration used. (b) at the substrate surface, a similar lattice was 010008-5 papers in physics, vol. 1, art. 010008 (2009) / d. kunik et al. assumed but with a different neighbor distance kept as a parameter. despite the fact that in the experiment the dye molecules are actually randomly distributed, we found that adding this to the model did not modify the result. the result being that at equal nearest neighbor distance the volume percolation precedes the surface percolation. it was assumed that each dye molecule (lattice site) could initiate the polymerization process at random with a probability that increases with the illumination time and intensity. each polymerization triggered generated a sphere of polymer with a gaussian distribution in size. we found no significant differences in the results by changing the variance of the distribution. after a given time, the process is stopped, each sphere is linked with any neighboring sphere if their distance is smaller that the sum of the radii and only the spheres that are linked (directly or through other neighbors) to the substrate are kept (sample rinsing). the results of some typical situations are shown in fig. 6. for very low concentrations or light doses, isolated islands appear at the surface because the structures generated in the bulk do not touch each other and are washed. as the surface concentration is increased, a more densely packed structure develops, as shown in the sequence of fig. 6 (a) to (c). in this last case, the surface is fully covered by the polymer spheres. 
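the percolation stage just described can be condensed into a short numerical sketch. this is not the authors' code: the function name, lattice spacings, triggering probability and sphere-size parameters below are illustrative assumptions, chosen only to reproduce the ingredients of the model (dye sites on a fine surface lattice and a coarser bulk lattice, random triggering, gaussian-sized polymer spheres, linking of overlapping spheres, and a rinsing step that keeps only clusters anchored to the substrate).

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

rng = np.random.default_rng(0)

def percolation_snapshot(d_surf=20.0, d_bulk=60.0, box=1000.0, depth=300.0,
                         p_trigger=0.3, r_mean=15.0, r_sigma=3.0):
    """toy version of the model: dye sites on a surface lattice (spacing d_surf, z = 0)
    and a coarser bulk lattice (spacing d_bulk); each site polymerizes with probability
    p_trigger into a sphere of gaussian-distributed radius; spheres that overlap are
    linked, and only clusters anchored to the substrate survive the rinsing step."""
    xs = np.arange(0.0, box, d_surf)
    surf = np.array([(x, y, 0.0) for x in xs for y in xs])
    xb = np.arange(0.0, box, d_bulk)
    zb = np.arange(d_bulk, depth, d_bulk)
    bulk = np.array([(x, y, z) for x in xb for y in xb for z in zb])
    sites = np.vstack([surf, bulk])

    triggered = rng.random(len(sites)) < p_trigger
    centers = sites[triggered]
    radii = np.clip(rng.normal(r_mean, r_sigma, len(centers)), 1.0, None)

    # link spheres whose centre-to-centre distance is smaller than the sum of the radii
    pairs = cKDTree(centers).query_pairs(2.0 * radii.max(), output_type='ndarray')
    d = np.linalg.norm(centers[pairs[:, 0]] - centers[pairs[:, 1]], axis=1)
    pairs = pairs[d < radii[pairs[:, 0]] + radii[pairs[:, 1]]]
    n = len(centers)
    adj = csr_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(n, n))
    _, labels = connected_components(adj, directed=False)

    # "rinsing": keep only clusters containing a sphere that touches the substrate
    anchored = centers[:, 2] < radii
    kept = np.isin(labels, np.unique(labels[anchored]))
    return centers[kept], radii[kept]

# lowering d_surf at fixed d_bulk moves the outcome from isolated islands to full surface
# coverage, while d_surf comparable to d_bulk favors bulk (volume) percolation instead.
```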
if the surface concentration is kept with a nearest neighbor distance larger than about one half of the bulk average distance, the surface percolation never occurs before a massive reaction takes place due to the percolation of the structure in bulk. this is shown in the simulation of fig. 6(d), where the same distance between neighbors was used for the surface and the bulk. in brief, the surface percolation process as modeled yields structures with typical sizes that depend on the nearest neighbor distance, and a much lower distance (high surface concentration) is required if the surface phenomenon is to prevail over the bulk process. this result, together with the fact that at the percolation threshold lower lines were obtained with high concentrations, indicates that the substrate surface was not saturated and fully covered by the dye molecules. the surface dye concentration grows with the volume concentration.

figure 6: simulations of the percolation process. spheres with average radius r are randomly created. (a), (b) and (c) show structures simulated as the distance between neighbors ds was diminished. (d) shows the volume percolation for the case where the nearest neighbor distance in bulk equals that at the surface.

this also explains why at higher concentrations (1 mm and 500 µm) no growth could be observed; instead, a sudden bulk polymerization of the entire illuminated volume took place. it also explains why, in the low concentration case (10 µm), the surface concentration was too low and hence the large distance between neighbors (larger than the size of the isolated islands) did not allow the percolation process to take place.

ii. growth model

once the surface is fully covered, the growth of the structure requires the adsorption of fresh dye molecules at the surface of the growing polymer structure. we will model this situation by assuming, in a simple manner, that the growth rate follows the rate equation

$$\frac{dh}{dt} = \nu\,\phi_p\, I\, \sigma\, \rho, \qquad (1)$$

where $I$ is the beam intensity, $\sigma$ is the absorption cross section of the dye molecule, $\rho$ is the surface dye concentration, $\phi_p$ is the efficiency of the initiation of the polymerization process (the inverse of the number of photons required to trigger one polymerization reaction) and $\nu$ is a characteristic volume indicating the size of the structure created once a polymerization event is triggered. this simple model is consistent with the assumption that each molecule can trigger only one event and yields a typical size of the polymer structure or, alternatively, that each molecule can catalyze more than one event and the size is proportional to the number of polymer chains initiated.

figure 7: fit of the data for the sample made at 100 µm dye concentration.

to complete the description, an equation is needed for the dynamics of the surface density $\rho$. such an equation should take into account the adsorption-desorption process and the dye consumption due to the reaction, and it can be written as

$$\frac{d\rho}{dt} = -\kappa\rho + \alpha c(\rho_0 - \rho) - \phi_p I \sigma \rho, \qquad (2)$$

where $\kappa$ is the desorption rate, $\alpha$ is the adsorption or sticking coefficient, proportional to the number of collisions one molecule has with the polymer surface and the probability of sticking to that surface, $c$ is the volume concentration, assumed to be constant, and $\rho_0$ is the surface density of sites for the dye molecule ($\rho_0 - \rho$ being the density of available sites).

figure 8: fit of the data for the sample made at 50 µm dye concentration.
the three components on the right hand side of the equation account for the desorption rate, the adsorption rate and the dye consumption, respectively. any gradient in the volume concentration $c$ is neglected for the sake of simplicity; this is a good approximation if the rate is dominated by the dye consumption term, as the structure would be growing faster than the development of the concentration gradient. the solution to eq. (2) is

$$\rho(t) = \rho_e \left(1 - e^{-t/\tau}\right), \qquad (3)$$

where

$$\rho_e = \rho_0\, \frac{\alpha c}{\alpha c + \kappa + \sigma I \phi_p} \qquad (4)$$

is the equilibrium surface concentration and

$$\tau = \frac{1}{\alpha c + \kappa + \sigma I \phi_p} \qquad (5)$$

is a characteristic transition time. inserting eq. (3) in eq. (1), the solution for the height evolution is

$$h(t) = m\,(t - t_0) + h_0 - m\tau \left(1 - e^{-(t-t_0)/\tau}\right), \qquad (6)$$

where

$$m = \frac{\nu\, \alpha c\, \rho_0\, \sigma I \phi_p}{\alpha c + \kappa + \sigma I \phi_p}, \qquad (7)$$

$t_0$ is the percolation threshold time delay (initial time for the growth equation) and $h_0$ is the height of the structure at the percolation threshold. equation (6) shows an initial nonlinear growth towards an asymptotic linear growth with slope $m$. the characteristic transition time towards the asymptotic behavior is precisely the parameter $\tau$ defined in eq. (5).

the results shown in fig. 4 were fit using eq. (6) and the result is shown in fig. 7 for the 100 µm dye concentration sample and in fig. 8 for the 50 µm dye concentration sample. from these fits, the following conclusions can be drawn: (a) the linear asymptote is evident in both cases and extrapolates crossing the time axis at the measured percolation time; (b) the slope of the asymptote grows with the dye concentration; (c) the surface percolation height decays with the concentration, ranging from almost 100 nm for the 50 µm concentration to 13 nm for the 1 mm case. the last result is consistent with the fact that the surface is not saturated with the dye; instead, the surface concentration grows with the bulk concentration. by examining the equations obtained for the growth rate, these results are consistent with having the surface concentration rate dominated by the dye consumption,

$$\sigma I \phi_p \gg \kappa + \alpha c, \qquad (8)$$

which yields a slope proportional to the concentration, since from eq. (7) in this limit

$$m = \nu\, \alpha c\, \rho_0, \qquad (9)$$

and in the same limit eq. (5) yields

$$\tau = \frac{1}{\sigma I \phi_p}. \qquad (10)$$

the ratio of the slopes between the 100 µm and the 50 µm samples is $m_{100}/m_{50} = 1.6 \pm 0.3$, which is, within the experimental error, the ratio between concentrations, $c_{100}/c_{50} = 2 \pm 0.4$. the experiment for the two concentrations was done at different beam powers, yielding $\tau_{100}/\tau_{50} = 1.43 \pm 0.9$ and $I_{50}/I_{100} = 1.58 \pm 0.3$, which show an excellent agreement with this limiting approximation. from eq. (10) and the characteristic time obtained by fitting the experimental results, we found that the efficiency of initiation of the polymerization process is

$$\phi_p = \frac{1}{\sigma I \tau} = (7 \pm 2) \times 10^{-7}. \qquad (11)$$

v. conclusions

a technique has been described that allows the fabrication of polymer structures by scanning a laser beam, yielding features with sizes below the diffraction limit. the structure height can be controlled with 10 nm resolution in the range from a few tens of nanometers to hundreds of nanometers. the width was shown to be reduced at least a factor of three below the diffraction limit. the technique relies on the use of a light sensitive dye that transfers the absorbed energy to the resin and triggers the polymerization process. the exact mechanism could not be clearly separated, as the reaction efficiency was shown to be extremely low (below 1 ppm).
this low yield appears to be crucial for the success of the technique. our results show that the surface dye adsorption and the low dye polymerization efficiency are the keys to allow a smooth growth from surfaces. the surface dye concentration increases the speed of the polymerization at the surface with respect to the volume. on the other hand, the low yield of the triggered polymerization by the excited dye allows this effect to be observable. if the yield of the dye polymerization was high, no matter if there are more initiators at the surface than in the polymer volume, all excited dye would trigger a polymerization event and hence the whole volume illuminated would be polymerized. if the polymerization speed was uniform, the surface growth would take place at a lower speed than the volume growth due to the higher neighbors available in volume, and in this case, the polymerization would also occur 010008-8 papers in physics, vol. 1, art. 010008 (2009) / d. kunik et al. in the whole illuminated volume. as was shown, two distinct processes take place in an almost sequential manner. the first one is a surface nucleation of isolated islands that finally percolate into a uniform surface coverage with a characteristic size that decreases as the dye concentration increases. this is followed by a structure growth, as fresh dye molecules are deposited on the growing polymer surface. the process is stopped when the illumination is interrupted by shutting down the laser beam or moving it to a different spot where the process starts again. the percolation mechanism was modeled numerically and the simulations showed that a large surface adsorption of the dye molecules is needed in order to avoid the volume percolation and allow a surface growth mechanism. a model developed for the growth process permitted the fit of the experimental data and the determination of the extremely low transfer efficiency from photons to the polymer. how the growth rate increases with the dye concentration after an incubation time that is inversely proportional to the light intensity, was also explained. acknowledgements we thank norlandproducts for the sample without photoinitiator. grants from universidad de buenos aires, conicet (argentina) and anpcyt (argentina) are acknowledged. [1] y xia, j rogers, k e paul, g m whitesides, unconventional methods for fabricating and patterning nanostructures, chem. rev. 99, 1823 (1999). [2] m geissler, y xia, patterning: principles and some new developments, adv. mater. 16, 1249 (2004). [3] j j yao, rf mems from a device perspective, j. micromech. microeng. 10, 9 (2000). [4] w c chang, m kliot, d w sretavan, microtechnology and nanotechnology in nerve repair, neurol. res. 30, 1053 (2008). [5] r nielson, b kaehr, j b shear, microreplication and design of biological architectures using dynamic-mask multiphoton lithography, small 5, 120 (2009). [6] j m bélisle, j p correia, p w wiseman, t e kennedy, s costantino, patterning protein concentration using laser-assisted adsorption by photobleaching, lapap, lab. chip 8, 2164 (2008). [7] s costantino, k g heinze, p de koninck, p w wiseman, o e mart́ınez, two-photon fluorescent microlithography for live-cell imaging, microsc. reser. tech. 68, 272 (2005). [8] j godin, c chen, s h cho, w qiao, f tsai, y h lo, microfluidics and photonics for biosystem-on-a-chip: a review of advancements in technology towards a microfluidic flow cytometry chip, j. biophoton. 1, 355 (2008). 
[9] b basnar, i willner, dip-pen-nanolithographic patterning of metallic, semiconductor, and metal oxide nanostructures on surfaces, small 5, 28 (2009) [10] a. bardea, r. naaman, magnetolithography: from bottom-up route to high throughput, small 5, 316 (2009) [11] e menéndez, m o liedke, j fassbender, t gemming, a weber, l j heyderman, k v rao, s c deevi, s suriñach, m d baró, j sort, j nogués, direct magnetic patterning due to the generation of ferromagnetism by selective ion irradiation of paramagnetic feal alloys, small 5, 229 (2009). [12] m g ancona, s e kooi, w kruppa, patterning of narrow au nanocluster lines using v2o5 nanowire masks and ion-beam milling, nano lett. 3, 135 (2003). [13] b wu, a kumar, extreme ultraviolet lithography: a review j. vac. sci. technol. b 25, 1743 (2007). [14] p w wachulak, m g capeluto, c s menoni, j j rocca, m c marconi, nanopatterning in a compact setup using table top extreme ultraviolet lasers, opto-electron. rev. 16, 444 (2008). [15] m m alkaisi, r j blaikie, s j mcnab, nanolithography in the evanescent near field, adv. mater. 13, 877 (2001). 010008-9 papers in physics, vol. 1, art. 010008 (2009) / d. kunik et al. [16] y wang, x hong, j zeng, b liu, b guo, h yan, afm tip hammering nanolithography, small 5, 477 (2009). [17] s maruo, j t fourkas, recent progress in multiphoton microfabrication, laser photon. rev. 2, 100 (2008). [18] b h cumpston, s p ananthavel, s barlow, d l dyer, j e ehrlich, l l erskine, a a heikal, s m kuebler, i y s lee, d mccordmaughon, j q qin, h rockel, m rumi, x l wu, s r marder, j w perry, two-photon polymerization initiators for three-dimensional optical data storage and microfabrication, nature 398, 51 (1999). [19] s kawata, h b sun, t tanaka, k takada, finer features for functional microdevices micromachines can be created with higher resolution using two-photon absorption, nature 412, 697 (2001). [20] s w hell, j wichmann, breaking the diffraction resolution limit by stimulated-emission stimulated-emission-depletion fluorescence microscopy, opt. lett. 19, 780 (1994). [21] l li, r r gattass, e gershgoren, h hwang, j t fourkas, achieving lambda/20 resolution by one-color initiation and deactivation of polymerization, science 324, 910 (2009). [22] t f scott, b a kowalski, a c sullivan, c n bowman, r r mcleod, two-color singlephoton photoinitiation and photoinhibition for subdiffraction photolithography, science 324, 913 (2009). [23] t l andrew, h tsai, r menon, confining light to deep subwavelength dimensions to enable optical nanopatterning, science 324, 917 (2009). [24] d kunik, p f aramendia, o e mart́ınez, single photon fluorescent microlithography for live-cell imaging, microsc. res. tech. 73, 20 (2010). [25] d kunik, s j ludueña, s costantino, o e mart́ınez, fluorescent two-photon nanolithography, j. microsc. (oxford) 229, 540 (2008). 010008-10 papers in physics, vol. 7, art. 070017 (2015) received: 18 october 2015, accepted: 10 november 2015 edited by: e. mizraji reviewed by: j. lin, department of physics, washington college, maryland, usa. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070017 www.papersinphysics.org issn 1852-4249 how we move is universal: scaling in the average shape of human activity dante r. chialvo,1 ana maŕıa gonzalez torrado,2 ewa gudowska-nowak,3 jeremi k. ochab,4 pedro montoya,2 maciej a. nowak,3, 4 enzo tagliazucchi5 human motor activity is constrained by the rhythmicity of the 24 hours circadian cycle, including the usual 12-15 hours sleep-wake cycle. 
however, activity fluctuations also appear over a wide range of temporal scales, from days to a few seconds, resulting from the concatenation of a myriad of individual smaller motor events. furthermore, individuals present different propensity to wakefulness and thus to motor activity throughout the circadian cycle. are activity fluctuations across temporal scales intrinsically different, or is there a universal description encompassing them? is this description also universal across individuals, considering the aforementioned variability? here we establish the presence of universality in motor activity fluctuations based on the empirical study of a month of continuous wristwatch accelerometer recordings. we study the scaling of average fluctuations across temporal scales and determine a universal law characterized by critical exponents α, τ and 1/µ. results are highly reminiscent of the universality described for the average shape of avalanches in systems exhibiting crackling noise. beyond its theoretical relevance, the present results can be important for developing objective markers of healthy as well as pathological human motor behavior. 1 consejo nacional de investigaciones cient́ıficas y tecnológicas (conicet), rivadavia 1917, buenos aires, argentina. 2 institut universitari d’investigacions en ciències de la salut (iunics) & universitat de les illes balears (uib), palma de mallorca, spain. 3 m. kac complex systems research center and m. smoluchowski institute of physics, jagiellonian university, kraków, poland. 4 biocomplexity department, ma lopolska center of biotechnology, jagiellonian university, kraków, poland. 5 institute for medical psychology, christian albrechts university, kiel, germany. i. introduction the most obvious periodicity of human (as well as animal) motor activity is the circadian twenty four hours modulation. however, smaller fluctuations are evident on a wide range of temporal scales, from days to a few seconds. data shows that the activity evolves in bursts of all sizes and durations which are known to be scale-invariant [1–8] regardless of the origins and intended consequences of such activity. despite the variety of results, the mechanisms underlying the scale-invariant behavior of motor activity remain to be elucidated. considering the intermittent nature of human motor activity comprising brief activity excursions separated by periods of quiescence a natural approach would be to study the average shape of the events, following recent results [9–12] which show that for a large class of processes, the average shape is a scaling function 070017-1 papers in physics, vol. 7, art. 070017 (2015) / d. chialvo et al. determined mostly by the temporal correlations of the process and its nonlinearities [13]. in the present work, long time series of human motor activity are analyzed, recorded via wristwatch accelerometer, lasting approximately one month. we establish first the presence of truncated scale-invariance in the distribution of the durations of the events as well as in its power spectral density, as described previously in similar type of data. afterwards, we uncover the average shape of the bursts of activity and derive the scaling function and its associated exponents. finally, we discuss the origins of such scaling and some possible applications. ii. materials and methods the recordings analyzed were part of a larger study and included six healthy, non-smokers, drug-free volunteers (mean age 50.1 years, s.d. = 6.8). 
the study was approved by the bioethics commission of the university of the balearic islands (spain). participants were informed about the procedures and goals of the study, and provided their written consent. after determining their handedness, each subject was provided with a wristwatch-sized activity recorder (actiwatch from minimitter co., or, usa) measuring acceleration changes in the forearm in any plane. each data point of activity corresponded to the number of zero crossings in acceleration larger than 0.01 g (sampled at 32 hz and integrated over a 30-second window length). records of several thousands of data points were kept in the device's internal memory until being downloaded to a personal computer every week. subjects wore the device on their non-dominant arm continuously for up to several weeks (mean 28.1 days, s.d. = 4). after careful visual inspection of the data to exclude sets with gaps (due to subject non-compliance), a combined total of 280 days of data was available for further analysis.

iii. results

for ease of presentation, we will use recordings from a single subject to describe the main results. nevertheless, results are robust as well as similar for the entire group of subjects in the study. a typical recording is presented in fig. 1.

figure 1: example data set, distribution of successive increments and their spectral power. panel a: time series of activity x(n) recorded continuously from a subject during a month. individual traces correspond to consecutive days. the top subpanel depicts daily activity averaged over the entire month. panel b: time series of successive increments i(n) = x(n + 1) − x(n) (normalized by its s.d.) for the same data. panel c: probability density distribution of the time series of successive increments i(n) (continuous line), exhibiting exponential tails (compare with the dotted line, a gaussian of the same variance). panel d: power spectral density (black line) of the time series of successive increments i(n) of panel b. this is scale invariant, s(f) ∼ f^γ with γ = 0.9 (dashed line). in contrast, for the randomly shuffled increments, the serial correlations vanish and a flat spectral density is obtained (red).

panel a shows a full month of continuously recorded activity from this subject, who is particularly regular in her daily routines. the subject wakes up with the alarm clock at 6:45 a.m. on week days and has lunch followed by a short nap each day (between 2:00 p.m. and 4:00 p.m.). panel b displays the time series of

figure 2: scaling of activity events in a single subject (same dataset as in fig. 1). panel a: the complementary cumulative distribution function (ccdf) for event durations (t) and sizes (s) obeys power-laws with exponents α′ = 0.70 and τ′ = 0.44, respectively (dashed lines).
note that here the densities are cumulative, thus the exponents of the respective pdfs are α = α′ + 1 and τ = τ′ + 1. the waiting time between events falls exponentially. panel b: the average size of a given duration is well described (for small t) by 〈s〉(t) ∼ tµ+1 with µ + 1 = 1.59 (blue dashed line) comparable with results obtained from fitting within the scaling region (red filled symbols) giving µ + 1 = 1.61. the successive increments of the signal x(n), defined as i(n) = x(n + 1) −x(n). the large-scale statistical features of the time series presented in fig. 1 are already well known. the density distribution of the successive increments i(n) is non-gaussian, as can be appreciated by a joint plot with a gaussian distribution of the same variance (fig. 1, panel c). it is known that the power spectrum of the activity decays as s(f) ∼ fβ [1, 2]. because this type of processes are likely to be non-stationary, it is best to estimate the exponents of the spectral density by doing the calculations over the time series of successive increments, whose density distribution is stationary. for instance, for brownian motion (which is summed white noise), the power spectrum decays s(f) ∼ fβ with β = −2 and for white noise β = 0; the summed time series has an exponent +2 larger than the non-summed time series. as discussed in [14], this can be generalized for all self-affine processes: summing a self-affine time series shifts the theoretical power-spectral density exponent by +2, and the reverse process is also true: the differences in consecutive values (the “first differences”) of a brownian motion result in white noise, thus taking the first differences shifts the theoretical powerspectral density exponent β by −2. in our case, the exponent obtained for the time series of successive increments i(n) was γ = 0.9. thus, the exponent of the raw data is β = γ − 2 = −1.1 [14]. for comparison, the spectral densities of the actual signal and of a surrogate obtained after randomly shuffling the increments are jointly displayed in panel d of fig. 1. to further study the time series from the perspective of individual bursts of activity, we introduce the definition of an event. we consider the time series of activity x(n) and select a threshold value u to be vanishingly small. an event is defined by the consecutive points starting when x(n) > u and ending when x(n) < u. this is equivalent to the definition of avalanches in other contexts [9, 15]. in the following part, we will be concerned primarily with the statistics of event lifetimes t, as well as of their average size s and shape. in all subjects, we found that the distributions of event durations and sizes (defined by the area, i.e., the integral of the signal corresponding to the individual events) can be well described, for relatively small values, by a power-law (fig. 2, panel a). in contrast, the distribution of waiting times between events demonstrated an exponential decay. in addition to the scale invariance, we found that the longer an event lasted, the stronger the motor activity executed by the subject. the plot of average event size 〈s〉 as a function of duration t follows a power-law (for small values of t) described by 〈s〉(t) = tµ+1 with µ + 1 = 1.59. the exponents in this power-law are 070017-3 papers in physics, vol. 7, art. 070017 (2015) / d. chialvo et al. 
robust across subjects and to changes of threshold over a reasonable range of values. this type of scaling is well known in the statistical mechanics of critical phenomena [15]. examples range from earthquakes [16] to active transport processes in cells [17], crackling noise [11], the statistics of barkhausen noise in permalloy thin films [10] and plastic deformation of metals [18]. in all these cases, the distributions obey universal functional forms:

$$f(s) \sim s^{-\tau}, \qquad (1)$$
$$f(t) \sim t^{-\alpha}, \qquad (2)$$
$$\langle s \rangle(t) \sim t^{1/\sigma\nu z}, \qquad (3)$$

where $f$ denotes the probability density functions of the size of the event $s$ and its duration $t$, and $\langle s \rangle(t)$ is the expected size for a given duration. the parameters $\tau$, $\alpha$ and $1/\sigma\nu z$ are the critical exponents of the system and are expected to be independent of the details, being related to each other by the scaling relation:

$$\frac{\alpha - 1}{\tau - 1} = \frac{1}{\sigma\nu z}. \qquad (4)$$

figure 3: collapse of events of different duration into a single functional form. panel a: three examples of typical events of duration t = 480, 960 and 1920 sec. panel b: the heterogeneous events shown in panel a can be collapsed onto the average shape (dashed black line) by normalizing t to t/t and 〈x(t)〉 to 〈x(t)〉/t^µ. the inset shows the cumulative variance for a range of µ. panel c: the average event shape, i.e., fshape(t/t), recovered from six data sets (thin lines). the best fit using an inverted parabola is shown as a red dashed line (µ = 0.49), as well as the one expected from the critical exponent µ = 0.59 as a dot-dashed blue line.

we found that the empirical exponents very closely fulfill the expression above. using the fitting approach introduced by clauset [19] in the scaling regions depicted in panel a of fig. 2, we found τ = 1.44 and α = 1.70. thus, from eq. (4) a value of 1/σνz = µ + 1 = 1.59 is expected. the experimental data points are very close to this theoretical expectation (dashed line), especially for the relatively small t values within the scaling region of panel a (where a linear fit estimates µ + 1 = 1.61), while those for relatively larger t values (corresponding to the cutoff of the distributions) are a bit apart, probably due to undersampling. after repeating this analysis for all subjects in our sample, the average exponents were all within 5% of the reported values. from scaling arguments, it is expected that the average shape of an event of duration T, 〈x(t, T)〉 (t being the time within the event), scales as:

$$\langle x(t,T) \rangle = T^{\mu} f_{\mathrm{shape}}(t/T). \qquad (5)$$

thus, the shapes of events of different durations T, rescaled by T^µ, should collapse onto a single scaling function given by fshape(t/T). note that µ corresponds in this context to the wandering exponent (i.e., the mean squared displacement) of the activity [13, 20]. examples of this collapse are presented in panels a and b of fig. 3. considering the number of

figure 4: scaling is absent in a null model resulting from defining events after randomly reordering the time series x(n). panel a: density distributions (ccdf) for event duration, size, and waiting time. all the distributions are exponential (note the logarithmic-linear scale).
panel b: the expected average size for a given duration in the null model is a linear function of t (the dashed line represents the fit with slope 1), therefore, µ = 0 and there is no collapse. events here averaged (in the order of n ∼ 102), the data collapse is quite satisfactory, while the value of the exponent (µ = 0.48) does not exactly match the one predicted in eq. (4), µ = 0.59 (likely a consequence of insufficient sampling). to determine the generality of our results, we extended this analysis to six other data sets. for each data set, the value of µ was first determined. subsequently, the x(t,t) obtained from the events were rescaled with tµ and their average computed. to account for individual differences in mean activity, shape functions were normalized by their mean value. the results for the six datasets are presented in panel c of fig. 3. they can be accurately described by an inverted parabola, as in other systems previously studied using this method. the best fit disagrees with the empirical functions near their peak, the latter being flatter, likely an effect related to saturation observed in long events. finally, we turn to discuss simple null models. we consider two extreme cases, in both of them the raw time series are randomly shuffled to remove serial correlations. in the first case, we remove all temporal correlations by randomly reordering x(n), thus attaining a flat power spectral density. after repeating the above analysis in this surrogate data set, it becomes clear (as shown in fig. 4) that the scale invariance is absent in all the statistics under study: size s, waiting time wt and duration t of events (note that the distributions are here plotted using a logarithmic-linear scale). results in panel b show that µ + 1 = 1, thus µ = 0, implying that there is not collapse, because with tµ = 1 in eq. (5), the amplitude of the individual events remains invariable. to consider the second case, we need first to reorder randomly the time series of increments i(n) and then proceed to integrate the increments. since each increment is now a random variable, the power spectral density for this surrogate process obeys fβ with β = −2 , and as shown analytically by baldassari et al. [13], for this case µ = 1/2 and the scaling function is a semicircle. please note that the fluctuations of human activity described here differ from a simple auto-regressive process: indeed successive increments i(n) are anticorrelated and the power spectral density corresponds to non-trivial power law correlations (i.e., β 6= −2). iv. discussion the present findings can be summarized by six stylized facts describing bursts of human activity: i) the spectral density of the time series of activity x(t) obeys a power law, with exponent β ∼ 1; ii) successive increments i(n) are anti-correlated with a spectral density obeying a power law with exponent γ ∼ 1, which corresponds to a spectral density for the raw data fβ with β ∼ −1; iii) the pdf of the increments i(n) is definitely non-gaussian; iv) the pdf of duration and sizes of events obeys truncated power laws with exponents 1 < τ < 2 and 1 < α < 2; v) the aver070017-5 papers in physics, vol. 7, art. 070017 (2015) / d. chialvo et al. age size of the events scales with its lifetime t as 〈s〉(t) ∼ tµ, where µ + 1 = (α − 1)/(τ − 1); vi) the time series of individual events can be appropriately rescaled via a transformation of its duration t and amplitude x(t) onto a unique functional shape: 〈x(t,t)〉 = tµfshape(t/t). 
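the event statistics and the shape collapse used throughout this analysis are easy to reproduce on any thresholded series. the sketch below is not the authors' analysis code: the function names are mine, and the demonstration signal is a surrogate (summed white noise), for which, as noted above, µ = 1/2 and the average shape is a semicircle [13]; applied to an activity record x(n), the same routines return the durations, sizes and waiting times of fig. 2 and the rescaling of eq. (5).

```python
import numpy as np

def extract_events(x, u=0.0):
    """events (avalanches) as defined above: maximal runs of consecutive points with
    x > u. returns the event segments, their durations T, sizes S (area above the
    threshold) and the waiting times between consecutive events."""
    above = x > u
    edges = np.diff(above.astype(int))
    starts = np.where(edges == 1)[0] + 1
    ends = np.where(edges == -1)[0] + 1
    if above[0]:
        starts = np.r_[0, starts]
    if above[-1]:
        ends = np.r_[ends, len(x)]
    events = [x[a:b] - u for a, b in zip(starts, ends)]
    T = np.array([len(e) for e in events])
    S = np.array([e.sum() for e in events])
    WT = starts[1:] - ends[:-1]
    return events, T, S, WT

def average_shape(events, T, tmin, tmax, mu, npts=50):
    """collapse of eq. (5): events with duration in [tmin, tmax) are rescaled to t/T in
    time and to x/T^mu in amplitude, interpolated on a common grid and averaged."""
    grid = np.linspace(0.0, 1.0, npts)
    shapes = [np.interp(grid, np.linspace(0.0, 1.0, t), e) / t**mu
              for e, t in zip(events, T) if tmin <= t < tmax]
    return grid, np.mean(shapes, axis=0)

# surrogate example: summed white noise, expected to collapse with mu = 1/2
rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(2_000_000))
events, T, S, WT = extract_events(x, u=x.mean())
grid, shape = average_shape(events, T, tmin=100, tmax=400, mu=0.5)
# for the recorded activity, mu is instead chosen to minimize the spread of the
# rescaled curves (the cumulative variance shown in the inset of fig. 3b).
```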
we are aware that these observations are novel only for human activity, because similar statistical regularities of avalanching activity are well known for a large variety of inanimate systems [9–12]. the rescaling of the average shape is not surprising because, placed in the appropriate context, it can be traced back to mandelbrot’s study of the fractal properties of self-affine functions [21]. a curve or a time series are said to be self-affine if a transformation can be found, such that rescaling their x,y coordinates by k and kµ, respectively, and the variance in y is preserved (with µ = 1 corresponding to self-similarity). in that sense, the successful collapse of the events shape is a trivial consequence of the overall self-affinity of the x(t) time series. thus, it is clear that the existence of the scaling uncovered here is not informative per se of the type of mechanism behind: scale-invariance can be constructed via different processes, ranging from critical phenomena [15] to simple stochastic autoregressive dynamics [13, 20]. what is then the mechanism by which the above six facts are generated? it seems that this question cannot be easily answered by the type of experiments reported here. fluctuations of this type could have either an intrinsic (i.e., brain-born) origin but also could be the reflection of a collective phenomena (including humans and its environment). in either case, the correlations observed seem to reject the case of independent random events starting and stopping human actions, because neither the distribution of the increments i(n), nor the exponents match the case of a random walk. in terms of brain-born process, it is hard to accept some of the implications of the scaling function in the activity shape. the average parabolic shape means that the very beginning of the motion activity contains information about how long the activity will last, in the same sense that the initial trajectory of a projectile predicts when and where it will land. this proposal is hardly realistic, because there is hardly a reasonable physiological argument in support of any motor planning for the length of time we are observing (∼ 103 secs). in terms of collective processes, the results here suggest that the interaction with other humans could determine when and where, on the average, we start and stop moving. despite our current relative ignorance, a possibility that sounds interesting is to determine in children, as they grow, if their behavioral product of parental (and otherwise) education are reflected in the shape of their individual scaling function. this seems reasonable given the fact that “tireless running around” is almost a definition of early age well-being, which gives way to less hectic activity as children mature. in the same line of thoughts, if changes in the scaling function can be quantitatively traced to behavioral changes, one could also consider to explore applications of these techniques to monitor eventual progress in the treatment of hyperactivity disturbances such as in the subjects affected by the attention deficit hyperactivity disorder syndrome. the converse, i.e., cases in which the average activity diminish, as in elderly subjects shall be also explored. further experiments and analysis should shed light on these possibilities. in, the meantime, the present results provide a guide and six important constraints for the models that should best capture the physics (and biology) of the process. 
acknowledgements work supported by national science center of poland (ncn.gov.pl, grant dec-2011/02/a/st1/00119); state secretary for research and development (grants psi2010-19372 and psi2013-48260) from spain and by conicet from argentina.) [1] t nakamura, k kiyono, k yoshiuchi, r nakahara, z r struzik, y yamamoto, universal scaling law in human behavioral organization, phys. rev. lett. 99, 138103 (2007). [2] t nakamura, et al., of mice and men universality and breakdown of behavioral organization, plos one 3, e2050 (2008). [3] k hu, p c ivanov, z chen, m f hilton, h e stanley, s a shea, non-random fluctuations and multi-scale dynamics regulation of human activity, physica a 337, 307 (2004). 070017-6 papers in physics, vol. 7, art. 070017 (2015) / d. chialvo et al. [4] l a n amaral, d j b soares, l r da silva, l s lucena, m saito, h kumano, n aoyagi, y yamamoto, power law temporal auto-correlations in day-long records of human physical activity and their alteration with disease, europhys. lett. 66, 448 (2004). [5] c anteneodo, d r chialvo, unravelling the fluctuations of animal motor activity, chaos 19, 033123 (2009). [6] k christensen, d papavassiliou, a de figueiredo, n r franks, a b sendova-franks, universality in ant behaviour, j. r. soc. interface 12, 20140985 (2014). [7] a proekt, j banavar, a maritan, d pfaff, scale invariance in the dynamics of spontaneous behavior, proc natl acad sci usa 109, 10564 (2012). [8] j k ochab, et al., scale-free fluctuations in behavioral performance: delineating changes in spontaneous behavior of humans with induced sleep deficiency, plos one 9, e107542 (2014). [9] l laurson, x illa, s santucci, k t tallakstad, k j maloy, m j alava, evolution of the average avalanche shape with the universality class, nature comm. 4, 2927 (2013). [10] s papanikolaou, f bohn, r l sommer, g durin, s zapperi, j p sethna, universality beyond power laws and the average avalanche shape, nature phys. 7, 316 (2011). [11] j p sethna, k a dahmen, c r myers, crackling noise, nature 410, 242 (2001). [12] n friedman, s ito, b a w brinkman, m shimono, r e l deville, k a dahmen, j m beggs, t c butler, universal critical dynamics in high resolution neuronal avalanche data, phys. rev. lett. 108, 208102 (2012). [13] a baldassarri, f colaiori, c castellano, average shape of a fluctuation: universality in excursions of stochastic processes, phys. rev. lett. 90, 060601 (2003). [14] b d malamud, d l turcotte, self-affine time series: i. generation and analyses, adv. geophys. 40, 1 (1999). [15] p bak. how nature works. the science of selforganized criticality, copernicus, new york (1996). [16] b gutenberg, c f richter, magnitude and energy of earthquakes, ann. geofis. 9, 1 (1956). [17] b wang, j kuo, s granick, burst of active transport in living cells, phys. rev. lett. 111, 208102 (2013). [18] l laurson, m j alava, 1/f noise and avalanche scaling in plastic deformation, phys. rev. e 74, 066106 (2006). [19] a clauset, c r shalizi, m e j. newman, power-law distributions in empirical data, siam rev. 51, 661 (2009). [20] f colaiori, a baldassarri, c castellano, average trajectory of returning walks, phys. rev. e 69, 041105 (2004). [21] b b mandelbrot, self-affine fractals and fractal dimension, physica scripta 32, 257 (1985). 070017-7 papers in physics, vol. 6, art. 060002 (2014) received: 29 december 2013, accepted: 27 may 2014 edited by: e. mizraji reviewed by: j. brum, instituto de f́ısica, facultad de ciencias, universidad de la república, montevideo, uruguay. 
licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060002 www.papersinphysics.org issn 1852-4249 study of the characteristic parameters of the normal voices of argentinian speakers e. v. bonzi,1, 2∗ g. b. grad,1† a. m. maggi,3‡ m. r. muñóz3§ the voice laboratory permits to study the human voices using a method that is objective and noninvasive. in this work, we have studied the parameters of the human voice such as pitch, formant, jitter, shimmer and harmonic-noise ratio of a group of young people. this statistical information of parameters is obtained from argentinian speakers. i. introduction the voice is a multidimensional phenomenon that must be evaluated using special tools for determining acoustic parameters. these parameters are: the pitch or voice tone, the timbre, considered as the personality of the voice that is particular of each person (determined by fundamental frequency, its harmonics and formants) and the degree of hoarseness. during sustained vibration, the vocal fold will exhibit variations of fundamental frequency and amplitude; these phenomena are called “frequency perturbation” (jitter) and “amplitude perturbation” (shimmer). they reflect fluctuations in tension and biochemical characteristics of the vocal ∗e-mail: bonzie@famaf.unc.edu.ar †e-mail: grad@famaf.unc.edu.ar ‡e-mail: alicia.maggi@hotmail.com §e-mail: eudaimonia13@hotmail.com 1 facultad de matemática, astronomı́a y f́ısica, universidad nacional de córdoba, ciudad universitaria, 5000 córdoba, argentina. 2 instituto de f́ısica enrique gaviola (conicet), ciudad universitaria, 5000 córdoba, argentina. 3 escuela de fonoaudioloǵıa, facultad de ciencias médicas, universidad nacional de córdoba, ciudad universitaria, 5000 córdoba, argentina. folds, as well as variation in their neural control and the physiological properties of the individuals voices. the acoustic analysis is one of the major advances in the study of voice, increasing the accuracy of diagnosis in this area. normal values as standards are important and necessary to guide voice professionals. there are not many studies performed for the latin languages [1–3]. however, there are several of them for the english language, such as those in refs. [4–8]. in the same way, the software used for voice therapy is in general designed for other languages than spanish. a comparison has been made, though, between the two vowel systems of english and spanish (the variation spoken in madrid, spain), which triggered relatively large versus small vowel inventories [9]. that is the reason why we consider it is very important and necessary to produce more results for the spanish speaking population. we analyzed 72 audio files of female and male voices from an argentinian spanish speaking population to obtain the acoustical parameters using the praat program [10]. our data were compared to bradlow [9], hualde [11] and casado morente et al. [12]. the pitches measured were lower than expected and the first formant of the /a/ and /u/ 060002-1 papers in physics, vol. 6, art. 060002 (2014) / e. v. bonzi et al. figure 1: wave shape of the /a/ sound. figure 2: wave shape of the /i/ sound. vowels is higher than the published data. additionally, the harmonic to noise ratio (hnr) values discriminated per vowel are presented. ii. 
measurement methodology pitch, first and second formants, jitter, shimmer and harmonic to noise ratio (hnr) are the cornerstones of acoustic measurement of voice signals, and are often regarded as indices of the perceived quality of both normal and pathological voices [13]. in this work, we analyzed the audio files from the five spanish vowels produced by 72 female and male individuals, in order to study the parameters previously mentioned. the individuals are argentinian university students whose ages range between 20 and 30, coming from different regions without any special geographical distribution. the voices were recorded using a behringer c-1u (usb) cardioid microphone and a notebook. the microphone was placed at a distance of 10 cm respect to the mouth of the subjects while they were pronouncing the vowels with an intensity and tone that was comfortable in an acoustically treated room. each sound was sustained for, figure 3: harmonics of the /a/ vowel. figure 4: harmonics of the /i/ vowel. at least, five seconds. the praat program, commonly used in linguistics for the scientific analysis of the human voice [10], was used to record, analyze the wav files and obtain all the parameters presented in this work. a sample rate of 44100 hz was used to record the sound file. the wave shapes of the sounds corresponding to /a/ and /i/ vowels are shown in figs. 1 and 2. in figs. 3 and 4, the harmonic components obtained by applying fourier transform to the respective vowel signal are shown. pitch the pitch is a perceptual attribute of sound closely related to frequency, being this perception a subjective notion. in psychoacoustics, the pitch is related to the fundamental frequency of vibration of the vocal cords, allowing the perception of the tone frequency. nevertheless, for praat program [10], the pitch is coincident with the fundamental harmonic of the wave and we used this definition in this work. this parameter depends on gender, being higher for women and lower for men. 060002-2 papers in physics, vol. 6, art. 060002 (2014) / e. v. bonzi et al. formants the voice is created in the vocal cord, shaped as complex sound with harmonics and modified in the vocal tract by the resonating frequencies. then, the amplitude of harmonics frequencies are enveloped forming a spectrum of energy, the peaks or maximum observed in these spectra are named “formants.” consequently, a formant is a concentration of acoustic energy around a particular frequency in the speech wave. there are several formants, each one at a different frequency corresponding to a resonance in the vocal tract, and especially the first two are related to the movement of the tongue. the high-low magnitude of the first one (f1) is inversely related to the up-down tongue position and the second formant (f2) is related to the front tongue position. jitter and shimmer the naturalness factor of sustained vowels is attributed to a fundamental frequency and the signal amplitude. still there are unwanted variations in time of the sound signal properties in the voice production. while jitter indicates the variability or perturbation of fundamental frequency, shimmer refers to the same perturbation but, in this case, related to amplitude of sound wave, or intensity of vocal emission. jitter is affected mainly by lack of control of vocal fold vibration and shimmer by reduction of glottic resistance and mass lesions in the vocal folds, which are related to the presence of noise at emission and breathiness [10, 14]. 
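as a concrete illustration of these two perturbation measures, the sketch below computes rough estimates of the mean pitch, local jitter and local shimmer from a sustained vowel without using praat. it is a simplified stand-in for the praat quantities reported in this work (praat's own algorithms differ in detail), the file name is a placeholder, and the analysis assumes a clean, voiced recording such as the ones described above.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import find_peaks

def f0_jitter_shimmer(x, fs, fmin=75.0, fmax=500.0):
    """mean f0 from the autocorrelation maximum, then local jitter (%) and local
    shimmer (%) as the mean absolute cycle-to-cycle variation of the period and of
    the peak amplitude, divided by their respective means."""
    x = np.asarray(x, dtype=float)
    x -= x.mean()
    # fft-based autocorrelation, maximum searched in the allowed period range
    spec = np.fft.rfft(x, 2 * len(x))
    ac = np.fft.irfft(spec * np.conj(spec))[:len(x)]
    lo, hi = int(fs / fmax), int(fs / fmin)
    period0 = lo + int(np.argmax(ac[lo:hi]))
    f0 = fs / period0
    # one waveform maximum per glottal cycle
    peaks, _ = find_peaks(x, distance=int(0.8 * period0))
    periods = np.diff(peaks) / fs
    amps = x[peaks]
    jitter = 100.0 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)
    shimmer = 100.0 * np.mean(np.abs(np.diff(amps))) / np.mean(amps)
    return f0, jitter, shimmer

# fs, wave = wavfile.read("sustained_a.wav")   # a sustained /a/ recorded at 44100 hz
# print(f0_jitter_shimmer(wave, fs))
```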
harmonic to noise ratio (hnr)

the amount of energy conveyed in the fundamental frequency (f0) and its harmonics, divided by the energy in noise frequencies, is defined as the harmonic-to-noise ratio. frequencies that are not integer multiples of f0 are regarded as noise. this parameter is related to the perception of vocal roughness and hoarseness [10]. normal voices have a low level of noise and a high hnr. on the contrary, the degree of hoarseness increases the noise component and decreases the hnr.

iii. results and discussion

the measured data were processed statistically and the results are shown in tables 1, 4, 5 and 6 and in figs. 5 and 6.

figure 5: female formant chart (first formant f1 [hz] versus second formant f2 [hz] for the vowels /i/, /e/, /a/, /o/ and /u/).

figure 6: male formant chart (first formant f1 [hz] versus second formant f2 [hz] for the vowels /i/, /e/, /a/, /o/ and /u/).

the pitches for female and male individuals are shown in table 1. we used the minimum and maximum values to address the dispersion instead of the standard deviation because the data distribution was not normal. our values are in general lower for both genders compared to the published data [9, 11, 12].

table 1: pitch values of female and male subjects in hz.
          female   male
maximum   314      196
medium    225      128
minimum   155      85

tables 2 and 3 show the first and second formant values, and figs. 5 and 6 show the chart of formants corresponding to the female and male populations obtained in this work. we have compared our male results with the formant data of male spanish speakers published by bradlow [9]. in general, the first (f1) and second (f2) formant values are comparable to the published ones. in particular, the f1 formants for the /a/ and /u/ vowels are higher than the reported ones, by 12 and 21 %, respectively. the second formant, f2, for the /o/ vowel is lower than bradlow's by 12 %. on the other hand, we cannot compare our female formant values with published results because we could not find results for female individuals in the literature. comparing female versus male f1 formants, we observed that most of them are higher by 20 %, but in the case of the /o/ vowel the difference is 11 %. comparing f2 formants, the female values are higher than the male ones, reaching almost 25 % for the /a/ and /i/ vowels. furthermore, the f2 of the /u/ vowel in our samples shows an important scatter for both genders, female and male.

in tables 4 and 5, the obtained jitter and shimmer values for each vowel are shown. they are comparable to the jitter and shimmer averages obtained by casado morente et al. [12] in a study that involves a group of normal people. in our work, we have observed that the jitter and shimmer values of the /a/ vowel are bigger than the corresponding ones of the other vowels. finally, the hnr results, see table 6, are in accordance with the average value presented by casado morente et al. [12]. however, we could not find in the bibliography the hnr values for each of the five spanish vowels, so we had to make the comparison with the average of them. in the present work, we have found that the vowels show an increasing hnr value from /a/ to /u/, meaning that /u/ has a better signal to noise ratio than the other vowels.

iv.
concluding remarks the objective of this research was to measure acoustical properties of the spanish voices of argentinian speakers. vowels f1 [hz] f2 [hz] /i/ 370 ± 45 2600 ± 110 /e/ 525 ± 40 2300 ± 130 /a/ 900 ± 55 1500 ± 100 /o/ 550 ± 40 1000 ± 80 /u/ 440 ± 40 1150 ± 430 table 2: first and second formant of female. vowels f1 [hz] f2 [hz] /i/ 300 ± 25 2220 ± 100 /e/ 450 ± 35 1935 ± 90 /a/ 715 ± 55 1260 ± 60 /o/ 490 ± 35 900 ± 45 /u/ 390 ± 45 970 ± 430 table 3: first and second formant of male. vowels shimmer local [%] jitter local [%] /a/ 2.7 ± 1.1 0.31 ± 0.10 /e/ 2.1 ± 0.7 0.28 ± 0.08 /i/ 2.2 ± 0.6 0.29 ± 0.07 /o/ 2.0 ± 0.7 0.26 ± 0.11 /u/ 2.1 ± 0.7 0.27 ± 0.09 table 4: shimmer and jitter of female subjects. vowels shimmer local [%] jitter local [%] /a/ 3.0 ± 0.9 0.36 ± 0.10 /e/ 2.3 ± 0.8 0.33 ± 0.09 /i/ 2.3 ± 0.7 0.28 ± 0.08 /o/ 2.2 ± 0.8 0.29 ± 0.10 /u/ 2.3 ± 0.9 0.25 ± 0.07 table 5: shimmer and jitter of male subjects. vowels female male /a/ 21 ± 3 20 ± 2 /e/ 20 ± 2 21 ± 2 /i/ 22 ± 3 22 ± 2 /o/ 25 ± 3 24 ± 3 /u/ 25 ± 4 25 ± 3 table 6: harmonic to noise ratio of female and male subjects in db. 060002-4 papers in physics, vol. 6, art. 060002 (2014) / e. v. bonzi et al. these voice parameters are generally assessed subjectively by several authors. this form of perceptual analysis of voice has significant limitations and the subtle interpretative judgments of verbal classifications may not be accurate. the differences we found in the parameters of the vowels measured in a group of people from argentina compared to the parameters obtained from spanish speaking people living in spain suggests the region of study has an important influence in the results, as expected. this kind of studies are very useful to compare the properties of normal and pathological voices of people from different regions. it is necessary to test the same parameters in female spanish speakers as well. such work should be performed in larger quantities and should be extended to other countries or regions of latin america, especially where different ethnic groups can be found. [1] w r rodŕıguez, o saz, e lleida, análisis robusto de la voz infantil con aplicación en terapia de voz, areté 10, 70 (2010). [2] t cervera, j l miralles, j gonzález-álvarez, acoustical analysis of spanish vowels produced by laryngectomized subjects, j. speech lang. hear res. 44, 988 (2001). [3] j muñoz, e mendoza, m d fresneda, g carballo, p lopez, acoustic and perceptual indicators of normal and pathological voice, folia phoniatr. logop. 55, 102 (2003). [4] h k vorperian, r d kent vowel acoustic space development in children: a synthesis of acoustic and anatomic data, j. speech lang. hear res. 50, 1510 (2007). [5] s p whiteside, sex-specific fundamental and formant frequency patterns in a cross-sectional study, j. acoust. soc. am. 110, 464 (2001). [6] p white, formant frequency analysis of childrens spoken and sung vowels using sweeping fundamental frequency production, j. voice, 13, 570 (1999). [7] r o coleman, male and female voice quality and its relationship to vowel formant frequencies, j. speech lang. hear res. 14, 565 (1971). [8] s bennett, vowel formant frequency characteristics of preadolescent males and females, j. acoust. soc. am. 69 231 (1981). [9] a r bradlow a comparative acoustic study of english and spanish vowels, j. acoust. soc. am. 97, 1916 (1995). [10] p boersma, d weenink, praat: doing phonetics by computer [computer program], version 5.3.51, retrieved 2 june 2013 from http://www.praat.org/. 
[11] j i hualde, the sounds of spanish, cambridge university press, cambridge (2005).

[12] j c casado morente, j a adrián torres, m conde jiménez, d piédrola maroto, v povedano rodríguez, e muñoz gomariz, e cantillo baños, a jurado ramos, estudio objetivo de la voz en población normal y en la disfonía por nódulos y pólipos vocales, acta otorrinolaringol. esp. 52, 476 (2001).

[13] j kreiman, b r gerratt, perception of aperiodicity in pathological voice, j. acoust. soc. am. 117, 2201 (2005).

[14] h f wertzner, s schreiber, l amaro, analysis of fundamental frequency, jitter, shimmer and vocal intensity in children with phonological disorders, rev. bras. otorrinolaringol. 71, 582 (2005).

papers in physics, vol. 2, art. 020003 (2010) received: 19 october 2009, accepted: 29 september 2010 edited by: a. g. green reviewed by: t. ehrhardt, math. dept., univ. california, santa cruz, usa licence: creative commons attribution 3.0 doi: 10.4279/pip.020003 www.papersinphysics.org issn 1852-4249

expansions for eigenfunction and eigenvalues of large-n toeplitz matrices

leo p. kadanoff 1, 2∗

this paper constructs methods for finding convergent expansions for eigenvectors and eigenvalues of large-n toeplitz matrices, based on a situation in which the analogous infinite-n matrix would be singular. it builds upon work done by dai, geary, and kadanoff [h dai et al., j. stat. mech. p05012 (2009)] on exact eigenfunctions for toeplitz operators, which are infinite-dimensional toeplitz matrices. one expansion for the finite-n case is derived from the operator eigenvalue equations obtained by continuing the finite-n toeplitz matrix to plus infinity. a second expansion is obtained by continuing the finite-n matrix to minus infinity. the two expansions work together to give an apparently convergent expansion for the finite-n eigenvalues and eigenvectors, based upon a solvability condition for determining eigenvalues. the expansions involve an expansion parameter expressed as an inverse power of n. a variational principle is developed, which gives an approximate expression for determining eigenvalues. the lowest order asymptotics for eigenvalues and eigenvectors agree with the earlier work [h dai et al., j. stat. mech. p05012 (2009)]. the eigenvalues have a (ln n)/n term as their leading finite-n correction in the central region of the spectrum. the 1/n correction in this region is obtained here for the first time.

∗e-mail: lkadanoff@gmail.com 1 the perimeter institute, waterloo, ontario, canada. 2 the james franck institute, the university of chicago, chicago, il usa.

i. introduction

i. history

this paper is a continuation of recent work by dai, geary, and kadanoff [1] (which we shall hereafter cite as paper i) and lee, dai and bettleheim [2] on the spectrum of eigenvalues and eigenfunctions for singular toeplitz matrices. a toeplitz matrix is one in which the matrix elements, $T_{j,k}$, are functions of the difference between indices. we define all matrix elements in terms of a single function: the symbol, $a(z)$, where $z = e^{-ip}$ is on the unit circle. thus we write

$$T_{j,k} = t_{j-k} = \oint \frac{dz}{2\pi i z}\, a(z)\, z^{j-k}. \qquad (1)$$

the toeplitz matrix is then defined by having the indices j and k live in the interval [0, n − 1]. (note that i use the subscript notation to describe behavior in coordinate space, and argument notation to describe behavior in fourier space.) the basic problem under consideration here is the definition of a good method for calculating the eigenvalues and eigenfunctions of toeplitz matrices for large values of n.
previous work [3, 4] has described the toepltiz matrix problem by pointing out that the eigenvalues approach the spectrum of the analogous problem in which the indices vary over the set [−∞,∞]. this latter problem may be solved by fourier transformation and has an eigenfunction ψj = e −ipj and a corresponding eigenvalue a(e−ip). the set of all such eigenvalues, for real p, is termed image of the symbol. widom speculates 020003-1 papers in physics, vol. 2, art. 020003 (2010) / l. p. kadano� [3, 4] that in the large-n limit, the discrete spectrum of the �nite-n problem approaches that image, at least for the case in which the symbol has a singularity on the unit circle. previous work [2, 1] has established how this approach occurs for the speci�c case in which the symbol has the form of singularity introduced by fisher and hartwig [9, 10], speci�cally a(z) = (2 −z − 1/z)α(−z)β (2) note that this singularity is de�ned by two parameters, α, which de�nes a zero in the symbol and, β, which de�nes a discontinuity. for this symbol, lee, dai, and bettleheim [2] found the spectrum for large n and α = 0, while dai, geary and kadano� described a part of the spectrum for real parameters, α and β, obeying 0 < α < |β| < 1 in paper i. the spectrum of the toeplitz matrix is invariant under a re�ection, β →−β, in the sign of β. paper i considered the behavior of toeplitz operators [7, 8] constructed from the symbol of fisher and hartwig. these are toeplitz matrices in which the indices run through the interval [0,∞]. the analysis was carried on for situations in which 0 < α < 1 so that the image of the symbol forms a closed curve. the �nite-n eigenvalues sit within that curve and approach it as n goes to in�nity. two cases should be di�erentiated: case i. 0 > β > −1. all points within the image of the symbol are right eigenvalues of the toeplitz operator [1]. case ii. 0 < β < 1. the toeplitz operator has no right eigenvalues [1]. (there is one more very interesting special case: α = 0. in this situation, if −1 < β < 1, the image of the symbol is a curved line segment, and the eigenvalue spectrum consists of all points which can be reached by connecting two points of that curve. once again, the eigenvalues for �nite-n approach the curve while sitting within the region de�ned by the in�nite-n eigenvalues [2]. we do not consider this α = 0 case in this paper.) the distinction between cases i and ii above describes whether or not the toeplitz operator has or has not right eigenvectors. the transposition operation, tj,k = tj−k → tk,j = tk−j (3) is just the parity operation on j − k, and can be represented by �ipping the sign of β in the symbol. thus if t is in the category of case i, its transpose is in case ii � and vice versa. this distinction carries over in a subtle manner to the toeplitz matrices. in case i, the right eigenvectors, ψj for the toeplitz matrices decay exponentially as j increases. the corresponding left eigenvectors for the operator with the same (negative) value of β grow exponentially with increasing j. this growth can be seen from an additional symmetry of the fisher-hartwig toeplitz matrix under the re�ection operation that changes the index value, j, into n− 1 − j. tj,k = tj−k → t(n−1−j)−(n−1−k) = tk−j (4) and has the same e�ect as the transposition operation. the re�ection interchange �ips the sign of β and also makes the decay of the right eigenvector with j into a growth with j. 
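the reflection operation of eq. (4) is the statement that every toeplitz matrix is persymmetric: flipping both indices, j into n-1-j and k into n-1-k, reproduces the transpose. a quick numerical check of this property is sketched below; it holds for any toeplitz matrix, so a generic random one is used as a stand-in, and the exchange matrix that implements the flip is an illustrative helper, not part of the paper.

import numpy as np
from scipy.linalg import toeplitz

n = 50
rng = np.random.default_rng(0)
# a generic complex Toeplitz matrix: T[j, k] = t_{j-k}
t = rng.normal(size=2 * n - 1) + 1j * rng.normal(size=2 * n - 1)  # t_{-(n-1)}, ..., t_{n-1}
col = t[n - 1:]              # t_0, ..., t_{n-1}
row = t[n - 1::-1]           # t_0, t_{-1}, ..., t_{-(n-1)}
T = toeplitz(col, row)

J = np.fliplr(np.eye(n))     # exchange matrix: (J x)_j = x_{n-1-j}
# persymmetry: reflecting both indices, as in eq. (4), gives the transpose
print(np.allclose(J @ T @ J, T.T))   # True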
thus, the case i-case ii distinction is interchanged for both toeplitz operator and toeplitz matrix under the transposition symmetry, and is equally interchanged for the matrix under the re�ection operation. ii. the previous calculational strategy in the previous paper, paper i, we studied the toeplitz eigenvalue equation for case i n−1∑ k=0 tj−kψk = �ψj for 0 ≤ j ≤ n− 1 (5a) by studying the related toeplitz operator equation ∞∑ k=0 tj−kψk = �ψj for 0 ≤ j ≤∞ (5b) for an eigenvalue for which both equations equally had solutions. we could only solve the �rst equation numerically. we had an exact method, the wiener-hopf technique, for solving the second equation. the crucial result was that, for case i situations and for large n, the solution of the second equation provided an excellent approximation for the eigenfunction of the �rst one, at least in the situation in which one is given the correct eigenvalue. what happened was that the extension of the equation being solved into the region between j = n and j = ∞ hardly changed the solution of eq. (5a), at least for j not too close to n. 020003-2 papers in physics, vol. 2, art. 020003 (2010) / l. p. kadano� the next step will be to study an equation arising from extending eq. (5a) toward minus in�nity, speci�cally n−1∑ k=−∞ tj−kωk = �ωj + γj for −∞≤ j ≤ n− 1 (5c) notice the forcing term, γ, on the right hand side of this equation. in the same case i, situation in which eq. (5b) has eigenvalue solutions, eq. (5c) has none so that the forcing term produces a unique, �nite result. very similar methods to the ones which solve eq. (5b) will also solve eq. (5c). iii. plan of paper roughly speaking, the plan of this paper is to produce and combine two di�erent expansions for the toeplitz matrix equation. first, one will get an approximate solution to eq. (5b), albeit with some small terms left over. next, the methods used to solve eq. (5c) will be used to calculate these leftover terms while treating the terms previously determined as forcings. we shall thereby close the equations for the toeplitz matrix eigenvector. the previous work, paper i, had a rather heuristic method for estimating the size of the corrections to the eigenvalue and eigenfunction estimates. here we have an exact, testable expansion. however, the expansion does start from the premise that the �nite-n spectrum of eigenvalues does approach the in�nite-n spectrum, a premise that is true for a wide class of toeplitz matrices with singular symbols [5, 6]. the next chapter includes the two analyses respectively based upon the two toeplitz operator equations obtained by extending our matrix in the two possible directions. the third chapter puts the two analyses together to get equations which will yield an asymptotic expansion for eigenvalues and eigenfunctions. the �nal chapter describes toeplitz problems left unresolved by this paper. ii. a pair of expansions i. de�nitions in all three cases de�ned by eqs. (5), our analysis will be generated by extending the range of the index variables to (−∞,∞), which will then permit us to use fourier transform techniques. the equation for the toeplitz matrix's eigenvector can be cast in terms of three di�erent kinds of functions which are respectively indicated by superscripts −, 0, and +. the �rst superscript indicates a function which is non-zero only for j < 0; the superscript 0 de�nes a function non-zero for 0 ≤ j ≤ n−1; while the third superscript describes a function non-zero in [n,∞). 
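the superscript notation just introduced amounts to three projection operators on coefficient sequences, keeping only the indices j < 0, 0 <= j <= n-1, or j >= n. a tiny python sketch of these projections (my own illustration, storing a sequence as a dictionary over integer indices) is given below.

import numpy as np

def project(coeffs, region, n):
    """Keep only the part of a coefficient sequence in one of the three index ranges
    used in the text: '-' -> j < 0, '0' -> 0 <= j <= n-1, '+' -> j >= n.
    coeffs is a dict {j: value} over integer indices j."""
    if region == '-':
        keep = lambda j: j < 0
    elif region == '0':
        keep = lambda j: 0 <= j <= n - 1
    else:
        keep = lambda j: j >= n
    return {j: v for j, v in coeffs.items() if keep(j)}

# example: a sequence supported on j = -3, ..., 12, split for n = 10
seq = {j: float(j) for j in range(-3, 13)}
n = 10
parts = {r: project(seq, r, n) for r in ('-', '0', '+')}
print(sorted(parts['-']), sorted(parts['0']), sorted(parts['+']))
# the three pieces are disjoint and recombine to the full sequence
assert sorted(sum((list(p) for p in parts.values()), [])) == sorted(seq)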
the eigenfunction we wish to calculate is ψ0j and it obeys ∞∑ k=−∞ kj,kψ 0 k = φ − j + φ + j, (6) which holds for all integer values of j. here, the matrix k is kj,k = tj−k − �δj,k (7) eq. (6) will be analyzed in fourier transform language, with z being the fourier variable, as in eq. (1). thus, the four quantities de�ned in that equation will be written as k(z) = a(z) − �,φ−(z),ψ0(z), and φ+(z), which will respectively contain powers of z extending from −∞ to ∞; only negative powers of z; non-negative powers extending up to zn−1; and powers from zn to z∞. we also need to de�ne a notation for the decomposition of the k operator. we write, for case i, the wiener-hopf factorization k(z) = k>(z)/(zk<(z)) (8) where k> has all its singularities and zeros outside the unit circle and k< has all its singularities and zeros inside the unit circle. the reader should recall, from paper i, if and only if � is inside the curve described by a(z). (for case ii, the z would appear in the numerator rather than the denominator.) the functions k>(z) and 1/k>(z) have neither zero nor singularity inside the unit circle so that they can be expanded in a power series in z. similarly, k>(z) and 1/k>(z) are regular outside the unit circle so that they can be expanded in 1/z. as a result , the fourier transforms of these functions obey k>j−l = 0 for j < l while 020003-3 papers in physics, vol. 2, art. 020003 (2010) / l. p. kadano� k) and (1/k<) in coordinate space by fourier transformation, as for example, (1/k>)j = ∮ dz 2πiz z−j/k>(z) acting to the right, k>, (1/k>) and z all carry information toward larger j values, while k<, (1/k<) and 1/z carry information toward lower j values. ii. wiener-hopf analysis for toeplitz operator this section is not at all new. it is all contained in paper i and in earlier work [7, 8]. however, the notation is slightly di�erent here. we set n = ∞ and note that φ+ must be zero. to distinguish the solution for the toeplitz operator from the one for the toeplitz matrix, we write ψ for the operator eigenfunction and φ− for the auxiliary function φ−. we then note that eqs. (5b), (7) and (8) imply k>ψ = zk<φ− (9) note that ψ contains only non-negative powers of z, while φ− contains only negative powers. eq. (9) is constructed to enable us to follow the usual wiener-hopf strategy [11]. the only possible common behavior of the two sides of eq. (9) is that both sides may contain a constant term, independent of z. then eq. (9) has the solution k>ψ = c (10a) zk<φ− = c (10b) with c being simply an arbitrary constant in this fourier transform language. (in coordinate space, c becomes cδ(j, 0)). the solution can then be written in terms of two functions: ψj = c(1/k >)j (11a) which vanishes for j < 0, while the other function is φ−j = c(1/k <)j+1 with (1/k<)j = ∮ dz 2πiz z−j/k<(z). (11b) this integral vanishes for j > 0. note that the arbitrary parameter, c, is a normalization constant for the eigenfunction and its auxiliary function, φ−. the analysis in paper i enables us to describe the asymptotic structure of these functions for values of |j| much bigger than one in the previously analyzed case 0 < α < −β < 1. recall from paper i that � = a(zc), that zc = e −ipc is outside the unit circle, and therefore � is inside the curve formed by a(z), with z on the unit circle. the fourier transforms of both functions contain a weak singularity at z = 1 proportional to (1−z)2α. this zero, then, produces a real term which decays as 1/j1+2α for large values of |j|. 
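the wiener-hopf split used in eq. (8) can be made concrete in the simplest situation, a smooth symbol that neither vanishes nor winds around the origin: splitting the fourier coefficients of log k between non-negative and negative powers of z and exponentiating each half gives one factor that is analytic and zero-free inside the unit circle (playing the role of k>) and one that is analytic and zero-free outside (playing the role of k<). the python sketch below implements only this standard zero-winding case; the extra factor of z and the placement of k< in the denominator in eq. (8), which encode the fisher-hartwig behaviour, are deliberately not reproduced, and the sample symbol is an illustrative choice.

import numpy as np

def factorize(k, n_samples=1024, n_coef=60):
    """Split k(z) = k_plus(z) * k_minus(z) on |z| = 1, with k_plus analytic and
    zero-free for |z| < 1 and k_minus analytic and zero-free for |z| > 1.
    Assumes k is smooth, nonvanishing and has zero winding number."""
    theta = 2 * np.pi * np.arange(n_samples) / n_samples
    z = np.exp(1j * theta)
    logk = np.log(k(z))                       # single-valued because the winding number is 0
    m = np.arange(-n_coef, n_coef + 1)
    c = np.array([np.mean(logk * np.exp(-1j * mm * theta)) for mm in m])  # coefficients of log k
    kplus = np.exp(sum(c[m >= 0][i] * z**mm for i, mm in enumerate(m[m >= 0])))
    kminus = np.exp(sum(c[m < 0][i] * z**mm for i, mm in enumerate(m[m < 0])))
    return z, kplus, kminus

# illustrative smooth, nonvanishing symbol with zero winding number
k = lambda z: 3.0 + z + 1.0 / z
z, kplus, kminus = factorize(k)
print(np.max(np.abs(kplus * kminus - k(z))))   # small reconstruction error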
the function, k>(z), has, in addition to the weak zero, a simple zero at z = zc, just outside the unit circle. this zero describes the eigenvalue of the toeplitz matrix. the zeros give an asymptotic form for large j containing two terms (1/k>)j → a>(e−ipc )j−1−2α + b(e−ipc )eipcj, (12a) the notation, a>(e−ipc ) and b(e−ipc ), indicates that these coe�cients depend upon the eigenvalue. for (1/k<) and thus for the auxiliary function, φψ0 = zk<φ− + zk<φ+ (15) 020003-4 papers in physics, vol. 2, art. 020003 (2010) / l. p. kadano� the second term on the right hand side of this equation can be split up into parts which contain exponents of z which are respectively negative, between zero and n − 1 inclusive, and above n − 1 in the form zk<φ+ = (zk<φ+)− + (zk<φ+)0 + (zk<φ+)+ then this equation can be rearranged into a form in which terms in non-positive powers of z appear on one side and non-negative powers on the other. i.e. k>ψ0 − (zk<φ+)0 − (zk<φ+)+ = zk<φ− + (zk<φ+)− = c (16) the second equal sign in this equation sets both sides equal to a constant, c, as in eq. (10). the leading terms in both the wave function and the auxiliary function, φ−, are set by c since eq. (16) implies a leading order behavior k>0 ψ 0 = k<0 φ − −1 = c (17) for j's not too far from zero. later on, we shall use an analogous result from an analysis of forcings in eq. (14) to obtain a solvability equation for determining the eigenvalues. notice that we have used the symbol c to describe the normalization constant in this situation, while we used c for the same purpose in the toeplitz-matrix eigenvector. these two quantities are analogous, but need not be the same. we solve for ψ0, �nding ψ0 = (1/k>)c + (1/k>)(zk<φ+)0 +(1/k>)(zk<φ+)+. since (1/k>), acting to the right pushes coordinate indices toward higher values, we can project onto the subspace 0 �note that the projection of the �nal term is zero� and thus �nd ψ0 = [(1/k>)c]0 + [(1/k>)(zk<φ+)0]0 (18a) the equation for the negative j domain in eq. (16) may be solved for φ− to give φ− = (zk<)−1c− (zk<)−1(zk<φ+)− (18b) as in the analysis in subsection ii. the auxiliary function φ− does not directly enter the analysis of the eigenfunction, which is there denoted as ψ, and is here called ψ0. however, there are important differences between the result here and the one in subsection ii.: in contrast to the case of the toeplitz operator, the solution for the eigenfunction requires a knowledge of a subsidiary function, here φ+. this function provides the forcing term which renders our lowest order solution inexact. further, for the toeplitz operator, only the function k> is needed to determine the eigenfunction. here, both k> and k< are involved. note the 1/z to the right of the equal sign in eq. (18b). this factor has the e�ect of making the leading term in the expansion of φ− be 1/z, which is then followed by higher powers of 1/z. this is precisely the right structure for the expansion of φ−. eq. (18) gives us expressions for two of the quantities we need to know. however, we are far from done. these equations give us relatively simple expressions for ψ0 and φ−, but we do not yet have an equivalently simple expression for φ+. in both of the two subequations in eq. (18), we can evaluate the �rst term directly, while the second term could be evaluated by quadratures if we but knew φ+. note that the �rst terms on the right in both of these subequations are precisely the same as in the solution for the toeplitz operator eigenvector. 
our previous results [1] show that for small and intermediate values of j in the set [0,n−1], the �rst term in eq. (18a) varies over a wide range, being of order c for small values of j and of order cλ ∼ c/n2α+1 for j of order n. similarly the �rst term in eq. (18b) varies from being of order c for −j of order unity to being of order cλ for −j of order n. it will turn out that the second term in each of these equations is a correction of order cλ and therefore smaller by a factor of λ than the maximum value of the �rst term. iv. forcing analysis for toeplitz matrix it appears that we have usable lowest order results for two of the three unknown functions. the third unknown, φ+, contributes correction terms but it is hard to see a direct way to get it from eq. (16). however, we can use the forcing form of the toeplitz matrix to obtain additional information. to do this, rewrite eq. (15) while interchanging 020003-5 papers in physics, vol. 2, art. 020003 (2010) / l. p. kadano� the role of k> and k< and �nd (1/k<)ψ0 = (z/k>)φ− + (z/k>)φ+ (19) we may then split up the term (z/k<)φ− into the di�erent regions (−, 0, and +) to derive a result analogous to eq. (16), namely (1/k<) ψ0 − [(z/k>)φ−]0 − [(z/k>)φ−]− = (z/k>)φ+ + [(z/k>)φ−]+ = 0 (20) there is, however, a substantial di�erence between eq. (16) and eq. (20). look at the region to the right of the �rst equal sign in both equations. the former has a zφ− in it, while the latter has a zφ+. the former has a component which extends beyond the region − into the region 0, while the latter has no term projecting our of the region +. thus, we put a constant, c, on the right hand side of eq. (16) to cancel the extending term, but put a zero on the right of eq. (20), since there is no such term. this zero is, at bottom, a re�ection of the fact that the transposed wiener-hopf operator equation has no eigenvalue solutions. now look at the terms between the two equal signs in eq. (20). the lowest power of z in the �rst such term is zn+1. the next term contains as its smallest power of z, a term in zn. there is nothing to balance this term. it must vanish. it follows that [(z/k>)φ−]n = 0 (21) this statement will turn out to be the integrability condition that will �x the eigenvalue in our analysis. we now use eq. (20) to write the analogs of eq. (18a) and eq. (18b), which are ψ0 = [k<((z/k>)φ−)0]0 (22a) and φ+ = −(k>/z)[(z/k>)φ−]+ (22b) eq. (22a) is another expression for the eigenfunction, analogous to eq. (18a). we hope that the two equations are equivalent. eq. (22b) gives us a usable expression for φ+, which can then be employed to give explicit values to the correction terms in eq. (18a) and eq. (18b). note that eq. (22a) and eq. (22b) are simpler than their analogs, derived earlier, because they do not have terms in c. these four equations will give us the results we need for the various eigenfunctions. they are exact; there are no approximations made in their derivation. the integrability condition, eq. (21), is also exact. iii. results i. estimation of eigenvalues the eigenvalue of the toeplitz matrix can be estimated with the help of eq. (21). when written out, this equation reads ∞∑ j=1 (1/k>)n+j−1 φ − −j = 0 (23) to obtain a lowest order version of this equation, one replaces φ− by its lowest order approximant, as given by the �rst term on the right hand side of eq. (18b). we thereby obtain ∞∑ j=1 (1/k>)n+j−1k < 1−j = 0 (24) as our lowest order eigenvalue condition. 
in the situation in which n is large, one can use the asymptotic form of (1/k>) as given by eq. (12a) to replace the �rst factor under the summation in eq. (24) so that the eigenvalue condition becomes ∞∑ j=1 [ a>(e−ipc )(n + j − 1)−1−2α +b(e−ipc )eipc(j−1)eipcn ] k<1−j = 0. the main contribution to this equation converges rapidly with j, so for large n we neglect j in comparison to n and �nd an expression for the momentumvalue, pc: eipcn = −n−1−2α [ a>(e−ipc ) ∑∞ j=1 k < 1−j ][ b(e−ipc ) ∑∞ k=1 e ipc(k−1)k<1−k ] this equation is then solved to get an asymptotic expansion for the m−th value of the momentum 020003-6 papers in physics, vol. 2, art. 020003 (2010) / l. p. kadano� p(m)c = 2π(−1 + m/n) − (1 + 2α)i(ln n)/n + (δp)i/n + o(1/n), with (δp) = ln{[−a>(zc)k<(1)]/[b(zc)k<(zc)]}, and zc = e −ipc (25) here, m is a label for the di�erent eigenvalues, which takes on values between zero and n − 1. in eq. (25), a> and b are non-singular functions of m/n, except for extra corrections which appear when m is close to its endpoints. thus, δp gives smaller, slowing varying corrections to the earlier terms on the right of eq. (25). the estimates re�ected in eq. (25) were all predicted in paper i, except for the precise value of δp, which appears here for the �rst time. ii. equations for eigenfunctions we argue about the relative sizes of the various terms in the equations by saying that 1/k> and 1/k< serve as propagators which connect the regions described by the symbols −, 0, and +. any connection between − and + is necessarily small, as is any connection of cδj,0 to +. i assert that these connections are of order λ ∼ 1/n1+2α. this smallness makes terms involving several regions necessarily small and makes it possible for our expansions for the eigenfunction and auxiliaries, given below, to be rapidly convergent. to obtain such expansions, one starts with the unknown that determines φ−, as seen in eq. (18b), i.e. x− = (zk<φ+)− (26) according to the arguments we have given so far, this term should be at maximum of order cλ2. then rewrite eq. (22b) for φ+ in terms of this small unknown quantity as φ+ = −(k>/z) [ [1/(k>k<)] [c−x−] ]+ (27) we can next rewrite this equation as in integral equation for the unknown as x− = − ( (k>k<) [ (k>k<)−1 (c−x−) ]+)− (28) since x− is of order λ2 relative to c, eq. (28) can be solved iteratively by expanding the right hand side in a power series in x−. once x− is determined, eq. (18b) will give the value of φ− and eq. (27) will determine φ+. using the value of φ+ one can then determine the eigenfunction via eq. (18a). iii. an unconventional eigenvalue condition it appears that we have a convergent expansion for our eigenfunction. however, there is a potential di�culty. the expansions cannot always converge. indeed, the equations for the various functions determine an eigenvalue, and cannot possibly converge unless the eigenvalue condition is met. the reader might not be sure that the integrability condition of eq. (21) is the correct eigenvalue condition. one conventional way of �nding eigenvalues is through the use of an extremal principle. such a principle always exists for a hermetian matrix. the matrix, t , is not hermetian, but, as a toeplitz matrix, it has a built-in re�ection symmetry which can be used in a roughly similar manner. let χj be a vector with indices, j's, in the interval [0,n − 1]. 
then, the re�ection of this eigenvector is χ̃j = χn−1−j (29) as discussed in paper i, if χ is an eigenvector of the this toeplitz matrix, t , then the re�ected vector is automatically an eigenvector of the transpose of t . in symbols n−1∑ k=0 kj−kχk = 0 implies n−1∑ k=0 χn−1−kkk−j̃ = 0 (30) where j̃ = n−1−j. thus, if the left-hand statement is true for all j in [0,n − 1], then the right-hand statement is equally true. now we de�ne an analog of an extremal principle for the toeplitz matrix. consider the quantity: q[χ] = n/d with (31) d = n−1∑ l=0 χ̃l χl = χ̃ ·χ n = n−1∑ j,k=0 χ̃j [ tj−k − �δj,k ] χk = [χ̃ ·k ·χ] this quantity, q, reduces to zero when � is an eigenvalue of the n-th order toeplitz matrix and χ is 020003-7 papers in physics, vol. 2, art. 020003 (2010) / l. p. kadano� the corresponding eigenfunction. if χ deviates from this eigenfunction by a small amount, then q is of order of the square of the deviation. if we choose a variational function that is an eigenfunction, but one with the �wrong� eigenvalue, the variational function will have the value of the di�erence between � and that eigenvalue. since the eigenvalues vary by an amount of order unity, we might expect that a completeness argument might imply that for an arbitrarily chosen smoothly varying χ, q would be of order unity. of course, if � is not an eigenvalue of the toeplitz matrix, this extremal property is lost. unfortunately, the extremal property is not a minimum or a maximum property. hence, it might be that an incorrect variational function would nonetheless give the variational function a value zero. if q is far from zero, χ deviates considerably from the eigenvector. however, the converse is not true: q might be zero, while χ is nonetheless far from being an eigenvector. however, we shall use a small size of q as some indication of a good approximation to an eigenvector. now look at the special case in which χ is our lowest order approximation for the eigenfunction as given by the �rst term in eq. (18a), χj = ψj = (1/k >)(j) for 0 ≤ j ≤ n− 1 (32) (for simplicity, we have chosen c = 1.) with this choice the denominator has the value d = n−1∑ l=0 (1/k>)(l) (1/k>)(n− 1 − l) (33a) the �rst step in simplifying the numerator is to replace the sum over k = [0,n − 1] by a sum over k = [0,∞] minus a sum over k = [n,∞]. the sum over [0,∞] vanishes so that n = − n−1∑ j=0 ∞∑ k=n ψn−1−j ψk kj−k the j-sum is split into pieces. we sum over j = [0,∞] and subtract the piece j = [n,∞]. eq. (9) then gives a result in which n comes out as the sum of two terms, n = n++ + n+− which respectively have the values n++ = ∞∑ j,k=0 ψj+n ψk+n k−j−k−1−n = ∞∑ j,k=0 (1/k>)j+n (1/k >)k+n k−j−k−1−n (33b) and n+− = − ∞∑ k=0 ψk+n φ−k−1 = − ∞∑ k=0 (1/k>)k+n (1/k <)−k−1 (33c) the results of paper i enable us to estimate the order of magnitude of the various terms in eq. (33). the denominator eq. (33a) has the magnitude d = o(nλ), unless it is made smaller by a cancellation in the sum. the numerator term n++ has the order of magnitude n++ = o(nλ3) since each k in the product is of order nλ and the sum converges by falling o� algebraically. the numerator term n+− has the order of magnitude n+− = o(λ) since (k<) falls o� quite rapidly from its values, of order 1, for small values of k. in fact, we already calculated precisely this term when we gave our lowest order equation for the eigenvalue in eq. (24) derived from it our lowest order estimate estimate for pc in eq. (25). thus, our lowest order estimate was a demand that n+− vanish. 
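two small asides may help at this point; both are only restatements of what the text already uses. first, the sums appearing in the lowest order eigenvalue condition, eq. (24), and hence in the δp of eq. (25), are simply boundary values of k<: since k< contains only non-positive powers of z (it is expanded in 1/z, see the discussion following eq. (8)) and z_c = e^{-ip_c},

\sum_{j=1}^{\infty} k^<_{1-j} \;=\; \sum_{m\le 0} k^<_m \;=\; k^<(1), \qquad
\sum_{k=1}^{\infty} e^{ip_c(k-1)}\, k^<_{1-k} \;=\; \sum_{m\le 0} k^<_m\, z_c^{\,m} \;=\; k^<(z_c),

which is where the factors k<(1) and k<(z_c) in δp come from. second, the quantity q[χ] of eq. (31) is easy to evaluate numerically for any trial vector, which gives a quick test of an approximate eigenpair. the python sketch below is an illustration rather than the paper's calculation: a generic complex toeplitz matrix built from random coefficients stands in for the fisher-hartwig case, and q is evaluated both with an exact eigenvector and with a slightly perturbed one.

import numpy as np
from scipy.linalg import toeplitz

def Q(T, eps, chi):
    """The quantity of eq. (31): Q = (chi_tilde . K . chi) / (chi_tilde . chi),
    with K = T - eps*I and chi_tilde the reflected vector chi[n-1-j] (no complex conjugation)."""
    n = len(chi)
    K = T - eps * np.eye(n)
    chi_t = chi[::-1]                  # reflection, eq. (29)
    return (chi_t @ K @ chi) / (chi_t @ chi)

n = 80
rng = np.random.default_rng(1)
t = rng.normal(size=2 * n - 1) + 1j * rng.normal(size=2 * n - 1)
T = toeplitz(t[n - 1:], t[n - 1::-1])

eigvals, eigvecs = np.linalg.eig(T)
eps, chi = eigvals[0], eigvecs[:, 0]
print(abs(Q(T, eps, chi)))                                # ~ 0 for an exact eigenpair
print(abs(Q(T, eps, chi + 1e-6 * rng.normal(size=n))))    # small: second order in the deviation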
if we use the condition n++ + n+− = 0 to obtain another estimate of pc, that estimate may be expected to be of higher order in λ than the previous one. thus, we gain additional con�dence that our previous analysis is correct. in principle, we could carry out our expansions of the eigenfunction and auxiliary functions to any derived order and thereby make our presumed exact condition of eq. (21) true to any order. our result for the eigenvalue is �exact�, but it is not rigorous since we have not proved that our approach converges. iv. looking forward we have now completed our task of constructing an analytic (albeit heuristic) structure for an eigenfunction expansion. the work is plausible but not 020003-8 papers in physics, vol. 2, art. 020003 (2010) / l. p. kadano� proven. the next step might be to construct proofs of the convergence and exactness of these results, or alternatively, to back them up with good numerical work. the work, in fact, lacks two checks which one might hope to put into place using purely analytic means. i have not checked that the analytics yields the fisher and hartwig [9, 10] results for the product of eigenvalues. in fact, i cannot see from where they might arise. i also have not checked that somewhat di�erent approaches of subsection iii. and subsection iv. give exactly the same expression for the eigenvector. in addition, i don't know how the eigenvectors and eigenvalues at the ends of the spectrum behave. as shown by numerical evidence, they behave di�erently from the ones at the middle of the spectrum, but the di�erence has been left unexplored up to now. the di�erence arises because there are two di�erent zeros in k(z) that are well separated in the middle of the spectrum, but come together at the ends. but those words do not tell us the answer without further work. of course, there is much room for analysis of further regions of the parameters α and β. it is, in some respects, very pleasing to see that there is yet room for good additional work on this problem. acknowledgment i appreciate the help given to me by michael fisher and jacques h.h. perk. i had useful conversations with peter constantin, hui dai and seung yeop lee. this work was completed during a visit to the perimeter institute, which is supported by the government of canada through industry canada and by the province of ontario through the ministry of research and innovation. this work was also supported in part by the university of chicago mrsec program under nsf grant number dmr0213745. [1] h dai, z geary, l p kadano�, asymptotics of eigenvalues and eigenvectors of toeplitz matrices, j. stat. mech. p05012 (2009). [2] s y lee, h dai, e bettelheim, asymptotic eigenvalue distribution of large toeplitz matrices, arxiv:0708.3124v1 (2007). [3] h widom, toeplitz determinants with singular generating functions, amer. j. math. 95 333, (1973). [4] h widom, eigenvalue distribution of nonselfadjoint toeplitz matrices and the asymptotics of toeplitz determinants in the case of nonvanishing index, operator theory: adv. and appl. 48 387, (1990). [5] e basor, k morrison, the fisher-hartwig conjecture and toeplitz eigenvalues, linear algebra and its applications 202, 129 (1994). [6] r beam, r warming, the asymptotic spectrum of banded toeplitz and quasi-toeplitz matrices, presented in the 14th biennial conference on numerical analysis, dundee, scotland (1991). [7] b m mccoy, t t wu, the two-dimensional ising model, harvard university press, princeton (1973). 
[8] a böttcher, b silbermann, analysis of toeplitz operators , 2nd ed, springer, berlin (2006). [9] m e fisher, r e hartwig, toeplitz determinants, some applications, theorems and conjectures, adv. chem. phys. 15, 333 (1968). [10] m e fisher, r e hartwig, asymptotic behavior of toeplitz matrices and determinants, arch. rat. mech. anal. 32, 190 (1969). [11] http://en.wikipedia.org/wiki/ wiener%e2%80%93hopf_method 020003-9 papers in physics, vol. 9, art. 090002 (2017) received: 8 november 2016, accepted: 4 january 2017 edited by: d. gomez dumm licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.090002 www.papersinphysics.org issn 1852-4249 an alternative derivation of the dirac operator generating intrinsic lagrangian local gauge invariance brian jonathan wolk 1∗ this paper introduces an alternative formalism for deriving the dirac operator and equation. the use of this formalism concomitantly generates a separate operator coupled to the dirac operator. when operating on a cli�ord �eld, this coupled operator produces �eld components which are formally equivalent to the �eld components of maxwell's electromagnetic �eld tensor. consequently, the lagrangian of the associated coupled �eld exhibits internal local gauge symmetry. the coupled �eld lagrangian is seen to be equivalent to the lagrangian of quantum electrodynamics. i. introduction the dirac equation [1] arises from a lagrangian which lacks local gauge symmetry [2�6]. in the usual quantum �eld theoretic development, local gauge invariance is thus made an external condition of and on the lagrangian [3�6]. introduction of a vector �eld aµ that couples to the dirac �eld ψ must then be introduced in order to satisfy the imposed local symmetry constraint [2�4]. more satisfactory from a theoretic standpoint would be a formalism in which derivation of the dirac operator equation is associated with a lagrangian exhibiting internal local gauge symmetry. such a formalism would alleviate both the need to impose local gauge invariance as an external mandate as well as the need to invent and introduce a vector �eld to satisfy the constraint. symmetry would exist ab initio. this paper presents such an approach and derivation. ∗e-mail: attorneywolk@gmail.com 1 3551 blairstone road, tallahassee, fl 32301 suite 105, usa. ii. alternative formalism i. the standard approach one consequence of the standard approach [1�6, 8, 20] in deriving the dirac operator �∂ ≡ γα∂α = γ0∂/∂t−γ·∇, with γ = (γ1,γ2,γ3), which is related to the d'alembertian operator � ≡ ∂µ∂µ associated with the klein-gordon equation �ψ = −m2ψ [2� 6,20], is that cli�ord-dirac elements {γµ} arise as necessary structures of the dirac operator �∂, with the following properties [2�6,8,10,21] γα = gαδγδ; γµγ µ = 4 (1) γnγm = −γmγn(n 6= m) (γn) 2 = −1; ( γ0 )2 = 1 (γµ) † = γ0γµγ0 (γn) † = −γn � = �∂ 2 . dirac's equation �∂ψ = ±imψ follows for a fermionic �eld ψ such as the electron [2�6,21]. 090002-1 papers in physics, vol. 9, art. 090002 (2017) / b. wolk ii. an alternate formalism two conditions are set forth for developing an alternative formalism for deriving an operator, call it o, which operates on the wave function ψ for the subject fermionic particle and generates the equation governing its evolution. the �rst condition is that since the wave function ψ is a spinor, the cli�ord elements must act, if at all, as operators on it [8, 13, 20]. therefore, the applicable operator o should contain cli�ord algebra elements. 
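the clifford-dirac relations collected in eq. (1) are easy to verify numerically in an explicit representation. the python sketch below uses the standard dirac representation of the γ matrices, which is an illustrative choice not fixed by the text, and checks the anticommutation relations, the contraction γ_μ γ^μ = 4, and the hermiticity property (γ^μ)† = γ^0 γ^μ γ^0.

import numpy as np

I2 = np.eye(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def block(a, b, c, d):
    return np.block([[a, b], [c, d]])

g0 = block(I2, 0 * I2, 0 * I2, -I2)                        # gamma^0 in the Dirac representation
gs = [block(0 * I2, s, -s, 0 * I2) for s in (sx, sy, sz)]  # gamma^1, gamma^2, gamma^3
gamma = [g0] + gs                                          # gamma^mu with upper index
g = np.diag([1.0, -1.0, -1.0, -1.0])                       # metric g^{mu nu} = diag(+,-,-,-)

# anticommutators: {gamma^mu, gamma^nu} = 2 g^{mu nu} * identity
for mu in range(4):
    for nu in range(4):
        anti = gamma[mu] @ gamma[nu] + gamma[nu] @ gamma[mu]
        assert np.allclose(anti, 2 * g[mu, nu] * np.eye(4))

# gamma_mu gamma^mu = 4, with gamma_mu = g_{mu nu} gamma^nu
lower = [sum(g[mu, nu] * gamma[nu] for nu in range(4)) for mu in range(4)]
assert np.allclose(sum(lower[mu] @ gamma[mu] for mu in range(4)), 4 * np.eye(4))

# (gamma^mu)^dagger = gamma^0 gamma^mu gamma^0
for mu in range(4):
    assert np.allclose(gamma[mu].conj().T, g0 @ gamma[mu] @ g0)

print("the identities quoted in eq. (1) hold in the Dirac representation")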
the second condition is that � should be derivable from o [2,4,6];1 there must exist a mapping z : o → �, and thus the governing equation itself must satisfy2 z(o)ψ = z(±im)ψ −→ �ψ = −m2ψ. (2) to satisfy the d'alembertian condition that z : o → �, the mapping must make use of the partial derivative operators, and so the operator ∂ ≡ (∂/∂t,∇) is de�ned. to meet the cli�ord condition that o contains cli�ord elements, the operator η ≡ (γ0,γ) is put. written explicitly, these fundamental operators are ∂ = ∂/∂t + i∂/∂x + j∂/∂y + k∂/∂z, (3) and η = γ0 + iγ1 + jγ2 + kγ3. (4) we wish to use these fundamental operators in constructing o. to do so, use is made of the equivalence between the ring of quaternions h with basis (1, i,j,k) and r4 the four-dimensional vector space over the real numbers: {q ∈ h : q = u01 + bi + cj + dk|u0,b,c,d ∈ r} [10�12,14], with i2 = j2 = k2 = −1. the quaternion q can then be divided into its scalar and vector portions: {q = (u0,u) |u0 ∈ r,u ∈ r3} [11,12,14]. 1a condition also imposed by dirac. ref. [2], p. 86. 2putting z (±im) = −m2 in eq. (2) presumes that the form and domain of the mapping z is known. the more general relation would be z (f [m]) = −m2, with f [m] = ±im to be subsequently deduced. but once the form of o is discovered, the form of z becomes evident, namely z = []2, and deducing f [m] becomes trivial. additionally however, given the fact that the rhs of eq. (2) involves a square, one can intuit the correct form for z in the �rst instance. in this way, the operators given in (3) and (4) can be conceived as quaternionic operators, with the relations between the quaternionic basis elements and the cli�ord elements being [11,13] 1 = γ0γ0, i = γ2γ3, j = γ3γ1, k = γ1γ2. (5) the γµ are then the �rst-order, primary entities [8,10,20] from which the quaternionic basis is constructed.3 to generate a new operator using the fundamental operators, the product η∂ is taken. the product of two quaternionic operators v = (v0,v) and w = (w0,w) may be written as a product of their scalar and vector components in the r4 representation using the formula (v0,v)(w0,w) = (v0w0 −~v · ~w,v0 ~w + ~vw0 + ~v× ~w), (6) where v →~v, and w → ~w [10�14]. this gives η∂ ≡ (γ0,γ)(∂/∂t,∇), (7) producing the operator η∂ ≡([η∂]0, [η∂]∧) (8) =(γ0∂/∂t−γ ·∇,γ0∇ + γ∂/∂t + γ ×∇). the operator η∂ is composed of two coupled operators (and thus will operate on two coupled �elds). its �rst component operator is [η∂]0 = γ0∂/∂t−γ ·∇. (9) setting z = []2 gives a mapping z : [η∂]0 → �. this mapping satis�es eq. (2) [6,20]. the operator [η∂]0 thus satis�es both the d'alembertian and cli�ord conditions. putting [η∂]0 = o and noting the obvious equivalence o ≡ �∂, the dirac operator is thus seen to be derived from the new formalism. given eq. (2), we have oψ = ±imψ as a possible fermion �eld equation of motion. as any solution to oψ = ±imψ is also a solution to the kleingordon equation [2,6,21], this equation is naturally postulated as governing a fermionic particle such as the electron. 3note is made that η = γ0γ0γ0 + γ1γ2γ3 + γ1γ2γ3 + γ1γ2γ3. 090002-2 papers in physics, vol. 9, art. 090002 (2017) / b. wolk iii. the coupled operator a new operator which is coupled to �∂ is seen to arise within this formalism. this operator is the vector component of η∂ in eq. (8), namely [η∂]∧ = γ0∇ + γ∂/∂t + γ ×∇. (10) to maintain consistency with the formalism used with �∂, the operator [η∂]∧ is also written in the γµbasis. 
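the quaternionic product rule of eq. (6) can be checked directly against the defining relations i² = j² = k² = -1 and ij = k of the basis (1, i, j, k). below is a small python sketch, an illustration and not part of the paper, in which a quaternion is stored as a scalar part plus a three-component vector part, exactly the (u0, u) decomposition used in the text.

import numpy as np

def qmul(v, w):
    """Quaternion product in the (scalar, vector) form of eq. (6):
    (v0, v)(w0, w) = (v0*w0 - v.w, v0*w + w0*v + v x w)."""
    v0, vv = v
    w0, ww = w
    return (v0 * w0 - np.dot(vv, ww),
            v0 * ww + w0 * vv + np.cross(vv, ww))

i = (0.0, np.array([1.0, 0.0, 0.0]))
j = (0.0, np.array([0.0, 1.0, 0.0]))
k = (0.0, np.array([0.0, 0.0, 1.0]))
minus_one = (-1.0, np.zeros(3))

def eq(a, b):
    return np.isclose(a[0], b[0]) and np.allclose(a[1], b[1])

print(eq(qmul(i, i), minus_one), eq(qmul(j, j), minus_one), eq(qmul(k, k), minus_one))
print(eq(qmul(i, j), k), eq(qmul(j, k), i), eq(qmul(k, i), j))   # cyclic products
print(eq(qmul(i, j), qmul(j, i)))                                # False: the product is non-commutative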
designating this operator as ��c we have [η∂]∧ →��c = γ0(̂i∂/∂x + ĵ∂/∂y + k̂∂/∂z) + γ1(̂i∂/∂t− ĵ∂/∂z + k̂∂/∂y) + γ2(̂i∂/∂z + ĵ∂/∂t− k̂∂/∂x) + γ3(−î∂/∂y + ĵ∂/∂x + k̂∂/∂t). (11) since the operators ( �∂,��c ) are coupled, when �∂ operates on some �eld so should ��c. inspection of eq. (11) shows that ��c's operation must be of a di�erent sort and on a di�erent yet coupled �eld. to see how ��c operates and on what, some notation is �rst required. a = a0(x) + a1(x)̂i + a2(x)ĵ + a3(x)k̂ represents a four-vector �eld, for which we can associate the cli�ord �eld �a = aµγ µ, with aµ ≡ aµ(x) being the �eld components of a. there is thus a component-wise bijection between �a and a. a cli�ord vector �eld is de�ned as �c = cµγµ, with each cµ being its own vector �eld. in this way, a general cli�ord vector �eld operator is de�ned as ��4 = 4 αγα, with each component 4α being its own vector �eld operator. in standard vector analysis, vector �eld operators operate on scalar �elds [15]. following suit, in order for a cli�ord vector �eld operator's (��4) component vector �eld operators (4α) to operate on the scalar �elds aµ of a cli�ord �eld �a, an operation · must be de�ned such that ��4·�a = 4 µγµγ µaµ = 4µaµ = 4µaνgµν. (12) using this formalism, the components c α of ��c are given by eq. (11).4 choosing a cli�ord �eld of the general form 4for instance, c 0 = ∇. �φ ≡ γµφµ = γ0φ0 + γ1φ1 + γ2φ2 + γ3φ3, (13) with φµ ≡ φµ(x), and operating on�φ with ��c gives ��c ·�φ = (̂i∂/∂x + ĵ∂/∂y + k̂∂/∂z)φ0 + (̂i∂/∂t− ĵ∂/∂z + k̂∂/∂y)φ1 + (̂i∂/∂z + ĵ∂/∂t− k̂∂/∂x)φ2 + (−î∂/∂y + ĵ∂/∂x + k̂∂/∂t)φ3. (14) we have then the coupled �eld (ψ, φµ) through action of the operator η∂. unlike ψ, the φµ are not 4-element column matrices and are not spinor �elds, since operating through in eq. (14) excises the cli�ord elements. rearranging terms give the following set of six vector �eld components: (∂φ0/∂x + ∂φ1/∂t)̂i (∂φ0/∂y + ∂φ2/∂t)ĵ (∂φ0/∂z + ∂φ3/∂t)k̂ (∂φ2/∂z −∂φ3/∂y)̂i (∂φ3/∂x−∂φ1/∂z)ĵ (∂φ1/∂y −∂φ2/∂x)k̂ (15) these equations can be identi�ed with the components of two vector �elds x = −∇φ0 −∂~φ/∂t (16) and y = ∇× ~φ, (17) with ~φ = (φ1, φ2, φ3). these equations represent the six independent components of an antisymmetric �eld tensor h, which ��c ·�φ has generated. there is thus a one-to-one and onto correspondence: {±��c ·�φ ↔ h}. therefore, h can be written as the curl of the cli�ord scalar �eld components hµν ≡ ∂µφν −∂νφµ. (18) h is then formally equivalent to the electromagnetic �eld tensor [6, 16, 19, 22]. using the component-wise bijection stated above: {�a ↔ a}, 090002-3 papers in physics, vol. 9, art. 090002 (2017) / b. wolk the components of �φ are identi�ed with the components of the electromagnetic potential vector a: aµ ≡ φµ. this being the case, aµ5 represents a massless vector �eld (the photon) abiding by the gauge invariance condition [2,3,6,9,17�19,22] aµ −→ aµ + ∂µλ. (19) i. the coupled locally gauge symmetric lagrangian the gauge invariance condition, eq. (19), can be exploited to impose an additional constraint on the potential aµ, namely the lorenz condition ∂µa µ = 0 [2, 6].6 with the aid of the lorenz gauge, the lagrangian for the �eld aµ with source j µ [2,6,18] can be written as laµ = − 1 16π hµνhµν − 1 c jµaµ. (20) the lagrangian for the dirac �eld ψ is given by [2,6] lψ = i~cψγµ∂µψ −mc2ψψ. (21) while exhibiting global gauge invariance, the dirac lagrangian lψ is not locally gauge invariant [2� 6]. 
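stepping back for a moment to the identification made in eqs. (15)-(18): it can be verified symbolically that the six expressions listed in eq. (15) are, up to the overall sign that the text absorbs into the ± of the correspondence with h, exactly the components of x = -∇φ⁰ - ∂φ/∂t and y = ∇×φ. the sympy sketch below is my own check, with the field components taken as generic functions of (t, x, y, z).

import sympy as sp

t, x, y, z = sp.symbols('t x y z')
phi0, phi1, phi2, phi3 = [sp.Function(f'phi{m}')(t, x, y, z) for m in range(4)]

# the six components listed in eq. (15)
listed = [
    sp.diff(phi0, x) + sp.diff(phi1, t),
    sp.diff(phi0, y) + sp.diff(phi2, t),
    sp.diff(phi0, z) + sp.diff(phi3, t),
    sp.diff(phi2, z) - sp.diff(phi3, y),
    sp.diff(phi3, x) - sp.diff(phi1, z),
    sp.diff(phi1, y) - sp.diff(phi2, x),
]

# X = -grad(phi0) - d(phi_vec)/dt and Y = curl(phi_vec), eqs. (16)-(17)
X = [-sp.diff(phi0, q) - sp.diff(p, t) for q, p in [(x, phi1), (y, phi2), (z, phi3)]]
Y = [sp.diff(phi3, y) - sp.diff(phi2, z),
     sp.diff(phi1, z) - sp.diff(phi3, x),
     sp.diff(phi2, x) - sp.diff(phi1, y)]

# each listed component equals minus the corresponding component of X or Y
print([sp.simplify(listed[i] + X[i]) for i in range(3)])       # [0, 0, 0]
print([sp.simplify(listed[3 + i] + Y[i]) for i in range(3)])   # [0, 0, 0]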
the usual quantum �eld theoretic approach is to mandate local gauge symmetry [3, 6], thereby requiring subsequent introduction of a new vector �eld aµ in order to meet this mandate [2�6]. the current formalism does not require such a method. the lagrangian for the coupled �eld is thus l(ψ,aµ) ≡lψ + laµ =[i~cψγµ∂µψ −mc2ψψ] − 1 16π hµνhµν − (eψγµψ)aµ, (22) where ceψγµψ = jµ is the quantum �eld current density satisfying the conservation equation [2,6,7] ∂µj µ = 0. (23) this is an important result; for the conservation equation is a consequence of the intrinsic 5where aµ is now taken to represent the electromagnetic four-vector potential. 6this gauge condition is often incorrectly referred to as the lorentz condition, vice the correct attribution as the lorenz condition [23]. gauge symmetry of l(ψ,aµ), since j µ is simply the noether current corresponding to the local phase transformation ψ → eiα(x)ψ concomitant with eq. (19) as part of the local gauge invariance transformation [21]. as the ward identity, given by kµmµ (k) = 0, is an expression which results from this current conservation,7 it follows that the ward identity is intrinsically manifest as well in the current formalism as a consequence of the inherent local gauge symmetry of the lagrangian.8 the form of the interaction term (eψγµψ)aµ of l(ψ,aµ) arises naturally in this formalism. an intrinsically coupled �eld must have a coupling parameter in this case e, the electric charge and a lagrangian interaction term [2,3,6]. further, in relativistic quantum mechanics, the probability current ψγµψ takes the role of the conserved current jµ of the wave function ψ [2, 7, 21]. it is natural then to integrate the coupling parameter along with the probability current into the interaction term of eq. (20). this results in the selfsame interaction term found via the standard derivation through imposed local gauge symmetry [2,6,21,22]. l(ψ,aµ) is locally gauge invariant [2, 3, 6, 7, 22]. the alternative formalism thus produces a coupled �eld (ψ,aµ) which is represented by an internally local gauge symmetric lagrangian. there is no need then to either mandate local gauge invariance or thereafter to introduce an external �eld to meet the mandate, as both are inherent to the formalism; symmetry exists from inception. lastly, it is seen that l(ψ,aµ) ≡ lqed, the lagrangian of quantum electrodynamics.9 in canonically quantizing the theory this equivalence of lagrangians is conditioned on modi�cation of the 7ref. [21], sections 5.5 and 7.4. where m(k) = �µ(k)mµ(k) is the amplitude for some quantum electrodynamic process involving an external photon with momentum k. 8ref. [2], section 13.2.4 (local gauge invariance ←→ current conservation ←→ ward identities). 9this paper does not contemplate the yang-mills generalization and extension of gauge invariance to non-abelian groups such as u(1)⊗su(2) of the weak interaction or quantum chromodynamic's su(3) [21,22], but only a formalism for an intrinsic local u(1) symmetry of qed. therefore, such symmetries as the becchi, rouet, stora and tyutin (brst) symmetry which is typically covered in quantization of nonabelian gauge theories is not addressed herein, but is left to the possible extension of this paper's formalism to such non-abelian generalizations with their associated invariant full e�ective lagrangians [22]. 090002-4 papers in physics, vol. 9, art. 090002 (2017) / b. wolk lorenz condition relied on above in generating laµ. 
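as a consistency check of the statement that l(ψ, aµ) in eq. (22) is locally gauge invariant, one can track its pieces separately under the gauge shift of eq. (19) combined with the local phase rotation of ψ. the short calculation below is only a restatement of the standard argument, written in the units and with the interaction term of eqs. (20)-(22).

under $a_\mu \to a_\mu + \partial_\mu\lambda$ the tensor $h_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$ is unchanged, so the $-\tfrac{1}{16\pi}h^{\mu\nu}h_{\mu\nu}$ term is invariant. under $\psi \to e^{i\alpha(x)}\psi$, the dirac part of eq. (22) shifts by
$$\delta\mathcal{L}_\psi = i\hbar c\,\bar\psi\gamma^\mu\big(i\partial_\mu\alpha\big)\psi = -\hbar c\,(\partial_\mu\alpha)\,\bar\psi\gamma^\mu\psi ,$$
while the interaction term shifts by
$$\delta\big[-(e\bar\psi\gamma^\mu\psi)\,a_\mu\big] = -e\,(\partial_\mu\lambda)\,\bar\psi\gamma^\mu\psi .$$
the two shifts cancel when the phase and the gauge function are tied together as $\alpha(x) = -\tfrac{e}{\hbar c}\,\lambda(x)$; this combined transformation is the local gauge symmetry of $\mathcal{L}_{(\psi, a_\mu)}$, and the conserved noether current associated with it is the $j^\mu = c e\,\bar\psi\gamma^\mu\psi$ of eq. (23).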
for the canonically quantized formalism, gupta-bleuler's weak lorenz condition given by ∂µa µ+ |ψ〉 = 0 replaces the lorenz condition, in which aµ+ acts as the photon lowering quantum �eld operator and |ψ〉 represents a ket of any number of photons [2,21,22].10 it follows from this conditioned equivalence that the new formalism generates all of electrodynamics and speci�es the current produced by the subject dirac �elds [2,3,6,21].11 iv. conclusion local gauge symmetry plays the central, dominant role in modern �eld theory [22]. that being the case, it would be preferable that the intrinsic structure of fundamental physical theories exhibit this symmetry ab initio. therefore, a formalism which produces the dirac operator equation exhibiting inherent local gauge invariance while also jettisoning the need for invention of an auxiliary vector �eld in order to satisfy an imposed symmetry constraint is more satisfying from a theoretic standpoint. this paper's formalism achieves such an internal local symmetry, and in doing so naturally generates the fundamental equations of quantum electrodynamics. such a uni�ed description of these basic equations and their processes may also lead to a deeper understanding of the origin of these phenomena. [1] p a m dirac, the quantum theory of the electron, proc. roy. soc. lond. a117, 610 (1928); ibid, part ii, a118, 351 (1928). [2] r d klauber, student friendly quantum �eld theory, sandtrove press, fair�eld, iowa (2013). [3] s weinberg, the quantum theory of �elds, vol. i, cambridge university press, cambridge (2005). [4] p j e peebles, quantum mechanics, princeton university press, princeton, new jersey (1992). 10this modi�cation functions as a necessary constraint on the longitudinal and scalar photons in any given quantum state, permitting their mutual cancellation when calculating the hamiltonian expectation value of the quantized �eld theory [2,22]. 11see, e.g., ref. [6], p. 360 and ref. [21], p. 78. [5] a zee, quantum �eld theory in a nutshell, princeton university press, princeton, new jersey (2003). [6] d gri�ths, introduction to elementary particles, 2nd rev. ed., wiley-vch, weinheim, germany (2008). [7] w pauli, wave mechanics, pauli lectures on physics, vol. 5, dover publications inc., mineola, new york (1973). [8] r penrose, w rindler, spinors and spacetime, vol. i, cambridge university press, cambridge (1984); ibid vol. ii (1986). [9] p a m dirac, directions in physics, john wiley & sons, new york (1978). [10] d hestenes, g sobczyk, cli�ord algebra to geometric calculus, reidel, dordrecht (1999). [11] j dieudonne, foundations of modern analysis, academic press, new york (1960). [12] o veblen, j w young, projective geometry, ginn & co., boston (1918). [13] p lounesto, cli�ord algebras and spinors, cambridge university press, cambridge (2001). [14] t w judson, abstract algebra, pws publishing, texas (1994). [15] g e hay, vector and tensor analysis, dover publications inc. mineola, new york (1953). [16] m schwartz, principles of electrodynamics, dover publications inc., mineola, new york (1972). [17] j r lucas, p e hodgson, spacetime and electromagnetism: an essay on the philosophy of the special theory of relativity, clarendon press, new york (1990). [18] w pauli, theory of relativity, pergamon press, oxford (1958). [19] w greiner, classical electrodynamics, springer-verlag, new york (1998). [20] r penrose, the road to reality:a complete guide to the laws of the universe, vintage books, new york (2004). 090002-5 papers in physics, vol. 9, art. 
090002 (2017) / b. wolk [21] m e peskin, d v schroeder, an introduction to quantum �eld theory, (economy edition), westview press, reading, massachusetts (2016). [22] g sterman, an introduction to quantum �eld theory, cambridge university press, cambridge (1993). [23] j d jackson, l b okun, historical roots of gauge invariance, rev. mod. phys. 73, 663 (2001). 090002-6 papers in physics, vol. 2, art. 020002 (2010) received: 23 december 2009, accepted: 24 february 2010 edited by: d. a. stariolo licence: creative commons attribution 3.0 doi: 10.4279/pip.020002 www.papersinphysics.org issn 1852-4249 multilayer approximation for a confined fluid in a slit pore g. j. zarragoicoechea,1, 2∗ a. g. meyra,1 v. a. kuz1 a simple lennard–jones fluid confined in a slit nanopore with hard walls is studied on the basis of a multilayer structured model. each layer is homogeneous and parallel to the walls of the pore. the helmholtz energy of this system is constructed following van der waals-like approximations, with the advantage that the model geometry permits to obtain analytical expressions for the integrals involved. being the multilayer system in thermodynamic equilibrium, a system of non-linear equations is obtained for the densities and widths of the layers. a numerical solution of the equations gives the density profile and the longitudinal pressures. the results are compared with monte carlo simulations and with experimental data for nitrogen, showing very good agreement. i. introduction the effects on phase transition of confined fluids in a slit-like pore have been studied by simulation and different theories [1–11]. in a previous work, we constructed a generalized van der waals equation for a fluid confined in a nanopore [12, 13]. the shift of the critical parameters was in good agreement with lattice model and numerical simulation results, and the predicted critical temperature remarkably reproduced the experiment. in that work, we concluded that the confined van der waals fluid theory seemed to work better than the bulk one, maybe due to the fact that the higher virial contributions not considered in both theories were less important in the confined fluid than in the bulk. a similar treatment was used previously by schoen and diestler [14]. following that ∗e-mail: vasco@iflysib.unlp.edu.ar 1 iflysib-instituto de f́ısica de ĺıquidos y sistemas biológicos (conicet, unlp, cicpba), 59 no. 789, 1900 la plata, argentina. 2 cicpba-comisión de investigaciones cient́ıficas de la prov. de buenos aires. line of reasoning, here we study a simple fluid confined between two infinite parallel hard walls (slit pore). the walls are at a distance l apart. to study the confined fluid, we propose a multilayer model [15]: the fluid is distributed in n thin layers, one beside the other. each layer has a uniform density, and can be observed as a non-autonomous phase. a particle in a given layer interacts with its neighbors inside the layer, and with every particle in the other layers. defay and prigogine and murakami et al. have shown that, in a liquid gas interface, the deviation from the gibbs’ adsorption equation becomes practically negligible in the case of a two layer model [16], and that as the number of transition layers grows, the multilayer model becomes perfectly consistent with the gibbs’ equation [17]. the van der waals-like approximations made in developing this multilayer model theory limit its validity to the low density regime. ii. 
theory the model system consists of a fluid of n lennard– jones particles confined in a slit nanopore. the hard walls of the pore, separated at a distance l (in 020002-1 papers in physics, vol. 2, art. 020002 (2010) / g. j. zarragoicoechea et al. the x direction), have a surface area s (s → ∞). we divided the fluid into n layers, each layer being parallel to the pore walls. the layer i has ni particles (n = ∑n i=1 ni), a width lxi (l= ∑n i=1 lxi), and a volume vi = slxi. then the helmholtz energy [18] can be written as a = −kt ln  znλ−3nn∏ i=1 ni!   . (1) the configuration integral zn for a pair potential vij may be approximated as zn = ∫ n∏ i=1 ii ninjv ni−1 i v nj−1 j n∏ k=1 k 6=i,k 6=j v nk k ∫ ri ∫ rj f12dr1dr2 + n∏ i=1 v nii . the first term in eq. (3) stands for particles in the layer i . the second term comes from the interaction of one particle in layer i with one particle in layer j . in a compact form, and assuming that a layer sees three nearest neighbor layers, zn = ( n∑ i=1 n2i 2v 2i ii (4) + n−1∑ i=1 i+3∑ j=i+1 j≤n ni vi nj vj iij + 1 ) n∏ i=1 vi ni. the integrals ii and iij , for the slit pore geometry and after low density approximations, can be analytically solved to give ii = ∫∫ ri f12dr1dr2 ≈− ∫∫ |r1−r2|<σ dr1dr2 − ∫∫ |r1−r2|≥σ v12 kt dr1dr2 = −2viσ3(b−bi) − 2viσ 3ε kt ai (5) ii, i+1 = ∫ ri ∫ ri+1 f12dr1dr2 ≈− ∫∫ |r1−r2|<σ dr1dr2 − ∫∫ |r1−r2|≥σ v12 kt dr1dr2 = −viσ3bi − viσ 3ε kt ai, i+1 (6) ii, i+2 = ∫ ri ∫ ri+2 f12dr1dr2 ≈ − ∫∫ |r1−r2|≥σ v12 kt dr1dr2 = −viσ 3ε kt ai, i+2 (7) ii, i+3 = − viσ 3ε kt ai, i+3 (8) in the above expressions vij was taken to be the lennard–jones pair interaction, being ε and σ the potential parameters. the integrals ii, i+2 and ii, i+3 do not contain the excluded volume term because we suppose that the layer widths are lxi ≥ σ. the expressions for a and b in the preceding equations, functions of lxi , are given in the appendix a. the helmholtz energy, eq. (1) together with eq. (4), has the final expression a ≈−kt   n∑ i=1 n2i 2v 2 i ii + n−1∑ i=1 i+3∑ j=i+1 j≤n ni vi nj vj ii,j   − n∑ i=1 nikt ln vi ni + nkt (ln λ3 − 1) (9) 020002-2 papers in physics, vol. 2, art. 020002 (2010) / g. j. zarragoicoechea et al. the pressure tensor [12, 13] and chemical potentials are obtained from the following equations pxx,i = − 1 lyilzi ( ∂a ∂lxi ) t,n pyy,i = pzz,i = − 1 lxilyi ( ∂a ∂lzi ) t,n (10) µi = ( ∂a ∂ni ) t,v,nj 6=i if the system is in mechanical and chemical equilibrium, the xx components of the pressure tensor and the chemical potentials for each layer must be equal. from these equations, giving as input the wall separation l and the mean density ρ∗ = ρσ3, it is constructed a system of (n − 1) non-linear equations with (n − 1) unknowns (layer densities and widths) to be numerically solved. the low computational cost is taken for granted given that the code is easily written and the calculations are carried out on a pentium 4 processor running at 2.66 ghz. at a temperature t∗=kt /ε=1, we have explored the cases with l=10σ and l=15σ, at different mean densities. we have also compared the theoretical results with experimental data coming from studies of nitrogen adsorption in graphite slit pores at room temperature [19]. iii. monte carlo simulation for numerical simulations, n lennard–jones particles are confined between hard walls separated at a distance l. the unit cell is build up taken the walls to be of size ly and lz in the y and z directions respectively, directions on which the periodical boundary conditions are applied. 
the density profiles and pressures were obtained taking average values in fluid slabs parallel to the walls. the pressure tensor was used in the simple virial form, as indicated in references [20, 21]. with t∗=1, and for both slit pore widths l=10σ and l=15σ, the size of the unit cell was set to ly =lz =30σ, taking the number of particles n to correspond with the mean density. the range of the lennard–jones interactions was considered with a cutoff radius of 5σ. figure 1: density profiles for a n=9 layer model of a confined fluid in a slit pore (solid symbols). the temperature is t∗=1.0 and the wall separation is l=10σ, with mean densities ρ∗=1/20 (circles), 1/15 (squares), 1/10 (triangles), 1/5 (diamonds), and 1 4 (stars). open symbols represent the monte carlo simulations. figure 2: zz pressure tensor component. captions as in fig. 1. iv. results in figs. 1 and 2, the density profiles and zz components of the pressure tensor are shown for t∗=1 and l=10σ. the mean densities studied are ρ∗=1/20, 1/15, 1/10, 1/5, and 1/4. the agreement of the theoretical density profiles with the monte carlo simulations is very good. for the pressure there is a rather good correspondence for low densities, up to ρ∗=1/10. for the higher densities, dif020002-3 papers in physics, vol. 2, art. 020002 (2010) / g. j. zarragoicoechea et al. figure 3: density profiles for a n=13 layer model of a confined fluid in a slit pore (solid symbols). the temperature is t∗=1.0 and the wall separation is l=15σ, with mean densities ρ∗=1/10 (triangles), and 1/5 (circles). open symbols represent the monte carlo simulations. ferences appear, though the tendencies are similar. the discrepancies come first from the low density approximations done to get the helmholtz energy. but, while in the simulation slab particles fluctuate and at higher densities some clusterization occurs, in the theory each layer is supposed to have a homogeneous density which makes it hard for the theoretical pressures to follow those obtained by simulation. for the density profiles, averaging the number of particles in each slab evidently compensates the clusterization, and the theory gives good results, at least for the rather low densities studied. the same picture applies to the behavior of the system for t∗=1 and l=15σ, at mean densities ρ∗= 1/10, and 1/5, represented in figs. 3 and 4. the results, as expected for hard repulsive walls, show a low density region next to the walls and an increasing density profile, with a maximum at the center of the slit pore. this behavior is also shown with density functional theory [1] and in other monte carlo simulations [2]. finally, the good agreement of the theory with the experiment can be seen in the results shown in fig. 5. in this figure, the excess number of molecules per unit area of pore surface γ is plotted in function of the external pressure, at t∗=3.18 and l=4σ. these parameters approximate the experimental values [19] t=303 k and l=1.45 nm, if figure 4: zz pressure tensor component. captions as in fig. 3. ε/k = 95.2 k and σ = 3.75 å are used to characterize the nitrogen. in this case, due to the size of the sample, n=3 layers have been used for calculation. γ is defined as γ = n −ng s = (ρ∗ −ρ∗g) l σ3 (11) where ng/ρ∗g is the number/density of particles which would occupy the slit pore in the absence of the adsorption forces. ρ∗g and the external pressure are determined equating the chemical potential inside the slit pore (eq. 
10) to the chemical potential coming from the bulk van der waals equation at the same temperature. the theoretical results presented here are similar to the numerical simulation results obtained by the same authors who have done the experiment [19]. they assume that the differences at higher pressures could be a consequence of the uncertainty in the determination of the pore geometry. v. conclusions the application of a simple theory, with van der waals-like approximations to the helmholtz energy, to a particular model of spatial distribution makes it possible to obtain analytical expressions for the thermodynamic quantities. the study of a confined fluid in a slit pore geometry with a multilayer approximation produces good results when compared with monte carlo simulations at low densities. the agreement with a particular experi020002-4 papers in physics, vol. 2, art. 020002 (2010) / g. j. zarragoicoechea et al. figure 5: excess number of molecules per unit area of pore surface γ as function of the external pressure. the full line represents the experiment (digitalized from ref. [19]), and the dots are our theoretical results. ment on nitrogen confined in a graphite slit pore is remarkable, even though an excess quantity is in study. it may be concluded that the confinement reduces the importance that higher virial contributions have on the equation of the state of the confined fluid. classical density functional theory [22] can also be applied to study the slit pore geometry, with very good agreement with experiments and simulations. though the theoretical work developed in these pages is not a competitor of density functional theory, it has the advantages of having analytical expressions, and the possibility of easily introducing two immiscible components: for instance one or two layer lubricants wetting the walls and a gas or a liquid filling the rest of layers forming the capillary volume. acknowledgements this work was partially supported by universidad nacional de la plata and cicpba. g. j. z. is member of “carrera del investigador cient́ıfico” cicpba. appendix a expressions of quantities used in eqs. 5–8: b = 2 3 π; bi = π4 σ lxi ; ai = a1 + a2 lxi + a3 l3 xi + a4 l9 xi a1 = −169 π; a2 = 3 2 π; a3 = −13π; a4 = 1 90 π (a1) a correction has been made to get good critical parameters for the bulk (l →∞). for argon a1= -5.7538 and b=1.3538, and for nitrogen a1= -1.5955 and b=1.0349. ai, i+1 = π90 [ − 1 l9 xi − 1 lxil 8 xi+1 + 1 lxi(lxi+lxi+1)8 ] −π 3 [ − 1 l3 xi − 1 lxil 2 xi+1 + 1 lxi(lxi+lxi+1)2 ] −3 2 π lxi (a2) ai, i+2 = π90 [ 1 l8 xi+1 − 1 (lxi+lxi+1)8 − 1 (lxi+1+lxi+2)8 + 1 (lxi+lxi+1+lxi+2)8 ] 1 lxi −π 3 [ 1 l2 xi+1 − 1 (lxi+lxi+1)2 − 1 (lxi+1+lxi+2)2 + 1 (lxi+lxi+1+lxi+2)2 ] 1 lxi (a3) ai, i+3 = π90 [ 1 (lxi+1+lxi+2)8 − 1 (lxi+lxi+1+lxi+2)8 − 1 (lxi+1+lxi+2+lxi+3)8 + 1 (lxi+lxi+1+lxi+2+lxi+3)8 ] 1 lxi −π 3 [ 1 (lxi+1+lxi+2)2 − 1 (lxi+lxi+1+lxi+2)2 − 1 (lxi+1+lxi+2+lxi+3)2 + 1 (lxi+lxi+1+lxi+2+lxi+3)2 ] 1 lxi (a4) [1] s a sartarelli, l szybisz, correlation between asymmetric profiles in slits and standard prewetting lines, pap. phys. 1, 010001 (2009); l szybisz, s a sartarelli, density profiles of ar adsorbed in slits of co2 : spontaneous symmetry breaking revisited, j. chem. phys. 128, 124702 (2008). [2] m schoen, computer simulation of condensed phases in complex geometries (lecture notes in physics), springer, berlin (1993). [3] m schoen, structure and phase behavior of confined soft condensed matter, in: computational methods in surface and colloid science, ed. 
m borowko, pag. 1, marcel dekker, new york (2000). 020002-5 papers in physics, vol. 2, art. 020002 (2010) / g. j. zarragoicoechea et al. [4] s dietrich, fluids in contact with structured substrates, in: new approaches to problems in liquid state theory, eds. c caccamo, j p hansen, g stell, pag. 197, kluwer, dordrecht (1999). [5] j p r b walton, n. quirke, capillary condensation: a molecular simulation study, mol. simul. 2, 361 (1989). [6] l d gelb, k e gubbins, r radhakrishnan, m sliwinska-bartkowiak, phase separation in confined systems, rep. prog. phys. 62, 1573 (1999). [7] a maciolek, a ciach, r evans, critical depletion of fluids in pores: competing bulk and surface fields, j. chem. phys. 108, 9765 (1998). [8] a maciolek, r evans, n b wilding, effects of weak surface fields on the density profiles and adsorption of a confined fluid near bulk criticality, j. chem. phys. 119, 8663 (2003). [9] p b balbuena, k e gubbins, classification of adsorption behavior: simple fluids in pores of slit-shaped geometry, fluid phase equilib. 76, 21 (1992). [10] p b balbuena, k e gubbins, theoretical interpretation of adsorption behavior of simple fluids in slit pores, langmuir 9, 1801 (1993). [11] o pizio, a patrykiejew, s sokolowski, phase behavior of lennard-jones fluids in slit-like pores with walls modified by preadsorbed molecules: a density functional approach, j. phys. chem. c 111, 15743 (2007). [12] g j zarragoicoechea, v a kuz, van der waals equation of state for a fluid in a nanopore, physical review e 65, 021110 (2002). [13] g j zarragoicoechea,v a kuz, critical shift of a confined fluid in a nanopore, fluid phase equilib. 220, 7 (2004). [14] m schoen, d j diestler, liquid-vapor coexistence in a chemically heterogeneous slitnanopore, chem. phys. letters 270, 339 (1997). [15] r defay, i prigogine, surface tension and adsorption, longmans, london (1966). [16] r defay, i prigogine, surface tension of regular solutions, trans. faraday soc. 46, 199 (1950)). [17] t murakami, s ono, m tamura, m kurata, on the theory of surface tension of regular solution, j. phys. soc. japan 6, 309 (1951). [18] l d landau, e m lifshitz, f́ısica estad́ıstica, reverté, barcelona (1969), pp. 269–273. [19] k kaneko, r f cracknell and d nicholson, nitrogen adsorption in slit pores at ambient temperatures: comparison of simulation and experiment, langmuir 10, 4606 (1994). [20] m schoen, d j diestler, analytical treatment of a simple fluid adsorbed in a slit-pore, j. chem. phys. 109, 5596 (1998). [21] m p allen, d j tildesley, computer simulation of liquids, oxford university press, london (1987), pag. 46–47. [22] j wu, density functional theory for chemical engineering: from capillarity to soft materials, aiche j. 52, 1169 (2006). 020002-6 papers in physics, vol. 7, art. 070018 (2015) received: 2 november 2015, accepted: 27 november 2015 edited by: r. dickman reviewed by: m. hutter, australian national university, canberra, australia. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070018 www.papersinphysics.org issn 1852-4249 bayesian regression of piecewise homogeneous poisson processes diego j. r. sevilla1∗ in this paper, a bayesian method for piecewise regression is adapted to handle counting processes data distributed as poisson. a numerical code in mathematica is developed and tested analyzing simulated data. the resulting method is valuable for detecting breaking points in the count rate of time series for poisson processes. i. 
introduction bayesian statistics have revolutionized data analysis [1]. techniques like the generalized lombscargle periodogram [2] allow us to obtain oscillation frequencies of time series with unprecedented accuracy. the gregory and loredo method [3] goes further allowing us to find and characterize periodic signals of any period and shape. to detect non-periodical variations, the exact bayesian regression of piecewise constant functions by marcus hutter (hereafter hutter’s method) [4] is valuable. it permits to estimate the most probable partition of a data set in segments of constant signals, determining the number of segments and their borders, and in-segments means and variances. hutter’s method works with two continuous distributions: normal, and cauchy-lorentz. the latter —the canonical example of a pathological distribution with undefined moments—, is also suitable to analyze data with other symmetric probability distributions, especially with heavy tails. in the case of counting processes, especially for ∗e-mail: dsevilla@fceia.unr.edu.ar 1 departamento de f́ısica y qúımica, escuela de formación básica. facultad de ciencias exactas, ingenieŕıa y agrimensura. universidad nacional de rosario, av. pellegrini 250, s2000btp rosario, argentina. low rates, when data consist in non-negative small integers, methods specially designed to discrete probability distributions are necessary. some regression methods, specially for non-homogeneous poisson processes [5], were developed. in this paper, hutter’s method is adapted for analyzing data distributed as poisson. the results are summarized in a code in mathematica [6]. it can be used to analyze data of several physical processes which follow the poisson distribution (e.g., detection of photons in x-ray astronomy, particles in nuclear disintegration, etc.), if sudden changes in detection rates are suspected. ii. method hutter’s method is summarized in table 1 of ref. [4] in a pseudo c code which is divided in two blocks. the first one calculates moments akij with k = 0, 1, 2 of the pdf of the statistical models for segments of data dij := {ni+1, . . . ,nj}. the second one performs the regression from moments akij. the code developed in this work is divided in three blocks. as the members of the poisson distributions family are identified by one parameter -the mean rate r of the poisson process-, the pdf of the models 070018-1 papers in physics, vol. 7, art. 070018 (2015) / d. j. r. sevilla for a segment dij is [1] p(r|dij,i) = p(r|i) p(dij|r) p(dij) , (1) where p(r|i) is the prior of parameter r, p(dij|r) is the likelihood of segment dij for a given r, p(dij) is the global likelihood of the family, and i represents a prior information. usually, the prior information consists of global quantities calculated from d := d0n , i.e., from all the data set. for poisson processes, only one quantity is necessary: the mean rate r̂. considering the conjugate prior of the poisson distribution [7], the prior results p(r|r̂) = rr̂−1 e−r γ(r̂) . (2) for a poisson process with rate r, the likelihood of a segment dij is p(dij|r) = j∏ t=i+1 rnt e−r nt! . (3) so, the moments of the posterior can be expressed in an analytical form akij = γ(k + r̂ + ∑j t=i+1 nt) ∏j t=i+1 1 nt! γ(r̂) (j − i + 1)k+r̂+ ∑j t=i+1 nt . (4) code block 1 calculates akij. it needs as input the time series to be analyzed (list data). the output are functions a0[i,j], a1[i,j] and a2[i,j] and integer n, which is the length of data. code block 1: mathematica code to calculate akij. 
n=length[data]; r=mean[data]; do[ do[ d=j-i; m=r+sum[data[[t]],{t,i+1,j}]; a0[i,j]=(m -1)!/( gamma[r]*(d+1)^m)* product [1/ data[[t]]!,{t,i+1,j}]; a1[i,j]=m*a0[i,j]/(d+1); a2[i,j]=(m+1)*a1[i,j]/(d+1); ,{j,i+1,n}]; ,{i,0,n}]; as the second block of hutter’s code only needs the moments akij as inputs, it could work properly with no changes. it computes the evidence, the probability for k segments and its map estimation k̂, the probability of boundaries locations and the map locations of the k̂ boundaries, the first and second in-segment moments, and an interesting regression curve that smooths the final result. nevertheless, for our specific problem, once the segments boundaries are obtained, we can estimate their means and variances straightforwardly, so we only use a part of hutter’s second block, which is shown in code block 2. the logical of the algorithm is explained in ref. [4]. code block 2 needs as inputs a0[i,j], a1[i,j], a2[i,j] and n, all calculated in code block 1, and integer kmax, which is the maximum number of segments to be considered. the outputs are the evidence (e), the probability for k segments (c[k]), its map (khat), the probability of boundaries locations (b[i]), and their map (that[p]). code block 2: mathematica code to calculate breaking points. do[ l[0,i]= kroneckerdelta[i,0]; r[0,i]= kroneckerdelta[i,n]; ,{i,0,n}]; do[ do[ l[k+1,i]=sum[l[k,h]*a0[h,i],{h,k,i-1}]; r[k+1,i]=sum[r[k,h]*a0[i,h],{h,i+1,n-k}]; ,{i,0,n}] ,{k,0,kmax -1}]; e=1/ kmax*sum[l[k,n]/ binomial[n-1,k-1],{k,1,kmax }]; do[ c[k]=l[k,n]/( binomial[n-1,k-1]* kmax*e) ,{k,1,kmax }]; khat =1; do[if[c[khat]reg[[bp[k]]],1,-1]; bp1[k]=bp[k]+s*ethat[k]; bp2[k]=bp[k]-s*ethat[k]; ,{k,1,nbp -1}]; re1=flatten[table[table[(nn[k]-sqrt[nn[k]])/mm[k] ,{bp1[k]-bp1[k-1]}] ,{k,1,nbp }]]; re2=flatten[table[table[(nn[k]+sqrt[nn[k]])/mm[k] ,{bp2[k]-bp2[k-1]}] ,{k,1,nbp }]]; iii. applications and discussion figure 1 (top) shows, in blue dots, data simulated using mathematica. data consist of 150 poisson distributed elements, the first 50 with rate 1.5, the second 50 with rate 0.5, and the last 50 with rate 1.0. applying the first 2 blocks of code on data, 0 50 100 150 0 1 2 3 4 5 0 1 element number c o u n ts �� b o u n d a r y p r o b . figure 1: top: simulated data (blue dots) and boundaries location probability (red line). bottom: regression curve and its error estimation (black dashed line and gray zone), and the rate curve used in simulation (blue line). we can see that the probability of having 2 breaking points is very high. figure 1 (top) also shows, in red line, the probability for the boundaries locations. applying code block 3, we obtain the regression [fig. 1 (bottom), black dashed curve] and its error estimation [fig. 1 (bottom), gray zone]. the continuous blue line in fig. 1 (bottom) indicates the rates used in simulation. the regression in the example above fits very well with the rate curve used in simulation. but sometimes regressions result qualitatively different to the rate curve, showing more or less breaking points, even for data simulated in the same conditions. this effect is due to chance. to show this issue, 2000 simulations with the same conditions were performed. in 992 of them, two breaking points were found. in the others, there were found zero (44), one (208), three (419), four (155), five (73), six (41), and seven or more (68) breaking points. for cases in which two breaking points were found, statistics of the most likely boundaries loca070018-3 papers in physics, vol. 7, art. 
070018 (2015) / d. j. r. sevilla figure 2: top: histogram of the boundaries locations. bottom: histogram of the in-segment mean rates. both figures were calculated for a set of 2000 simulations similar to that shown in fig. 1. tions and in-segment mean rates were calculated. figure 2 shows histograms of those statistics. figure 2 (top) shows histograms for boundaries locations. it is clear that the bigger the step, the smaller the uncertainty on its location. figure 2 (bottom) shows histograms of in-segment rates. it is clear that the greater the rate, the smaller its relative statistical error. figure 3 shows data and boundaries location probabilities for a simulation similar to the previous ones, but now with rates 3.0, 1.0 and 2.0. comparing fig. 1 (top) and fig. 3 (top), we can see that in the latter one the boundaries locations are found more accurately. again, 2000 simulations with the same conditions were performed. in 1159 of them, two breaking points were found, while in the others, there were found zero (1), one (33), three (499), four (167), five (75), six (32), and seven or more (34) breaking points. figure 4 shows histograms of the statistics of the most likely boundaries locations and in-segment rates for the simulations with two 0 50 100 150 0 1 2 3 4 5 6 7 8 0 1 element number c o u n ts �� b o u n d a r y p r o b . figure 3: top: simulated data (blue dots) and boundaries location probability (red line). bottom: regression curve and its error estimation (black dashed line and gray zone), and the rate curve used in simulation (blue line). breaking points found. comparing with fig. 2, we can see that the histograms are now narrower. these results confirm what was stated above. it is important to note that the probability for the real curve to be completely inside the region defined by the error estimations of the regression is significantly less than one. it is easy to see why: if the errors were independent and equal to the standard error, the probability of satisfying n error conditions simultaneously would be 0.68n. but even the actual probability could be lower, since it is clear that the errors must be dependent. nevertheless, the error estimations presented here are useful to get an idea of the accuracy of the regression. finally, the capability to detect a breaking point with this code was tested for different count rates. to do this, simulated data sets of a single step in the count rate were used. data sets consist in 100 poisson distributed elements, the first 50 for a rate r1 and the last 50 for a rate r2. 1000 simulations were 070018-4 papers in physics, vol. 7, art. 070018 (2015) / d. j. r. sevilla figure 4: top: histogram of the boundaries locations. bottom: histogram of the in-segment rates. both figures were calculated for a set of 2000 simulations similar to that shown in fig. 3. table 1: statistics of successful detections for a single step. r1\r2 3.0 2.0 1.6 1.2 0.8 0.4 0.86 0.74 0.70 0.63 0.46 0.8 0.77 0.70 0.60 0.36 1.2 0.69 0.61 0.30 1.6 0.65 0.28 2.0 0.60 done for each pair (r1,r2). statistics of successful detections are presented in table 1. a successful detection is considered when only one breaking point between elements 40 and 60 is detected. table 1 shows that the smaller is the mean rate difference and the smaller are the mean rates, the more difficult is the detection of the step. this result is expected because in a poisson distribution the variance is equal to the mean. iv. 
conclusions in this work, a code for bayesian regression of piecewise constant functions was adapted to handle data from poisson processes. for this purpose, equations for calculating the moments of the posteriors of segments of data were found through bayes theorem, considering the conjugate prior of the poisson distribution as prior. these results, as well as part of hutter’s method, were used to calculate the most probable number of segments and their boundaries. procedures for calculating in-segments mean rates and the uncertainties of mean rates and boundaries locations are also provided. the resulting method is summarized in a code in mathematica. the code was applied to simulated data. firstly, two examples with tree segments were analyzed. the code performed well in both cases considering the dispersion of data, and the results improved in the case of higher mean rates and mean rates differences. this occurs because of the statistical dispersion of poisson distributed data, which is greater than the mean rate if the mean rate is lower than one. finally, simulations of data of a single step were analyzed for different rates, and statistics of the regressions with only one breaking point are presented in a table. this table shows the effect of the rates and rate differences in the regression accuracy, and, together with the errors estimations provided by the code, can serve as an indicator of the reliability of the method. supplementary material including the source code for the algorithms can be found at the journal website [8]. acknowledgements this work was partially supported by the national university of rosario. [1] p c gregory, bayesian logical data analysis for the physical sciences, cambridge university press, cambridge, uk (2004). [2] g l bretthorst, lecture notes in statistics, springer, berlin (1988). 070018-5 papers in physics, vol. 7, art. 070018 (2015) / d. j. r. sevilla [3] p c gregory, t j loredo, a new method for the detection of a periodic signal of unknown shape and period, astrophys. j. 398, 146 (1992). [4] m hutter, exact bayesian regression of piecewise constant functions, bayesian analysis 2, 635 (2007). [5] j f lawless, regression methods for poisson process data, j. am. stat. assoc. 82, 399 (1987). [6] wolfram research inc., mathematica version 9.0, wolfram research, inc., champaign, illinois (2012). [7] a gelman, j b carlin, h s stern, d b rubin, bayesian data analysis, taylor & francis, uk (2014). [8] mathematica codes and examples by the author can be found at http://www.papersinphysics.org. 070018-6 papers in physics, vol. 1, art. 010007 (2009) received: 14 october 2009, accepted: 28 december 2009 edited by: a. c. mart́ı licence: creative commons attribution 3.0 doi: 10.4279/pip.010007 www.papersinphysics.org issn 1852-4249 parametric study of the interface behavior between two immiscible liquids flowing through a porous medium alejandro david mariotti,1∗ elena brandaleze,2† gustavo c. buscaglia3‡ when two immiscible liquids that coexist inside a porous medium are drained through an opening, a complex flow takes place in which the interface between the liquids moves, tilts and bends. the interface profiles depend on the physical properties of the liquids and on the velocity at which they are extracted. if the drainage flow rate, the liquids volume fraction in the drainage flow and the physical properties of the liquids are known, the interface angle in the immediate vicinity of the outlet (θ) can be determined. 
in this work, we define four nondimensional parameters that rule the fluid dynamical problem and, by means of a numerical parametric analysis, an equation to predict θ is developed. the equation is verified through several numerical assessments in which the parameters are modified simultaneously and arbitrarily. in addition, the qualitative influence of each nondimensional parameter on the interface shape is reported. i. introduction the fluid dynamics of the flow of two immiscible liquids through a porous medium plays a key role in several engineering processes. usually, though the interest is focused on the extraction of one of the liquids, the simultaneous extraction of both liquids is necessary. this is the case of oil production and of ironmaking. the water injection method used in oil production consists of injecting water back into the reservoir, usually to increase pressure ∗e-mail: mariotti.david@gmail.com †e-mail: ebrandaleze@frsn.utn.edu.ar ‡e-mail: gustavo.buscaglia@icmc.usp.br 1 instituto balseiro, 8400 san carlos de bariloche, argentina. 2 departamento de metalurgia, universidad tecnológica nacional facultad regional san nicolás, 2900 san nicolás, argentina. 3 instituto de ciências matemáticas e de computação, universidade de são paulo, 13560-970 são carlos, brasil. and thereby stimulate production. normally, just a small percentage of the oil in a reservoir can be extracted, but water injection increases that percentage and maintains the production rate of the reservoir over a longer period of time. the water displaces the oil from the reservoir and pushes it towards an oil production well [1]. in the steel industry, this multiphase phenomenon occurs inside the blast furnace hearth, in which the porous medium consists of coke particles. the slag and pig iron are stratified in the hearth and, periodically, they are drained through a lateral orifice. the understanding of this flow is crucial for the proper design and management of the blast furnace hearth [2]. in both examples above, when the liquids are drained, a complex flow takes place in which the interface between the liquids moves, tilts and bends. numerical simulation of multiphase flows in porous media is focused mainly in upscaling methods, aimed at solving for large scale features of interest in such a way as to model the effect of the small scale features [3–5]. other authors [6–8] use 010007-1 papers in physics, vol. 1, art. 010007 (2009) / a. d. mariotti et al. the numerical methods to model the complex multiphase flow that takes place at the pore scale. in this work, we numerically study the macroscopic behavior of the interface between two immiscible liquids flowing through a porous medium when they are drained through an opening. the effect of gravity on this phenomenon is considered. we define four nondimensional parameters that rule the fluid dynamical problem and, by means of a numerical parametric analysis, an equation to predict the interface tilt in the vicinity of the orifice (θ) is developed. the equation is verified through several numerical cases where the parameters are varied simultaneously and arbitrarily. in addition, the qualitative influence of each non-dimensional parameter on the interface shape is reported. ii. parametric study the numerical studies in this work were carried out by means of the program fluent 6.3.26. different models to simulate the two-dimensional parametric study were used. the volume of fluid (vof) method was chosen to treat the interface problem [9]. 
the drag force in the porous medium was modeled by means of the source term suggested by forchheimer [10]. the source term for the i-th momentum equation is

s_i = -\left( \frac{\mu}{\alpha} v_i + \frac{1}{2}\,\rho\, c\, |\vec{v}|\, v_i \right).   (1)

for the constants α and c in eq. (1), we use the values proposed by ergun [10]:

\alpha = \frac{\varepsilon^3 d^2}{150\,(1-\varepsilon)^2},   (2)

c = \frac{1.75\,(1-\varepsilon)}{\varepsilon^3 d},   (3)

where ε is the porosity, d is the particle equivalent diameter, ρ is the density, v is the velocity and µ is the dynamic molecular viscosity. considering that the subscripts 1 and 2 represent fluid 1 and fluid 2, respectively, three nondimensional parameters were considered in the parametric study: the viscosity ratio, µr = µ1/µ2, the density ratio, ρr = ρ1/ρ2, and the nondimensional velocity, vr = v0 ρ2 l/µ2, where v0 is the outlet velocity and l a reference length.

figure 1: sketch of the numerical 2d domain.

i. domain description

the numerical domain considered to carry out the parametric study was a two-dimensional one composed of the porous medium sub-domain and the outlet sub-domain. the porous sub-domain is a rectangle 10 m wide and 10 m tall. inside of it, a rigid, isotropic and homogeneous porous medium was arranged, with a porosity of 0.32 and a particle diameter of 0.006 m. for the outlet domain we use a rectangle 0.02 m wide and with a height l = 0.01 m, divided into two equal parts and located at the center of one of the lateral edges. the part located at the end of the outlet domain is used to impose the outlet velocity. figure 1 shows a complete description of the domain. a quadrilateral mesh with 2.2 × 10^4 cells was used, where the outlet sub-domain mesh consists of 200 elements in all the cases studied. as boundary conditions, on edge 1 we define a zero gauge pressure condition normal to the boundary and impose that only fluid 1 can enter the domain through it. on edge 2 the boundary condition is the same as on edge 1, but the fluid considered in this case is fluid 2. on edge 7 we impose a zero gauge pressure normal to the boundary, but in this case the fluids can only leave the domain. on the other edges (edges 4, 5, 6, and 8) we impose a wall condition where the normal and tangential velocities are zero, except for edge 3, at which the tangential velocity is free and the stress tangential to the edge is zero.

ii. interface evolution

to illustrate how an interface reaches the stationary position from an initially horizontal one, three sets of curves were obtained. figure 2 shows the interface evolution for the case without the gravity effect, when the interface initial position is below the outlet level. the interface modifies its tilt to reach the exit and it changes its shape to reach the stationary profile. figures 3 and 4 show the interface evolution when gravity is present and the interface initial position is below and above the outlet level, respectively.

figure 2: interface evolution when the initial position is below the outlet level, without gravity.

figure 3: interface evolution when the initial position is below the outlet, with gravity.

iii. viscosity effect

one of the most important parameters to modify is the dynamic viscosity of fluid 1. we maintain the properties of fluid 2 as those of water (density 998 kg/m^3 and dynamic viscosity 0.001 pa·s) and the density of fluid 1 as that of oil (850 kg/m^3). the dynamic viscosity of fluid 1 was varied from values smaller than those of fluid 2 to values much greater.
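before presenting those results, a quick numerical check of the porous-drag closure of eqs. (1)–(3) may be useful. the short python sketch below (function names and the example velocity are ours, purely illustrative) evaluates α, c and the source term for the porosity and particle diameter used in the parametric study (ε = 0.32, d = 0.006 m) and for fluid-2 properties taken as those of water. with these inputs, 1/α comes out close to 5.9 × 10^7 m^-2, consistent with the "b" porous medium listed later in table 2; the c tabulated there is about twice the value of eq. (3), apparently a difference of convention in how the 1/2 of eq. (1) is absorbed, so only 1/α is compared here.

import numpy as np

def ergun_coefficients(porosity, d_particle):
    """permeability alpha and inertial coefficient c from eqs. (2)-(3)."""
    eps = porosity
    alpha = eps**3 * d_particle**2 / (150.0 * (1.0 - eps)**2)   # eq. (2)
    c = 1.75 * (1.0 - eps) / (eps**3 * d_particle)              # eq. (3)
    return alpha, c

def momentum_source(v, mu, rho, alpha, c):
    """forchheimer source term of eq. (1) for a velocity vector v (per unit volume)."""
    v = np.asarray(v, dtype=float)
    return -(mu / alpha * v + 0.5 * rho * c * np.linalg.norm(v) * v)

if __name__ == "__main__":
    # illustrative values quoted in the text: eps = 0.32, d = 0.006 m, fluid 2 ~ water
    alpha, c = ergun_coefficients(0.32, 0.006)
    print("1/alpha = %.3e 1/m^2, c = %.3e 1/m" % (1.0 / alpha, c))
    print("s =", momentum_source([0.2, 0.0], mu=0.001, rho=998.0, alpha=alpha, c=c), "n/m^3")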
figure 4: interface evolution when the initial position is above the outlet, with gravity. two sets of curves were obtained, one considering the effect of gravity and the other without considering it. figure 5 shows the stationary interface profiles for the different values of viscosity, without gravity. a value of vr = 1.5 × 104 and ρr = 1.17 were chosen. it is possible to observe that, when fluid 1 has a viscosity higher than that of fluid 2, the interface profile is above the outlet and points downwards at the outlet. if fluid 1 has a lower viscosity, the opposite happens. when considering gravity, the value of vr was changed to 1 × 105 (v0 = 10m/s), since for smaller values the interface may not reach the outlet (this is 010007-3 papers in physics, vol. 1, art. 010007 (2009) / a. d. mariotti et al. figure 5: stationary interface profiles modifying the fluid 1 viscosity without the gravity effect. figure 6: stationary interface profiles modifying the fluid 1 viscosity with the gravity effect. later studied in fig. 10). figure 6 shows the curves obtained for this situation, where the interface only lies over the outlet level for the higher µr values. iv. outlet velocity effect v0 is varied from a small value, similar to the porous medium velocity (v0 = 0.2m/s or vr = 2000), to a very large one (v0 = 50m/s or vr = 5×105). maintaining the properties of fluid 2 similar to those of water, two sets of curves were obtained (µr > 1 and µr < 1), shown in figs. 7 and 8, respectively. when the effect of gravity was considered, two additional sets of curves (figs. 9 and 10) were obfigure 7: stationary interface profiles for several values of vr, without gravity, for µr > 1 (µr = 35). figure 8: stationary interface profiles for several values of vr, without gravity, for µr < 1 (µr = 0.01). tained. figure 7 shows the effect of vr when the viscosity of fluid 1 is greater than that of fluid 2, without gravity. it is possible to see that, as vr increases, the interface tilt at the outlet is maximal for vr = 1 × 105. on the other hand, fig. 8 shows the interface profiles when the viscosity of fluid 1 is smaller than that of fluid 2. we observe that as vr increases the interface tends to the horizontal position. figure 9 shows the different stationary interface 010007-4 papers in physics, vol. 1, art. 010007 (2009) / a. d. mariotti et al. figure 9: stationary interface profiles for several values of vr, with gravity, for µr > 1 (µr = 35). figure 10: stationary interface profiles for several values of vr, with gravity, for µr < 1 (µr = 0.01). positions when the gravity effect is present for µr > 1. the effect of gravity is quite significant, the interface ascends but only for the highest value of vr it lies above the outlet level. figure 10 shows the curves when the viscosity of fluid 1 is lower than that of fluid 2 (µr < 1). the behavior is different from that without gravity. in fact, there exists a minimum outlet velocity below which the interface does not reach the outlet. v. density effect the effect of the density ratio on the interface profile was studied in the presence of gravity. keeping figure 11: stationary interface profiles for several values of the density of fluid 1, with vr = 1.54 and µr = 35. fluid 2 with the properties of water and the viscosity of fluid 1 as 0.035pa.s (µr = 35), the density of fluid 1 was varied from its original value to one three times smaller than that of fluid 2. in fig. 
11 it is seen that as ρr increases, the interface profile ascends significantly, with a less significant change in the tilt angle at the outlet. iii. generic expression from the study on the influence of each nondimensional parameter on the interface behavior, an equation that predicts the interface angle at the immediate vicinity of the outlet (θ) was crafted. for practical reasons, the cases where gravity is present were considered to develop the equation. in sect. 2.2, it is possible to see that when the nondimensional parameters ρr, µr and vr are constant the interface changes its shape until it reaches a stationary profile. for this reason, a fourth nondimensional parameter is considered, the volume fraction of fluid 1 in the outlet flow (vf). a generic expression [(eq. (4)], consisting of three terms and containing 22 constants, was adjusted by trial and error until satisfactory agreement with the numerical results was found. 010007-5 papers in physics, vol. 1, art. 010007 (2009) / a. d. mariotti et al. a -29.1 i 1.162 × 107 p 0.1 b -0.04 j -0.18 q 0.72 c 0.1 k -1.64 r 2763.1 d 0.45 l -1 s 0.4 e 206 m 0.63 t -0.6 f 0.035 n 12.74 u -1.82 g -0.05 o 0.085 v 3.2 h -1.26 table 1: constant values in the generic expression. pt d ε 1/α c resistance a 0.005 0.3 10.88 × 107 1.81 × 104 very high b 0.006 0.32 5.88 × 107 1.21 × 104 high c 0.02 0.2 3 × 107 1.75 × 104 medium d 0.02 0.25 1.35 × 107 8400 low e 0.05 0.17 8.41 × 106 1.18 × 104 very low table 2: porous medium types. θ = αµbrv c rρ d r + eµfrv g rρ h r exp(−iµ j rv k r ρ l rv f m) + nµorv p rρ q r exp(−rµ s rv t rρ u rv f v) (4) table 1 shows the values of the constants in the generic expression. i. equation verification to verify that the generic expression (4) predicts the value of θ correctly when the parameters are arbitrarily modified, 21 additional numerical cases were simulated. these cases cover a wide range of physical properties of the liquids and of the characteristics of the porous medium (by means of the coefficients 1/α and c). the interface angle obtained from the generic expression (θge ) was compared with the interface angle obtained from the simulations (θs ). table 2 shows the five porous medium types (pt) that were chosen. some cases (1, 2, 4, 5, 9, 11, 17-21) were chosen based on the possible combinations of immiscible liquids that can be manipulated in real situations. for the remaining cases, the properties of the liquids were fixed at arbitrary values (fictitious liquids) so that a wide range of the nondimensional case fluid 1 fluid 2 µ1 µ2 ρ1 ρ2 1 heavy oil water solution 0.4 0.005 850 998 2 light oil water emulsion 0.012 0.06 850 998 3 – – 0.08 0.008 4000 4680 4 kerosene water 24 × 10−4 0.001 780 998 5 acetone water 3.3 × 10−4 0.001 791 998 6 – – 0.003 0.01 400 720 7 – – 0.2 0.01 1500 2700 8 – – 0.3 0.004 3300 5940 9 light slag hot pig iron 0.02 0.001 2800 7000 10 – – 0.5 0.013 500 1250 11 medium pig 0.4 0.005 2800 7000 slag iron 12-16 – – 0.08 0.008 4000 4680 17-21 heavy slag pig iron 0.4 0.005 2800 7000 table 3: cases description. 
case pt ρr µr vr vf 1 b 1.17 80 5 × 104 2.8 2 b 1.17 0.2 5 × 104 87.2 3 b 1.17 10 10 × 104 28.4 4 b 1.28 2.4 5 × 104 96.1 5 b 1.26 0.33 1 × 105 96.8 6 b 1.8 0.3 1 × 105 93.2 7 b 1.8 20 1 × 105 16.1 8 b 1.8 80 8 × 104 18.4 9 b 2.5 20 1 × 105 100.0 10 b 2.5 40 1 × 105 5.5 11 b 2.5 80 5 × 104 70.8 12 a 1.2 10.0 1 × 105 23.1 13 b 1.2 10.0 1 × 105 28.4 14 c 1.2 10.0 1 × 105 41.5 15 d 1.2 10.0 1 × 105 53.4 16 e 1.2 10.0 1 × 105 63.7 17 a 2.5 80.0 5 × 104 15.8 18 b 2.5 80.0 5 × 104 24.3 19 c 2.5 80.0 5 × 104 43.2 20 d 2.5 80.0 5 × 104 70.8 21 e 2.5 80.0 5 × 104 94.0 table 4: parameter values and pt for all cases described in table 3. parameters was covered. table 3 shows the description of each numerical case, while table 4 shows the corresponding nondimensional parameter values and pt. 010007-6 papers in physics, vol. 1, art. 010007 (2009) / a. d. mariotti et al. figure 12: comparison between the predictions of the generic expression and the numerical result for the 21 validation cases. we define an error (e = 100|∆θ|/180) as the percentage of the absolute value of the difference between the interface angles (∆θ = θs − θge ) divided by the interface angle range (180◦). figure 12 shows the comparison between the generic expression and the numerical cases. it is seen that the generic expression (4) predicts the interface angle, for the cases used in this study, with an error smaller than 10%. iv. conclusions a numerical study of the macroscopic interface behavior between two immiscible liquids flowing through a porous medium, when they are drained through an opening, has been reported. four nondimensional parameters that rule the fluiddynamical problem were identified. thereby, a numerical parametric analysis was developed where the qualitative observation of the resulting interface profiles contributes to the understanding of the effect of each parameter. in addition, a generic expression to predict the interface angle in the immediate vicinity of the outlet opening (θ) was developed. to verify that the generic equation predicts the value of θ correctly, 21 numerical cases with widely different parameters were simulated. considering that the cases encompass a large class of liquids and porous media, the prediction of θ within an error of 10% is considered satisfactory. acknowledgements a. d. m. and e. b. are grateful for the support from metallurgical department and deytema (utnfsrn). g. c. b. acknowledges partial financial support from cnpq and fapesp (brazil). [1] w c lyons, g j plisga, standard handbook of petroleum & natural gas engineering 2nd ed, gulf professional publishing, burlington (2005). [2] a k biswas, principles of blast furnace ironmaking: theory and practice, cootha publishing house, brisbane (1981). [3] a westhead, upscaling for two-phase flows in porous media, phd thesis: california institute of technology, pasadena, california (2005). [4] r e ewing, the mathematics of reservoir simulation, siam, philadelphia (1983). [5] m a cardoso, l j durlofsky, linearized reduced-order models for subsurface flow simulation, j. comput. phys. 229, 681 (2010). [6] m j blunt, flow in porous media porenetwork models and multiphase flow, curr. opin. colloid interface sci. 6, 197 (2001). [7] z chen, g huan, y ma, computational methods for multiphase flows in porous media, siam, philadelphia (2006). [8] y efendiev, t houb, multiscale finite element methods for porous media flows and their applications, appl. num. math. 57, 577 (2007). [9] fluent 6.3 user’s guide, fluent inc. (2006). 
[10] j bear, dynamics of fluids in porous media, dover publications inc., new york (1988). 010007-7 papers in physics, vol. 8, art. 080004 (2016) received: 30 october 2015, accepted: 4 february 2016 edited by: g. mart́ınez mekler reviewed by: j. mateos, departamento de sistemas complejos, instituto de f́ısica, universidad nacional autónoma de méxico, méxico. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.080004 www.papersinphysics.org issn 1852-4249 invited review: fluctuation-induced transport. from the very small to the very large scales g. p. suárez,1 m. hoyuelos,1∗ d. r. chialvo2 the study of fluctuation-induced transport is concerned with the directed motion of particles on a substrate when subjected to a fluctuating external field. work over the last two decades provides now precise clues on how the average transport depends on three fundamental aspects: the shape of the substrate, the correlations of the fluctuations and the mass, geometry, interaction and density of the particles. these three aspects, reviewed here, acquire additional relevance because the same notions apply to a bewildering variety of problems at very different scales, from the small nano or micro-scale, where thermal fluctuations effects dominate, up to very large scales including ubiquitous cooperative phenomena in granular materials. i. introduction much of the efforts devoted to particle transport were triggered by the famous challenge at very small scales presented by feynman in 1959: “a biological system can be exceedingly small. many of the cells are very tiny, but they are very active. (...) consider the possibility that we too can make a thing very small, which does what we want — that we can manufacture an object that maneuvers at that level!” [1]. at the scales discussed by feynman, our most usual notions of work, energy and transport seem to break down, including some counterintuitive obser∗e-mail: hoyuelos@mdp.edu.ar 1 instituto de investigaciones f́ısicas de mar del plata (ifimar conicet) and departamento de f́ısica, facultad de ciencias exactas y naturales, universidad nacional de mar del plata, deán funes 3350, 7600 mar del plata, argentina. 2 consejo nacional de investigaciones cient́ıficas y técnicas (conicet), godoy cruz 2290, buenos aires, argentina. vations. as discussed in these notes, these findings are not restricted to small scales since work in the last decades shows similar dynamics arising anytime there is a peculiar interplay of fluctuations, nonlinearity and correlations resulting in various classes of fluctuation-induced transport. to visualize the problem, consider a gedankenexperiment involving, for the sake of discussion, our desk. elementary physics explains how all the objects at the desk stay in place and/or which forces are needed to displace them. now, consider the imaginary case in which we progressively shrink all the objects up to a size of a few nanometers. it will be noticed that while at the natural scale objects remain steady without any energy expenditure, at the nanometers scale things move around, our “nano cell phone” which was quiet at the natural scale desk, moves and falls off the “nano desk. this exercise reminds us that at the brownian domain, energy would be required even to stay quiet since the basic macroscopic methods of controlling energy flow no longer remain valid. this nonintuitive phenomenon in the function of molecular machines was described by astumian as follows [2]: 080004-1 papers in physics, vol. 8, art. 
080004 (2016) / g. p. suárez et al. “any microscopic machine must either work with brownian motion or fight against it, and the former seems to be the preferable choice”. analogous observations, with some additional caveats due to inertial forces, can be made if instead of shrinking the mass, we apply an increasingly large external fluctuating field, making now our real size desk to shake around. this brief review is dedicated to discuss the essence of three elementary results of fluctuationinduced transport including the potential shape, the correlations of the fluctuations and the particle interactions and how they work, calling attention to some common lessons that can be borrowed from problems in apparently far apart scales and fields, from cellular biology to technological applications and applied physics. it should be noted that it is not our intention to cover the extent of the field, this is neither a fair, nor historically correct, exhaustive or updated review of the relevant literature; it only encompasses some interesting results which, in our opinion, warrant further exploration. the reader will find comprehensive reviews covering specific topics, including those on brownian motors in [3–7], on the more general subject of molecular motors in [2, 8–12], on a more biological perspective of molecular springs and ratchets in [13], or on a systematic analysis of the space-time symmetries of the equations in [14]. the paper is organized as follows. the next section revisits pioneer works on these types of problems, carried on a hundred years ago. next, we discuss the three fundamental aspects of the problem, including the substrate, the correlations of the fluctuations and the particle interactions. we start by briefly introducing the different realizations of fluctuation-induced transport as popularized two decades ago, i.e., in the so-called correlation ratchets. after that, the two elementary ways to break the symmetry are reviewed, either in the temporal or in the spatial aspects of the system, to conclude introducing yet another way to affect transport, the correlations born out of many particle interactions. the review closes with a discussion of some applications and new directions. figure 1: feynman’s imaginary microscopic ratchet, comprised by vanes, a pawl with a spring, two thermal baths at temperatures t1 > t2, an axle and wheel, and a load m. ii. smoluchowski-feynman’s ratchet as a heat engine feynman famous lectures [15] include an imaginary microscopic ratchet device to illustrate the second law of thermodynamics. the basic idea belongs to smoluchowski who discussed it during a conference talk in münster in 1912 (published as proceedingsarticle in ref. [16]). as seen in fig. 1, it consists of a ratchet, a paw and a spring, vanes, two thermal baths at temperatures t1 > t2, an axle and wheel, and a load. the ratchet is free to rotate in one direction, but rotation in the opposite direction is prevented by the pawl. the system is assumed small so that molecules of the gas at temperature t1 that collide with the vanes produce large fluctuations in the rotation of the axle. fluctuations are rectified by the pawl. the net effect is a continuous rotation of the axle that can be used to produce work by, for example, lifting a weight against gravity. 
the pawl becomes a materialization of maxwell’s demon, a small agent able to manipulate fluctuations at a microscopic level in order to violate the second law of thermodynamics, since in this case a given amount of heat is completely transformed into work. a closer inspection shows that such violation does not really take place. feynman demonstrated that, if t1 = t2, no net rotation of the axle is produced. the reason is that the pawl has its own thermal fluctuations that, from time to 080004-2 papers in physics, vol. 8, art. 080004 (2016) / g. p. suárez et al. u(x) x λ q   figure 2: typical ratchet potential u(x). time, allow a tooth of the ratchet to slip in the opposite direction. not even demons are free from thermal fluctuations. in order for the machine to work as intended, the pawl should be colder than the vanes, t2 < t1. but in this case, there is a heat flux between thermal baths. the mechanical link between vanes and ratchet through the axle implies that the baths are not thermally isolated [17], even when the materials are perfect insulators. the system performs as a heat engine: some work is generated while some heat is transferred from a cold reservoir to a hot reservoir. in summary, feynmann’s ratchet —and brownian motors— actually work, but without violating the laws of thermodynamics. iii. breaking the symmetry: time, space and interactions feynman’s deep thinking motivated an entire generation of models around the same idea. the model ratchet is a fluctuations-driven overdamped nonlinear dynamical system described by ẋ = −u′(x) + f(t), u(x) = u(x + λ), 〈f(t)〉 = 0, (1) where u(x) is a periodic potential, such as the one illustrated in fig. 2, and f(t) is zero-mean fluctuation of some type. in general, the initial theoretical problem is to find the stationary current density j = 〈ẋ(t)〉 in the ratchet given the statistical properties of the fluctuation f(t) and the shape of u(x), and to be able to determine the most efficient conditions for the transformation of fluctuations into a net current. multiple variations and extensions of the same problem were studied in the 90s resulting on a jargon of names such as on-off ratchets [8], fluctuating potential ratchets [18,19], temperature ratchets [20, 21], chiral ratchets [22–24], and so on. in any case, three elements are always present: a particle which eventually will execute some motion and two forces, one coming from the external applied field and another given by the particular shape of the potential (i.e., the substrate where the particle resides). thus, an isolated particle “feels” two forces, but while such information is available to an observer, it is important to realize that the particle has no way to distinguish or separate these sources. thus, the break of symmetry resulting in average directed motion of a particle could come from either spatial or temporal sources. yet, a third force needs to be considered in the cases in which the concentration of particles becomes relevant and then particles mutual interactions are not negligible anymore, an aspect crucial to understand flow in channels. we will consider all these cases in the following sections. i. asymmetries in the substrate figure 3 summarizes the two basic ways in which asymmetries in space (or some other degree of freedom of the system, such as phase [25]) contribute to noise-induced transport. the common situation involves an asymmetric periodic potential that breaks the spatial inversion symmetry combined with a temporal, zero mean, forcing periodicity. 
in panel a, the case of turning on and off the asymmetric potential is depicted, and panel b shows the case in which a tilting force is added.

figure 3: spatial asymmetry. panel a: the so-called flashing ratchet is a type of ratchet in which an asymmetric potential is periodically switched off and on. particles (green circles) diffuse evenly during the off period while the asymmetric potential favors the drift in one direction, producing a net transport to the left. panel b: in a rocked ratchet, an asymmetric potential is tilted periodically, determining a (right-directed) transport of the particle through the relatively lower value q − fλ2 of the right potential barrier.

the first important result was due to magnasco in [19], who considered the case of the piecewise linear potential u(x) shown in fig. 4a, which is exactly solvable [26] for slow fluctuations f(t), i.e., with a characteristic time much larger than the ratchet's relaxation time. the potential is periodic and extends to infinity in both directions. λ measures the spacing of the wells, λ1 and λ2 the inverse steepnesses of the potential in opposite directions out of the wells, and q the well depth. the particle undergoes overdamped brownian motion due to its coupling with a thermal bath of temperature t, and an external driving f(t) which represents the forces. these two ingredients compose what we called the fluctuation f(t). the expression for the current in the adiabatic limit, which measures the work done by the ratchet, was shown to be

j(f) = \frac{p_2^2 \sinh(\lambda f/2kt)}{kt\,(\lambda/q)^2\, p_3 - (\lambda/q)\, p_1 p_2 \sinh(\lambda f/2kt)},   (2)

p_1 = \delta + \frac{\lambda^2-\delta^2}{4}\,\frac{f}{q}, \qquad p_2 = \left[1-\frac{\delta f}{2q}\right]^2 - \left(\frac{\lambda f}{2q}\right)^2, \qquad p_3 = \cosh\!\left[\frac{q-\delta f/2}{kt}\right] - \cosh\!\left(\frac{\lambda f}{2kt}\right),

where λ = λ1 + λ2 and δ = λ1 − λ2. the average current, the quantity of primary interest, is given by

\bar{j} = \langle j \rangle = \frac{1}{\tau}\int_0^{\tau} j(f(t))\, dt,   (3)

where τ is the period of the driving force f(t), which is assumed longer than any other time scale of the system in this adiabatic limit. the current is maximized for a given value of the periodic forcing amplitude. interestingly, numerical computations showed robustness of the results when the forcing is not periodic. the key feature is that it should have a long time correlation. according to magnasco, "all that is needed to generate motion and forces in the brownian domain is loss of symmetry and substantially long time correlations" [19]. indeed, if the forcing is white noise, the system is at thermal equilibrium and j = 0. however, if the fluctuation auto-correlations are non-vanishing, i.e., for colored noise, the system is no longer in thermal equilibrium, and in general j ≠ 0. since the onset of a current means breaking the "right-left" symmetry, currents may only arise, in the case of additive noise, if the potential u(x) is asymmetric with respect to its extrema. it could be argued [27] that the emergence of current can be viewed as an example of "temporal order coming out of disorder", since the current is apparently time-irreversible, whereas stationary noise does not distinguish the "future" from the "past"; we notice, however, that eq. (1) implies relaxation and is thus time-irreversible itself. the flashing or pulsating ratchet depicted in panel a of fig. 3 was introduced in [28] and reintroduced in a more general theoretical context in [29].
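before turning to the flashing ratchet in more detail, a minimal python sketch of eqs. (2) and (3) is given here; it assumes the form of j(f) written above, the function names are ours, and the parameter values in the example (q = 1, λ = 1, a slow unbiased square wave ±f0) are illustrative rather than taken from ref. [19]. it reproduces the point made in the text: for a symmetric potential (λ1 = λ2) the rectified current averages to zero, while any spatial asymmetry gives a net current.

import numpy as np

def ratchet_current(f, lam1, lam2, q, kt):
    """adiabatic current j(f) of eq. (2) for the piecewise-linear potential of fig. 4a."""
    lam, delta = lam1 + lam2, lam1 - lam2
    p1 = delta + (lam**2 - delta**2) / 4.0 * f / q
    p2 = (1.0 - delta * f / (2.0 * q))**2 - (lam * f / (2.0 * q))**2
    p3 = np.cosh((q - delta * f / 2.0) / kt) - np.cosh(lam * f / (2.0 * kt))
    s = np.sinh(lam * f / (2.0 * kt))
    return p2**2 * s / (kt * (lam / q)**2 * p3 - (lam / q) * p1 * p2 * s)

def rectified_current(f0, lam1, lam2, q, kt):
    """eq. (3) evaluated for a slow, unbiased square wave f(t) = +/- f0 (adiabatic limit)."""
    return 0.5 * (ratchet_current(f0, lam1, lam2, q, kt)
                  + ratchet_current(-f0, lam1, lam2, q, kt))

if __name__ == "__main__":
    print(rectified_current(0.5, lam1=0.9, lam2=0.1, q=1.0, kt=0.1))  # asymmetric potential: nonzero
    print(rectified_current(0.5, lam1=0.5, lam2=0.5, q=1.0, kt=0.1))  # symmetric potential: ~0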
despite the huge structural complexity of biological brownian motors, the majority of the models are compatible with a simplified description based on the flashing ratchet. the description is in terms of only one variable x that may represent, for example, the position of a molecule or the coordinate of a complex reaction with many intermediate steps. the environment, composed of some aqueous solution, acts, on one hand, as a heat bath and, on the other hand, as a source or sink of atp, adp and pi molecules of the chemical reaction cycle that provides energy to the motor. in this simplified model, a periodic asymmetric potential is periodically turned on and off, as shown in fig. 3a. the situation is generalized to stochastic variations of the potential with a characteristic correlation time. as happens for the rocked ratchet, the current vanishes for zero correlation time, or fast pulsating limit (white noise). it also disappears in the slow pulsating limit, i.e., when the potential is left on or off for a diverging time. there is an optimum value of the correlation time that maximizes the current. a recent example of an experimental realization of a rocked ratchet at a mesoscopic scale can be found in [30], where dielectric particles suspended in water are affected by a ratchet potential given by a periodic and asymmetric light pattern.

ii. temporal asymmetries

figure 4b summarizes one type of ratchet in which the higher order statistics of the driving force can be responsible for the transport. indeed, the work of millonas [25, 31, 32] and others [33] showed that directed motion can be induced with an unbiased driving force, deterministic or stochastic, as long as it has asymmetric correlations: non-zero odd correlations of order higher than one [27]. the case analyzed in the seminal work of magnasco [19] only considered f(t) symmetric in time, f(t) = f(nτ − t). instead, the work of millonas [31] considered the same setting but studied a more general case in which the driving force is still unbiased, with zero mean 〈f(t)〉 = 0, but is asymmetric in time,

f(t) = \begin{cases} \left(\dfrac{1+\epsilon}{1-\epsilon}\right) a, & 0 \le t < \dfrac{\tau(1-\epsilon)}{2}, \ \mathrm{mod}\ \tau, \\ -a, & \dfrac{\tau(1-\epsilon)}{2} < t \le \tau, \ \mathrm{mod}\ \tau, \end{cases}   (4)

as shown in fig. 4b.

figure 4: panel a: the simplest piecewise ratchet potential, where the spatial degree of asymmetry is given by the parameter δ = λ1 − λ2. panel b: fluctuation's temporal asymmetry. the driving force f(t) preserves the zero mean 〈f(t)〉 = 0. the temporal asymmetry is given by the parameter ε.

in this case, the time-averaged current can be easily calculated,

\langle j \rangle = \frac{1}{2}(1+\epsilon)\, j(-a) + \frac{1}{2}(1-\epsilon)\, j\!\left(\frac{(1+\epsilon)\,a}{1-\epsilon}\right).   (5)

evaluating this for different values of the parameters, it was shown that the current is a peaked function both of kt (see fig. 5a) and of the amplitude a of the driving. as expected, the driving, the potential, and the thermal noise in fact play cooperative roles. for low temperatures, any transport depends on very large a values, while for large noise the features of the potential and of the driving are washed out. the most striking results concern the competition between the temporal asymmetry and the spatial asymmetry, as pictured in fig. 5b, resulting in the switching of the direction of the current as the asymmetry factor ε is varied. this reversal represents the competition between the spatial asymmetry, which dominates for small ε, and the temporal asymmetry, which dominates for large ε.
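the averaging in eq. (5) is easy to sketch numerically. the short python function below, with names of our own choosing, takes any current–force characteristic j(f) (for instance, the ratchet_current sketch given after eq. (3) above, or the toy odd saturating response used here purely for illustration) and returns 〈j〉 for the asymmetric square wave of eq. (4). it makes the interplay concrete: a purely linear response averages the zero-mean driving to nothing, whereas a nonlinear response rectifies it.

import math

def average_current_asymmetric(j, a, eps):
    """time-averaged current of eq. (5) for the zero-mean, temporally asymmetric
    square wave of eq. (4): -a during a fraction (1+eps)/2 of the period and
    +a(1+eps)/(1-eps) during the remaining fraction (1-eps)/2."""
    return (0.5 * (1.0 + eps) * j(-a)
            + 0.5 * (1.0 - eps) * j((1.0 + eps) * a / (1.0 - eps)))

if __name__ == "__main__":
    linear = lambda f: f                       # linear response: no rectification
    toy_nonlinear = lambda f: math.tanh(f)**3  # illustrative odd, saturating response
    print(average_current_asymmetric(linear, a=1.0, eps=0.5))         # exactly 0.0
    print(average_current_asymmetric(toy_nonlinear, a=1.0, eps=0.5))  # nonzero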
080004-5 papers in physics, vol. 8, art. 080004 (2016) / g. p. suárez et al. 〈j〉 〈j〉 figure 5: temporally asymmetric fluctuations with mean zero can optimize or reverse the current in the ratchet of fig. 4 (with q = 1,λ = 1). panel a corresponds to the case of a symmetric potential (i.e., δ = 0) which shows a peaked function of the net current 〈j〉 as a function of temperature kt for � = 1 and three values of the driving amplitude a = 1,a = 0.8,a = 0.5 labeled (a), (b) and (c), respectively. panel b shows 〈j〉 versus the temporal asymmetry parameter � for three asymmetries in the shapes of the potential. curve (a) is for a symmetric potential, (i.e., δ = 0) and those labeled (b) and (c) for two cases of asymmetric potentials δ = −0.3 and δ = −0.7, respectively (with kt = 0.01, and a = 2.1) temporal asymmetry and spatial asymmetry relate to the problem of nonequilibrium transport in precisely the same way. in both cases, a net effect arises due to an interplay between the strength of a fluctuation, the time it acts, and the underlying dynamics. in the case of a spatial asymmetry, a fluctuation to the right with a given strength which lasts a given time will tend to take the system over the right-hand barrier while the same fluctuation with sign reversed does not lift it over the left-hand barrier. in the case of temporal asymmetry, the probabilities of the fluctuations to the right or to left are different, so the net effect arises in the absence of spatial asymmetries. what both of them show is that even a subtle asymmetry in the shape of the potential or in the shape of the spectral properties of the noise will give rise to an effect even when the net force due to each vanishes. the time asymmetry of the mean zero fluctuations discussed above can be cast in several different ways. dichotomic noise (a type of “kuboanderson” process) was used to demonstrate phase transport in a pair of josephson junctions [25]. there are also types of continuous noise exhibiting similar asymmetry, including shot noise (common in quantum electronics) which are of this type. mean zero shot noise, which is temporally asymmetric, can be produced if the frequency and amplitude distribution are slightly different for positive and negative fundamental pulses. another trivial example of temporally asymmetric driving force is a simple bi-harmonic signal which constitutes a curiosity since it results from adding two (zero mean symmetric) periodic process of harmonic frequencies. iii. particle interactions the previous discussions were limited to the cases in which an isolated or a few particles were present in the potential. as the concentration is increased, interaction among particles becomes relevant, and it can be the cause of a reduction, and even of a reversal [35–38] of the current. we present two examples. a. vortex current in a 2d array of josephson junctions current reversal has been experimentally observed in a two dimensional array of josephson junctions [39]. it was numerically analyzed in [40]. a ratchet potential for vortices is generated by modulating the gap between superconducting islands. the density of vortices is controlled by an external magnetic 080004-6 papers in physics, vol. 8, art. 080004 (2016) / g. p. suárez et al. figure 6: diffusion in a periodic channel with asymmetric cavities when a force to the right is applied; a(x) is the witdth of the channel. parameters: total average concentration c = 0.5; force β af = 0.5, where a is the lattice spacing; lx = ly = 100 a. field. 
there is a repulsive vortex-vortex interaction. the results show a preferred direction of the vortex motion, parallel to the ratchet modulation, when an alternating force is applied. but as the vortex concentration is increased, this direction is reversed for appropriate values of the periodic forcing intensity. (vortex current reversal is also observed for a fixed value of the concentration when the periodic forcing, or ac current amplitude, is varied, see fig. 4 in [39] or fig. 2 in [40]). the vortex current reversal produced by the increase of concentration is a consequence of the following symmetry [39]. let us consider that the external magnetic field is such that positive vortices are produced. for small concentration (frustration parameter between 0 and 1/2), we have a small discrete number of positive vortices. for large concentration (frustration between 1/2 and 1), we can consider that there is a background of positive vortices in which some negative vortices move. but the movement of this negative vortices is in the opposite direction. for them, the ratchet potential is inverted so the rectification effect of the ratchet is inverted too. b. particle diffusion in a channel with asymmetric cavities the same effect is observed in a different context. in the next paragraphs, we refer to the hard core interaction between particles that diffuse in a channel with a transverse section a(x) that has a ratchet shape, see fig. 6. an external periodic forcing is applied in the direction of the channel. there is a particle-hole symmetry. but before going into the interaction effects, let us consider the low concentration regime, where interactions can be neglected. several interesting experiments have been performed with particles suspended in a liquid and contained in a channel qualitatively as the one shown in fig. 6. there are basically two ways to apply the periodic external forcing. in one case, a periodic variation of the pressure is used: particles are drifted back and forth by the movement of the liquid; see [41, 42] and the critical report [43] (cavities of order 5 µm). in the other case, the liquid remains still and the force is directly applied on the particles by an external field as, for example, an electric field on charged particles [44] (cavities of order 50 µm). such a system has been proposed for separation of particles of different size [45]. the idea is based on the difference between rectification effects for different size particles. when a periodic —unbiased— forcing is applied, particles move in the forward direction because of the ratchet; but, in general, larger particles move faster than smaller ones. now we apply a bias, a constant force in the backward direction that reduces the velocity of the larger particles and reverses the velocity of the smaller ones. then we have that larger particles end up in one extreme of the channel and smaller particles in the opposite one, with an estimated purity of 99.997 % according to the authors of [45]. 080004-7 papers in physics, vol. 8, art. 080004 (2016) / g. p. suárez et al. figure 7: rectified particle current ∆i against average particle concentration c for different force amplitudes. units of ∆i: d/a2, where d is the diffusion coefficient and a is the lattice spacing; units of f: (βa)−1. lower left inset: scheme of the channel composed by an array of triangular cavities. 
the fick–jacobs equation [46] gives an appropriate description of the particle density in the channel as long as its dependence on the transverse direction, y, can be neglected, i.e., n(x,y,t) ≃ n(x,t). let us consider the transverse integral of the concentration, ρ(x,t) = ∫ dy n(x,y,t) ≃ a(x) n(x,t) (the two-dimensional channel can be easily extended to a three-dimensional tube). if d is the diffusion coefficient and f(t) is the total applied force, the fick–jacobs equation is

$$\frac{\partial \rho(x,t)}{\partial t} = \frac{\partial}{\partial x}\left[ D \left( \frac{\partial \rho}{\partial x} + \beta \frac{\partial H}{\partial x}\,\rho \right) \right], \qquad (6)$$

where ∂h/∂x = −f(t) − β⁻¹ (d/dx) ln a(x) and β⁻¹ = kt. the expression β⁻¹ ln a(x) is called the entropic potential, due to the similarity with the thermodynamic relation among energy, free energy, temperature and entropy, h = u − ts, with ∂u/∂x = −f(t). in a first approximation, the diffusion coefficient is constant; a further refinement considers a dependence on a′(x), see [47].

now, let us consider the hard-core interaction between particles of the same size diffusing in a lattice. a jump of a particle to the right is equivalent to a jump of a hole to the left. a concentration c of particles subjected to a force f is equivalent to a concentration 1 − c of holes subjected to a force −f. this symmetry is the cause of the shape of fig. 7, where monte carlo results for the rectified current ∆i against the average concentration c are plotted for different values of the forcing amplitude [48]. a square wave in the low-frequency limit was used for the applied force; in this limit, the rectified current is equal to the difference between the current for the force in the positive phase and the current for the force in the negative phase. the particle-hole symmetry is evident in the figure: changing c → 1 − c and ∆i → −∆i (a consequence of the change f → −f), we recover the same curves. let us note the current reversal at large concentration; it is the same effect that was mentioned in the previous section for the vortex current in 2d arrays of josephson junctions.

figure 8: particle concentration against longitudinal position x for one cavity of the channel depicted in fig. 6, in a stationary state (cavity length normalized to 1). dots correspond to monte carlo simulations. the curve is obtained from numerical integration of (7). concentration c = 0.5; more details in [49].

a description based on the fick–jacobs equation is also possible for particles with hard-core interaction [49]. its derivation starts from the non-linear fokker–planck equation for fermions [50], where the pauli exclusion principle plays the role of the hard-core interaction. following the same steps used for the derivation of the linear fick–jacobs equation [47], we arrive at the following non-linear version:

$$\frac{\partial n(x,t)}{\partial t} = \frac{1}{A}\,\frac{\partial}{\partial x}\, D A \left( \frac{\partial n}{\partial x} - \beta F\, n\,(1 - n) \right). \qquad (7)$$

the non-linear term, n(1 − n), is responsible for the interaction. fig. 8 shows the numerical integration of (7), in good agreement with monte carlo simulation results; a minimal numerical sketch of eq. (7) is given below.

iv. engineers knew it...

the principles discussed above, ruling the correlation ratchets, have important technological applications at the macroscopic scale. of course, some of them were applied even before a detailed statistical understanding was available. vibratory conveyors, or vibratory bowl feeders, are regularly used in many branches of industry such as food processing, synthetic materials or small-parts assembly mechanics, to mention just a few [51].
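as an aside to the channel-transport description of the previous section, the non-linear fick–jacobs equation (7) can be integrated with a simple conservative explicit finite-volume scheme, driving the system to a steady state for a constant force of either sign; the width profile a(x), the grid and all parameter values below are illustrative choices, not those of refs. [48, 49]. the rectified current in the slow square-wave limit is taken as the average of the two steady currents.

```python
import numpy as np

def channel_width(x):
    # asymmetric, sawtooth-like width profile a(x) with period 1 (illustrative choice)
    xm = x % 1.0
    return np.where(xm < 0.7, 0.1 + 0.9 * xm / 0.7, 0.1 + 0.9 * (1.0 - xm) / 0.3)

def steady_current(betaF, n_mean, N=100, D=1.0, dt=2e-5, steps=100_000):
    """integrate eq. (7) to a steady state and return the particle current
    for a constant reduced force betaF (finite-volume, periodic boundary)."""
    dx = 1.0 / N
    xc = (np.arange(N) + 0.5) * dx            # cell centers
    xf = np.arange(N) * dx                    # cell faces
    A_c, A_f = channel_width(xc), channel_width(xf)
    n = np.full(N, n_mean)                    # occupation fraction, 0 <= n <= 1
    J = np.zeros(N)
    for _ in range(steps):
        # flux J = -D A (dn/dx - betaF n (1 - n)) evaluated at the cell faces
        dndx = (n - np.roll(n, 1)) / dx
        n_f = 0.5 * (n + np.roll(n, 1))
        J = -D * A_f * (dndx - betaF * n_f * (1.0 - n_f))
        # dn/dt = -(1/A) dJ/dx ; this update conserves the total content sum(A n dx)
        n -= dt / (A_c * dx) * (np.roll(J, -1) - J)
    return J.mean()

if __name__ == "__main__":
    betaF = 2.0
    for c in (0.3, 0.7):
        Jp = steady_current(+betaF, n_mean=c)
        Jm = steady_current(-betaF, n_mean=c)
        # the rectified currents at c and 1 - c should be (nearly) opposite
        print(f"c = {c}: rectified current (J+ + J-)/2 = {(Jp + Jm) / 2:+.4e}")
```

the two printed values should be nearly equal and opposite, which is precisely the particle-hole symmetry behind the shape of fig. 7.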
the conveying speed of these devices has been studied theoretically and experimentally, for example, in [52] and references cited therein. there are many parameters involved in the operation of a vibratory conveyor: amplitude, frequency and mode of the vibrations, inclination angle and friction coefficient are only some of them. a classification in terms of the vibratory modes is as follows: sliding (linear horizontal vibration), ratcheting (linear vertical vibration) or throwing (circular, elliptical or linear tilted vibration). for throw conveyors, the material being transported loses contact with the trough during part of the cycle, see fig. 9a. this mode is appropriate for granular materials or small objects: particles are forced to perform repeated short flights in a preferred direction, combined with rest and slide phases. the other two types, sliding and ratcheting, involve temporal and spatial asymmetries, respectively.

the sliding type of vibratory conveyor transports material over a deck that vibrates back and forth with asymmetric motion (see fig. 5a) in the horizontal direction. the particle or object moves relative to the deck due to alternating stick and slip steps driven by the asymmetric oscillations, as shown in fig. 9b.

figure 9: three types of vibratory conveyors which share some of the principles of small-scale ratchets. panel a: throwing conveyor with linear tilted vibration. panel b: sliding conveyor; transport is induced by asymmetric horizontal oscillations with zero mean, of the kind shown in fig. 4b. panel c: ratchet conveyor with vertical oscillations, similar to the flashing ratchet of fig. 3a.

the ratchet conveyor achieves transport of granular material using vertical vibrations [38, 53]. directed motion is caused by the broken spatial symmetry of the deck's surface, given by a sawtooth-shaped profile, see fig. 9c. the ratchet conveyor shares qualitative features with the flashing or pulsating ratchet depicted in fig. 3a; one difference is that it includes a ballistic flight phase.

v. conclusions

spatial or temporal asymmetries, or both, are able to generate directed motion in the presence of fluctuations. in addition to a thermal bath, fluctuations with a correlation time that is large compared with the characteristic relaxation time of the system should be included. during the last decades, simple models based on these ideas have provided a deeper understanding of the complex biological machinery at the nano scale. this success stimulated the study of ratchets in a wide variety of contexts, and at larger scales. interactions among the transported particles are relevant at high concentrations; most noticeably, they may produce an inversion of the expected direction of motion.

vibrations inducing directed motion have been used in industry for the transport of small —macroscopic— objects since, at least, around 1950. vibratory conveyors applied the qualitative features of ratchets, with space or time asymmetries, before a detailed theoretical understanding was available. half a century ago, feynman called attention to the fact that, in his view, "there's plenty of room at the bottom" [1]. we can safely conclude that, even today, there is plenty of room at the top as well.

acknowledgements

this work was partially supported by consejo nacional de investigaciones científicas y técnicas (conicet, argentina).

[1] r p feynman, there's plenty of room at the bottom, caltech's engineering & science magazine, pasadena (1960).
[2] r d astumian, making molecules into motors, sci. am. 285, 57 (2001). [3] p reimann, brownian motors: noisy transport far from equilibrium, phys. rep. 361, 57 (2002). [4] p reimann, p hänggi, introduction to the physics of brownian motors, appl. phys. a: mater. sci. process. 75, 169 (2002). [5] r d astumian, p hänggi, brownian motors, phys. today 55, 33 (2002). [6] j m r parrondo, b j de cisneros, energetics of brownian motors: a review, appl. phys. a 75, 179 (2002). [7] p hänggi, f marchesoni, artificial brownian motors: controlling transport on the nanoscale, rev. mod. phys. 81, 387 (2009). [8] f jülicher, a ajdari, j prost, modeling molecular motors, rev. mod. phys. 69, 1269 (1997). [9] m schliwa, g woehlke, molecular motors, nature 422, 759 (2003). [10] w r browne, b l feringa, making molecular machines work, nature nanotech. 1, 25 (2006). [11] m von delius, d a leigh, walking molecules, chem. soc. rev. 40, 3656 (2011) [12] d chowdhury, stochastic mechano-chemical kinetics of molecular motors: a multidisciplinary enterprise from a physicist’s perspective, phys. rep. 529, 1 (2013). [13] l mahadevan, p matsudaira, motility powered by supramolecular springs and ratchets, science 288, 95 (2000). [14] s denisov, s flach, p hänggi, tunable transport with broken space-time symmetries, phys. rep. 538, 77 (2014). [15] r p feynman, r b leighton, m sands, the feynman lectures on physics, addisonwesley, ma (1966). [16] m r von smoluchowski, experimentell nachweisbare derüblichen thermodynamik widersprechende molekularphänomene, physik. zeitschr. 13, 1069 (1912). [17] j m r parrondo, p español, criticism of feynman’s analysis of the ratchet as an engine, am. j. phys. 64, 1125 (1996). [18] r d astumian, m bier, fluctuation driven ratchets: molecular motors, phys. rev. lett. 72, 1766 (1994). [19] m o magnasco, forced thermal ratchets, phys. rev. lett. 71, 1477 (1993). [20] p reimann, r bartussek, r häussler, p hänggi, brownian motors driven by temperature oscillations, phys. lett. a 215, 26 (1996). [21] j d bao, directed current of brownian ratchet randomly circulating between two thermal sources, physica a 273, 286 (1999). 080004-10 papers in physics, vol. 8, art. 080004 (2016) / g. p. suárez et al. [22] z c tu, z c ou-yang, a molecular motor constructed from a double-walled carbon nanotube driven by temperature variation, j. phys.: condens. matter 16, 1287 (2004). [23] z c tu, x hu, molecular motor constructed from a double-walled carbon nanotube driven by axially varying voltage, phys. rev. b 72, 033404 (2005). [24] m van den broeck, c van den broeck, chiral brownian heat pump, phys. rev. lett. 100, 130601 (2008). [25] m m millonas, d r chialvo, nonequilibrium fluctuation-induced phase transport in josephson junctions, phys. rev. e 53, 2239 (1996). [26] the straightforward techniques can be found earlier in h. risken, the fokker-planck equation, springer-verlag (2nd ed.) (1984). [27] m m millonas, self-consistent microscopic theory of fluctuation-induced transport, phys. rev. lett. 74, 10 (1995). [28] a l r bug, b j berne, shaking-induced transition to a nonequilibrium state, phys. rev. lett. 59, 948 (1987). [29] a ajdari, j prost, mouvement induit par un potentiel periodique de basse symmetrie: dielectrophorese pulsee, c. r. acad. sci. paris sér. ii 315, 1635 (1992). [30] a v arzola, k volke-sepúlveda, j l mateos, experimental control of transport and current reversals in a deterministic optical rocking ratchet, phys. rev. lett. 106, 168104 (2011). 
[31] d r chialvo, m m millonas, asymmetric unbiased fluctuations are sufficient for the operation of a correlation ratchet, phys. lett. a 209, 26 (1995). [32] m m millonas, d r chialvo, control of voltage-dependent biomolecules via nonequilibrium kinetic focusing, phys. rev. lett. 76, 550 (1996). [33] m c mahato, a m jayannavar, synchronized first-passages in a double-well system driven by an asymmetric periodic field, phys. letters a 209, 21 (1995). [34] m m millonas, d a hanck, nonequilibrium response spectroscopy and the molecular kinetics of proteins, phys. rev. lett. 80, 401 (1998). [35] m kostur, j luczka, multiple current reversal in brownian ratchets, phys. rev. e 63, 021101 (2001). [36] s savel’ev, f marchesoni, f nori, stochastic transport of interacting particles in periodically driven ratchets, phys. rev. e 70, 061107 (2004). [37] baoquan ai, liqiu wang, lianggang liu, transport reversal in a thermal ratchet, phys. rev. e 72, 031101 (2005). [38] i derényi, p tegzes, t vicsek, collective transport in locally asymmetric periodic structures, chaos 8, 657 (1998). [39] d e shalóm, h pastoriza, vortex motion rectification in josephson junction arrays with a ratchet potential, phys. rev. lett. 94, 177001 (2005). [40] v i marconi, rocking ratchets in twodimensional josephson networks: collective effects and current reversal, phys. rev. lett. 98, 047006 (2007). [41] c kettner, p reimann, p hänggi, f müller, drift ratchet, phys. rev. e 61, 312 (2000). [42] s matthias, f müller, asymmetric pores in a silicon membrane acting as massively parallel brownian ratchets, nature (london) 424, 53 (2003). [43] k mathwig, f müller, u gosele, particle transport in asymmetrically modulated pores, new j. of phys. 13, 033038 (2011). [44] c marquet, a buguin, l talini, p silberzan, rectified motion of colloids in asymmetrically structured channels, phys. rev. lett. 88, 168301 (2002). [45] d reguera, a luque, p s burada, g schmid, j m rub́ı, p hänggi, entropic splitter for particle separation, phys. rev. lett. 108, 020604 (2012). 080004-11 papers in physics, vol. 8, art. 080004 (2016) / g. p. suárez et al. [46] j h jacobs, diffusion processes, pag. 68, springer, new york (1967). [47] r zwanzig, diffusion past an entropy barrier, j. phys. chem. 96, 3926 (1992). [48] g p suárez, m hoyuelos, h martin, transport in a chain of asymmetric cavities: effects of the concentration with hard-core interaction, phys. rev. e 88, 052136 (2013). [49] g p suárez, m hoyuelos, h martin, transport of interacting particles in a chain of cavities: description through a modified fickjacobs equation, phys. rev. e 91, 012135 (2015). [50] t d frank, nonlinear fokker-planck equations, pag. 280, springer, berlin (2005). [51] c a kruelle, a gotzendorfer, r grochowski, i rehberg, m rouijaa, p walzel, granular flow and pattern formation on a vibratory conveyor, in traffic and granular flow ’05, eds. a schadschneider, t poschel, r kuhne, m schreckenberg, d. e. wolf, pag. 111, springerverlag, berlin, heidelberg (2007). [52] e m sloot, n p kruyt, theoretical and experimental study of the transport of granular materials by inclined vibratory conveyors, powder technology 87, 203 (1996). [53] z farkas, p tegzes, a vukics, t vicsek, transitions in the horizontal transport of vertically vibrated granular layers, phys. rev. e 60, 7022 (1999). 080004-12 papers in physics, vol. 7, art. 070005 (2015) received: 20 november 2014, accepted: 29 march 2015 edited by: c. a. condat, g. j. 
sibona licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070005 www.papersinphysics.org issn 1852-4249 what season suits you best? seasonal light changes and cyanobacterial competition g. cascallares,1, 2∗ p. m. gleiser1, 2† nearly all living organisms, including some bacterial species, exhibit biological processes with a period of about 24 h called circadian (from the latin circa, about and dies, day) rhythms. these rhythms allow living organisms to anticipate the daily alternation of light and darkness. experiments carried out in cyanobacteria have shown the adaptive value of circadian clocks. in these experiments, a wild type cyanobacterial strain (with a 24 h circadian rhythm) and a mutant strain (with a longer or shorter period) grow in competition. in different experiments, the external light dark cycle was chosen to match the circadian period of the different strains, revealing that the strain whose circadian period matches the light-dark cycle has a larger fitness. as a consequence, the initial population of one strain grows while the other decays. these experiments were made under fixed light and dark intervals. in nature, however, this relationship changes according to the season. therefore, seasonal changes in light could affect the results of the competition. using a theoretical model, we analyze how modulation of light can change the survival of the different cyanobacterial strains. our results show that there is a clear shift in the competition due to the modulation of light, which could be verified experimentally. i. introduction circadian rhythms, oscillations with approximately 24 h period in many biological processes, are found in nearly all living organisms. until the mid-1980s, it was thought that only eukaryotic organisms had a circadian clock, since it was assumed that an endogenous clock with a period of τ = 24 h would not be useful to organisms that often divide more rapidly [1]. however, in 1985, several research groups discovered that in cyanobacteria there was ∗email: mgcascallares@gmail.com †email: gleiser@cab.cnea.gov.ar 1 consejo nacional de investigaciones cient́ıficas y técnicas. 2 centro atómico bariloche, 8400 san carlos de bariloche, ŕıo negro, argentina. a daily rhythm of nitrogen fixation [2–4]. huang and co-workers were the first to recognize that a strain of synechococcus, a unicellular cyanobacterium, had circadian rhythms [5]. this transformed synechococcus in one of the simplest models for studying the molecular basis of the circadian clock. the ubiquity of circadian rhythms suggests that they confer an evolutionary advantage. the adaptive functions of biological clocks are divided into two hypotheses. the external advantage hypothesis supposes that circadian clocks allow living organisms to anticipate predictable daily changes, such as light/dark, so they can schedule their biological functions like feeding and reproduction at appropriate times. in contrast to this hypothesis, it has been suggested that circadian clocks confer adaptive benefit to organisms through temporal 070005-1 papers in physics, vol. 7, art. 070005 (2015) / g. cascallares et al. coordination of their internal physiology (intrinsic advantage) [6]. in this case, the circadian clock should be of adaptive value in constant conditions as well as in cyclic environments. in order to study if circadian clocks provide evolutionary advantages, woelfle and co-workers tested the relative fitness under competition between various strains of cyanobacteria [7]. 
they carried out experiments where a wild-type strain (τ = 25 h) of cyanobacteria and mutant strains, with shorter (τ = 22 h) and longer (τ = 30 h) periods, were subjected to grow in competition with each other under light-dark (ld) cycles of different periodicity. they found that the strain which won the competition was the one whose free-running period matched closely the period of the ld cycle. this difference in fitness was observed despite the fact that the growth rates were not significantly different when each strain was grown with no competition. also, mutant strains could outcompete wild-type strains under continuous light (ll) conditions, suggesting that endogenous rhythms are advantageous only in rhythmic environments [7]. this study provided one of the most convincing evidence so far in support of fitness advantages of synchronization between the endogenous period and the period of environmental cycles. ouyang et al. [8] suggested an explanation for fitness differences: this could be due to competition for limiting resources or secretion of diffusible factors that inhibit the growth of other cyanobacterial strains. roussel et al. [9] proposed mathematical models in order to test which of these hypotheses was more plausible. they found that the model based on mutual inhibition was consistent with the experimental observations of [8]. in this model, the mechanism of competition between cells involves the production of a growth inhibitor, which is produced only during the subjective day (sl) phase, and that growth occurs only in the light phase. each of the experiments and computational simulations mentioned before had equal amounts of light and dark exposure. however, in nature the relationship is not constant, and the duration of sunlight in a day changes according to the season and the latitude. the circadian system has to adapt to day length variation in order to have a functional role in optimizing seasonal timing and generating the capacity to survive at different latitudes [10]. using this as a motivation, we will test how day length variation plays a role in the competition between different strains of cyanobacteria. ii. the model for modeling the growth of each cyanobacterial strain, we use the model introduced by gonze et al. [11] that is based on a diffusible inhibitor with a light sensitive oscillator to represent the cellular circadian oscillator. the evolution equations of cell population ni and the level of inhibitor i are: dni dt = kini(1 − n∑ j=1 nj), di dt = n∑ i=1 ni(pi − vmaxi km + i ) (1) ki = { k in l and i in sl or i < ic 0 otherwise, pi = { p if i in sl 0 otherwise. (2) in these equations, ni is the number of cells of strain i, ki is the growth rate of each strain, p measures the rate of inhibitor production, vmax is the maximum rate and km is the michaelis constant characterizing the enzymatic degradation of the inhibitor. following the work of gonze et al., we use a modified version of the van der pol oscillator to produce sustained circadian oscillations [11]. this simple mathematical model can describe circadian oscillations. dx dt = 24 π 12 (xc + 0.13( x 3 + 4 3 x3 − 256 105 x7) + b(t)), dxc dt = 24 π 12 (−x( 24 τx )2 + b(t) 3 xc − 0.15b(t)x) (3) with b(t) = (1 − x 3 )0.39ρ0.23, ρ = { 5 in light phase (l), 0 in dark phase (d). (4) 070005-2 papers in physics, vol. 7, art. 070005 (2015) / g. cascallares et al. figure 1: following the schematic explanation of gonze et al. [11], we show how the model works. 
the inhibitor i is produced in 3, during the sl phase, and it is degraded during the entire day. each strain grows in 1, if its sd phase overlaps with l and i < ic, and in 2, when its sl phase overlaps with l. these phenomenological equations produce oscillations in the circadian variable x with a period close to τx. xc is a complementary variable and b(t) represents the coupling between the external ld cycle and the circadian oscillator, which depends on the light intensity ρ. we use this simple and phenomenological model because the detailed mechanisms, both at molecular and population level, have remained unknown in synechococcus elegantus. for example, it is still unclear which phospho-state of kaic, one of the clock genes identified in this organism, is involved in activation or suppression due to inconsistent reports. there are different mathematical models proposed for the molecular mechanisms of synechococcus, but none of them has been experimentally verified [14]. we chose the model by gonze et al [11] since the parameters on their equations are in agreement with the experimental data obtained by woelfle et al. [7]. these values are vmax = 1000, k = 1.8, ic = 0.01, p = 500 and km = 0.05. in fig. 1, we present a schematic plot of the model that shows how the growth of each population is coupled with the circadian oscillator. as it can be seen in this figure, when we modify the length of the light (l) phase then the overlaps 1 and 2 change, so the growth of each strain is altered affecting the competition. this effect is due to the fact that each strain grows always when its subjective phase sl overlaps the external light phase (2), jan feb mar apr may jun jul aug sep oct nov dec 21 th day of each month 4 6 8 10 12 14 16 18 d a y le n g th ( h ) quito, ecuador jujuy, argentina ushuaia, argentina figure 2: day length over the course of 2012 at different latitudes in south america. and when sd overlaps l only if the inhibitor is below a threshold ic (1). the inhibitor is secreted in the sl phase and then it is degraded. iii. results in many organisms, a photoperiodic response is reflected in a physiological change. photoperiodic responses are common among organisms from the equator to high latitudes and have been observed in different types of organisms, from arthropods to plants. diapause (a suspension of development done by insects), migration and gonadal maturation are examples of these annual changes controlled by photoperiod. these biological processes are triggered as soon as the day length reaches certain duration, known as the critical photoperiod. even near the equator, where day length changes are very small through the whole year, they are used to synchronize reproductive activities with annual events. in fig. 2, we show how day length varies depending on the latitude [13]. the figure compares the day length on the twenty-first day of each month in three cities in south america. we show quito (ecuador), which sits near the equator in latitude 0◦15′, jujuy (argentina), which is located near the tropic of capricorn in latitude 24◦01′ and ushuaia, the southernmost city in argentina which 070005-3 papers in physics, vol. 7, art. 070005 (2015) / g. cascallares et al. a b figure 3: (a) the outcome of competition between wild-type (continuous line) and long-period mutant (dashed line) shows coexistence between the two strains for t = 28 h. (b) coexistence ends after ≈ 8 days of adding 3 minutes of light-time per day. 
the long-period mutant can win competition as the photoperiod now is closer to its free-running period. lies in latitude 54◦48′. we can see that during the equinoxes, all places receive 12 hours of daylight [13]. even when changes in photoperiod are not constant over the year and depend on the season, we decide to modify the amount of light per day with fixed steps in order to simplify the equations. we simulate the seasonal fluctuations in day length by adding or subtracting minutes of lighttime every day to the external ld cycle. for example, if we add 12 minutes of light per day in a ld12:12 cycle, after five days the external cycle of ld has 13 hours of light and 11 of darkness, since the total amount of hours per day is constant. we initiate competition between equal fractions of wild-type strain (τx = 25h) and long-period mutant (τx = 30h) and equal amounts of light and darkness. in order to mimic experiments in cultures that were diluted and sampled every 8 days [7], we dilute the culture after 8 light-dark cycles by dividing by a factor of 100 the variables n1, n2 and i. first, we analyzed the case in which the experiments of [7] showed a phase of coexistence. for t = 28, the period of the ld cycle has an intermediate value between the free-running period (frp) of the two strains and both strains can coexist for a b figure 4: effect of modulation in light time (30 minutes per day) on the outcome of competition between strain 1 (wild-type, τx = 25 h) and strain 2 (long-period mutant, τx = 30 h) carried out in two different conditions: (a) ld12:12 and (b) ld15:15. fraction of strain 1 (left panel) and strain 2 (right panel) are shown as a function of time; red with modulation, so the proportion of l and d changes after each day, and blue with fixed cycles. a long time. however, when we allowed the days to become longer and the nights shorter, after some days the coexistence was broken, as we show in fig. 3. this is due to the increase in the amount of light hours that benefits the long-period mutant. in fig. 4, we show in the left panel the fraction of cells belonging to the wild-type strain as a function of time with fixed ld cycles (red continuous curve) and in the case in which we added 30 minutes of light time each day (blue dashed curve). the first days both curves are similar, but from the second day on, the long-period strain has a frp closest to the period of the ld cycle. the fraction of wildtype strain starts to decrease and is out-competed. in the right panel, we show the corresponding fraction of long-period mutant cells in the same cases. our results show a new way to test competition experimentally. testing results would be very simple, since the cultures do not need to be diluted. in fact, as shown in fig. 4, after eight days of competition between the two strains, a difference of about ten percent in the two populations would be needed 070005-4 papers in physics, vol. 7, art. 070005 (2015) / g. cascallares et al. 0 10 20 30 time (days) 0 0,2 0,4 0,6 0,8 1 n 1 /( n 1 + n 2 ) mutant wild type figure 5: competition between long-period mutant and wild-type strains in a ld12:12 cycle for the same parameters as in fig. 4, but adding 12 minutes per day the light time. a crossover is observed after 8 days. in order to verify the theoretical prediction. starting from this simple test, we looked for non trivial effects in a longer experiment. we found an interesting effect that can be observed in fig. 5. in this simulation, we added 12 minutes of light time each day. 
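to make the simulation protocol described above concrete, the following is a minimal sketch of the population equations (1) and (2): logistic growth gated by the overlap rules of fig. 1, a diffusible inhibitor, dilution by a factor of 100 every 8 days, and a day length that grows by a fixed number of minutes per day. two simplifying assumptions are made: the light-sensitive oscillator of eq. (3) is replaced by a free-running phase clock for each strain (so entrainment is not captured, and the sketch illustrates the structure of the simulation rather than reproducing figs. 3–5), and the quoted rate constants are treated as per-hour values, since no units are attached to them in the text. the stiff inhibitor equation is advanced with a simple semi-implicit step to keep it positive and stable.

```python
import numpy as np

# rate constants as quoted in the text (treated as per-hour values; see above)
K, P, VMAX, KM, IC = 1.8, 500.0, 1000.0, 0.05, 0.01

def in_subjective_light(t, tau):
    # stand-in for the circadian oscillator of eq. (3): a free-running clock of
    # period tau, taken to be in its subjective-light (sL) half for the first tau/2 h
    return (t % tau) < 0.5 * tau

def run(tau=(25.0, 30.0), minutes_per_day=12.0, days=32, dt=0.01):
    n = np.array([0.05, 0.05])          # equal initial fractions of the two strains
    inhibitor, t = 0.0, 0.0
    wild_type_fraction = []
    for day in range(days):
        day_length = 12.0 + day * minutes_per_day / 60.0     # hours of light today
        for _ in range(int(24.0 / dt)):
            light = (t % 24.0) < day_length
            sL = np.array([in_subjective_light(t, tau_i) for tau_i in tau])
            # growth when (L and sL) or (L and sD and inhibitor < Ic); see fig. 1
            k = np.where(light & (sL | (inhibitor < IC)), K, 0.0)
            p = np.where(sL, P, 0.0)
            n += dt * k * n * (1.0 - n.sum())
            # semi-implicit (positivity-preserving) step for the stiff inhibitor eq.
            prod = np.sum(n * p)
            loss = np.sum(n) * VMAX / (KM + inhibitor)
            inhibitor = (inhibitor + dt * prod) / (1.0 + dt * loss)
            t += dt
        if (day + 1) % 8 == 0:          # dilute the culture every 8 light-dark cycles
            n /= 100.0
            inhibitor /= 100.0
        wild_type_fraction.append(n[0] / n.sum())
    return wild_type_fraction

if __name__ == "__main__":
    frac = run()
    for day in (1, 8, 16, 32):
        print(f"day {day:2d}: wild-type fraction = {frac[day - 1]:.3f}")
```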
in the first days, the growth was as expected. the wild-type strain could outcompete the long period mutant strain, since the external cycle was ld12:12. but after eight days, when the day was longer than 13 hours, a crossover was observed. the mutant strain started to win the competition because its endogenous period was closer now to the external cycle. this effect could also be tested experimentally but in this case the dilution of the culture every 8 days would be needed. iv. conclusions the mechanisms underlying the enhancement of reproductive fitness remain still unknown. despite the fact that numerous models have been tested, each has some evidence that supports it and none can be excluded at this time [12]. in this work, we used a diffusible inhibitor model, so our predictions in the growth rates changes could be useful to test the validity of this mechanism. our study is motivated by fluctuations in the day length throughout the year which are reflected in organisms behavior. we studied how these fluctuations affect the competition between different strains of cyanobacteria. we found non-trivial effects which could be tested experimentally. in the first case, we determined the composition of two strains under competition after eight days when the light is modulated. the prediction of these numerical simulations can be tested in a simple experiment where no dilution is needed. we also propose a second experiment where dilution in the cultures is necessary, which allows for a non trivial effect such as a crossover to be observed. [1] c h johnson, s s golden, m ishiura, t kondo, circadian clocks in prokaryotes, mol. microbiol. 21, 5 (1996). [2] l j stal, w e krumbein nitrogenase activity in the non-heterocystous cyanobacterium oscillatoria sp. grown under alternating lightdark cycles, arch. microbiol. 143, 67 (1985). [3] n grobbelaar, t c huang, h y lin, t j chow, dinitrogen-fixing endogenous rhythm in synechococcus rf-1, fems microbiol. lett. 37, 173 (1986). [4] a mitsui, s kumazawa, a takahashi, h ikemoto, t arai, strategy by which nitrogenfixing unicellular cyanobacteria grow photoautotrophically, nature 323, 720 (1986). [5] t c huang, t j chow, characterization of the rhythmic nitrogen-fixing activity of synechococcus rf-1 at the transcription level, curr. microbiol. 20, 23 (1990). [6] k m vaze, v k sharma, on the adaptive significance of circadian clocks for their owners, chronobiol. int. 30, 413 (2013). [7] m a woelfle, y ouyang, k phanvijhitsiri, c h johnson, the adaptative value of circadian clocks: an experiment assesment in cyanobacteria, curr. biology 14, 1481 (2004). [8] y ouyang, c r andersson, t kondo, s golden, c h johnson, resonating circadian clocks enhance fitness in cyanobacteria, pnas 95, 8660 (1998). 070005-5 papers in physics, vol. 7, art. 070005 (2015) / g. cascallares et al. [9] m roussel, d gonze, a goldbeter, modelling the differential fitness of cyanobacterial strains whose circadian oscillators have different freerunning periods: comparing the mutual inhibition and substrate depletion hypotheses, j. theor. biol. 205, 321 (2000). [10] r a hut, g m beersma, evolution of timekeeping mechanisms: early emergence and adaptation to photoperiod, phil. trans. r. soc. b 366, 2141 (2011). [11] d gonze, m roussel, a goldbeter, a model for the enhancement of fitness in cyanobacteria based on resonance of a circadian oscillator with the external light-dark cycle, j. theor. biol. 214, 577 (2002). 
[12] p ma, m a woelfle, c h johnson, an evolutionary fitness enhancement conferred by the circadian system in cyanobacteria, chaos, solitons & fractals 50, 65 (2013). [13] http://www.timeanddate.com/worldclock /sunrise.html [14] s hertel, c brettschneider, i m axmann, revealing a two-loop transcriptional feedback mechanism in the cyanobacterial circadian clock, plos comput biol. 9, e1002966 (2013). 070005-6 papers in physics, vol. 6, art. 060001 (2014) received: 19 december 2013, accepted: 20 january 2014 edited by: p. weck reviewed by: j. p. marques, departamento de f́ısica, centro de f́ısica atómica, fac. de ciências, universidade de lisboa, portugal. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060001 www.papersinphysics.org issn 1852-4249 experimental determination of l x-ray fluorescence cross sections for elements with 45 ≤ z ≤ 50 at 10 kev e. v. bonzi,1, 2∗ g. b. grad,1† r. a. barrea3‡ synchrotron radiation at 10 kev was used to experimentally determine the ll, lα, lβi , lβii , lγi and lγii fluorescence cross sections for elements with 45 ≤ z ≤ 50, as part of an ongoing investigation at low energies. the measured data were compared with calculated values obtained using coefficients from scofield, krause and puri et al. i. introduction this work is part of a systematic investigation on elements with 45 ≤ z ≤ 50, which has been carried out at different energies [1–3]. the l x-ray cross sections were measured with monoenergetic excitation beam at 10 kev. we report cross sections for each spectral line, according to the resolution of the si(li) solid state detector used to resolve individual component lines of the spectral emission. the experimental cross sections were grouped considering the transitions scheme, the energy of the emission lines and the detector resolution. in general, the fluorescence cross sections obtained in this work show the same trend with z and broad agreement with the data published by puri et al. [4, 5] and krause [6, 7], calculated using ∗e-mail: bonzie@famaf.unc.edu.ar †e-mail: grad@famaf.unc.edu.ar ‡e-mail: rbarrea@depaul.edu 1 facultad de matemática, astronomı́a y f́ısica, universidad nacional de córdoba. ciudad universitaria. 5000 córdoba, argentina. 2 instituto de f́ısica enrique gaviola (conicet), 5000 córdoba, argentina. 3 physics department, depaul university, chicago, il 60614, usa. scofield’s coefficients [8, 9]. ii. experimental condition the measurements were carried out at the xray fluorescence beam line at the national synchrotron light laboratory (lnls), campinas, brazil [10]. the components of the experimental setup were: • silicon (111) channel cut double crystal monochromator, which can tune energies between 3 and 30 kev. the energy resolution is 3·10−4 to 4·10−4 between 7 and 10 kev. • a si(li) solid state detector, 5 mm thick and 5 mm in diameter, with a resolution of 170 ev at 5.9 kev and a 0.0127 cm thick beryllium window. the model introduced by jaklevic and giauque [11] was used to obtain the detector efficiency. • the whole setup is mounted on a motorized lift table, which allows the vertical positioning of the instruments within the linearly polarized part of the beam. • to limit the beam size, a motorized computer controlled set of vertical and horizontal 060001-1 papers in physics, vol. 6, art. 060001 (2014) / e. v. bonzi et al. slits (located upstream and downstream of the monochromator) was used. 
a set of foil samples (rhodium, palladium, silver, cadmium, indium and tin) was used to determine the l fluorescence cross sections of these elements. the foil samples were provided by alfa products inc., with a certified purity of over 99%. the foil thicknesses are shown in bonzi et al. (see table i) [2]. k emission lines of chlorine, calcium, titanium and iron were measured to determine the geometrical and detector efficiency factors.

the kα and lα fluorescent spectra were measured by collecting 2·10⁵ net counts for each element, in order to have the same statistical counting error in all measured spectra. a system dead time lower than 1% was established by measuring the fluorescence emission of a ti sample and adjusting the slit at the exit of the monochromator. all samples were measured with the same slit aperture. unwanted effects, such as pile-up, were avoided with this configuration, and the geometric factors were ensured to be the same for all samples. this configuration made it unnecessary to carry out corrections for count losses, spectral distortions or modifications of the geometrical arrangement.

iii. spectra analysis

the energy of the emission lines tabulated by scofield [8, 9] and the detector resolution were considered to group the l x-ray fluorescence lines. this line arrangement was used to fit the l spectrum, where the lβ and lγ compound lines have been labeled with a roman subscript according to the most intense contributing line, with its corresponding atomic transition:

• ll = l3 − m1,
• lα = l3 − m5 + l3 − m4,
• lβi = l2 − m4 + l1 − m2 + l1 − m3 + l3 − n1,
• lβii = l3 − n5 + l3 − o4 + l3 − o5 + l3 − o1 + l1 − m5 + l1 − m4 + l3 − n4,
• lγi = l2 − n4,
• lγii = l1 − n2 + l1 − n3 + l1 − o2 + l1 − o3.

the background radiation was fitted using a second-order polynomial. the area of the fluorescence peaks was determined as the average of the areas obtained from fits using hypermet and gaussian functions. the escape peaks were fitted using a gaussian function. as a consequence of the excitation with a linearly polarized photon beam, the contribution to the background was very low: the linear polarization of the incident beam produces negligible scattered radiation at 90° with respect to the incident beam direction. the detector is located at the same height as the storage ring.

iv. data analysis

the expression for the experimental l fluorescence cross sections is [13]

$$\sigma_{L_i}(E_0) = \frac{I_{L_i}}{I_0\,G\,\epsilon(E_{L_i})\;T(E_0,E_{L_i})}, \qquad (1)$$

where σ_{li}(e₀) = experimental li fluorescence cross section of the element observed at the energy e₀, with li = ll, lα, lβi, lβii, lγi or lγii; i_{li} = measured intensity of the li spectral line; i₀gε(e_{li}) = factor comprising the intensity of the excitation beam i₀, the geometry of the experimental arrangement g and the detector efficiency ε(e_{li}); e₀ = energy of the incident beam, in this case 10 kev; e_{li} = energy of the li spectral line, with data obtained from scofield [8]; and t(e₀,e_{li}) = correction factor for self-absorption in an infinitely thick sample, which is

$$T(E_0,E_{L_i}) = \left( \frac{\mu(E_0)}{\sin\theta_1} + \frac{\mu(E_{L_i})}{\sin\theta_2} \right)^{-1}, \qquad (2)$$

where µ(e) = mass absorption coefficient of the sample at energy e, from hubbell and seltzer [14], and θ1 and θ2 = incidence and take-off angles, equal to 45° in the current setup. in these measurements, all the samples were considered as infinitely thick for x-ray fluorescence.

the factor i₀gε(e) was calculated using the following expression
$$I_0\,G\,\epsilon(E_{K_i}) = \frac{I_{K_i}}{\sigma^{\omega F}_{K_i}(E_0)\;C_i\;T(E_0,E_{K_i})}, \qquad (3)$$

where i_{ki} = measured intensity of the k spectral line; c_i = weight concentration of the element of interest in the sample; σ^{ωf}_{ki}(e) = k fluorescence cross section of the element observed at energy e, defined as σ^{ωf}_{ki}(e₀) = σ_{ki}(e₀)·ω_k·f_k, with σ_{ki}(e) = k-shell photoionization cross section for the given element at the excitation energy e, from scofield [8], ω_k = k-shell fluorescence yield, from krause [6, 7], and f_k = fractional emission rate for kα or kβ x-rays, from khan and karimi [15], defined as

$$F_{K\alpha} = \left[ 1 + \frac{I_{K\beta}}{I_{K\alpha}} \right]^{-1}; \qquad F_{K\beta} = \left[ 1 + \frac{I_{K\alpha}}{I_{K\beta}} \right]^{-1}; \qquad (4)$$

t(e₀,e_{ki}) = correction factor for self-absorption in the sample; e₀ = energy of the incident beam; and e_{ki} = energy of the k spectral line for a given element, from scofield [8].

the factor i₀gε(e) was previously determined in ref. [2], where the same geometry and detector were used. because of this, the energy dependence of i₀gε(e) is already known and only a scale factor is needed to obtain the correct beam intensity. four targets, cl (nacl), ca (cahpo4·2h2o), ti (ti foil) and fe (fe foil), emitting fluorescent x-rays in the range from 2.4 kev to 7.0 kev, were used to determine the scale factor in this work. four kα and four kβ lines were used to fit the scale factor. jaklevic and giauque's [11] model was used to fit the detector efficiency.

v. results and discussion

the l x-ray cross section values obtained in our fluorescence experiment and the theoretical values calculated using coefficients given by scofield [8, 9], puri et al. [4] and krause [7] are shown in table 1 and figs. 1 to 6. puri et al. predicted theoretical coster–kronig and fluorescence values using ab initio relativistic calculations, while krause's values of ω_k, ω_{li} and f_{ij} were obtained by fitting experimental and theoretical compiled data. in krause's tables, the theoretical data were calculated for singly ionized free atoms, while the experimental data contain contributions from solid-state, chemical and multiple-ionization effects.

figure 1: comparison of ll cross sections.

figure 2: comparison of lα cross sections.

the lα cross section values are in better agreement with the theoretical values when the intensity peaks are fitted with a hypermet function instead of a gaussian function. this happens because the hypermet function has a tail on the left side that increases the fitted area. moreover, the tail of the hypermet function used to fit the lα peaks diminishes the area, and hence the cross sections, of the ll peaks accordingly.

element (z) |           | ll     | lα       | lβi      | lβii    | lγi    | lγii
rh (45)     | this work | 13 ± 4 | 373 ± 13 | 163 ± 9  | 50 ± 5  | 22 ± 4 | 7 ± 2
            | puri      | 16     | 438      | 194      | 30      | 11     | 7
            | krause    | 15     | 396      | 212      | 27      | 11     | 10
pd (46)     | this work | 14 ± 2 | 450 ± 15 | 233 ± 10 | 43 ± 7  | 25 ± 3 | 11 ± 2
            | puri      | 19     | 518      | 234      | 43      | 17     | 8
            | krause    | 17     | 454      | 249      | 38      | 16     | 11
ag (47)     | this work | 19 ± 2 | 506 ± 13 | 280 ± 16 | 59 ± 6  | 28 ± 3 | 14 ± 3
            | puri      | 22     | 597      | 277      | 55      | 22     | 10
            | krause    | 19     | 516      | 295      | 48      | 21     | 14
cd (48)     | this work | 20 ± 3 | 579 ± 10 | 371 ± 17 | 70 ± 6  | 32 ± 2 | 15 ± 2
            | puri      | 26     | 686      | 328      | 70      | 29     | 11
            | krause    | 22     | 598      | 351      | 62      | 28     | 17
in (49)     | this work | 23 ± 2 | 641 ± 19 | 447 ± 17 | 83 ± 5  | 37 ± 2 | 23 ± 2
            | puri      | 30     | 795      | 386      | 89      | 37     | 13
            | krause    | 26     | 689      | 413      | 77      | 36     | 20
sn (50)     | this work | 18 ± 4 | 579 ± 18 | 454 ± 17 | 164 ± 5 | 45 ± 2 | 33 ± 3
            | puri      | 29     | 765      | 599      | 94      | 51     | 38
            | krause    | 25     | 676      | 582      | 83      | 48     | 39

table 1: experimental and theoretical l x-ray fluorescence cross sections in barns/atom at 10 kev. experimental data (this work), theoretical values calculated using scofield [8] and puri [4], and semiempirical coefficients obtained from scofield [8] and krause [6].

figure 3: comparison of lβi cross sections.

the experimental ll cross sections show a similar z trend to the data obtained using the krause and puri et al. values; nevertheless, our results are in general lower than those. the lα experimental fluorescence cross section data, fig. 2, agree well with krause's values, although for elements with higher z the experimental values are slightly lower than those from krause. they are even lower than puri et al.'s values, while still showing the same trend with z.

the lβi experimental fluorescence cross sections show very good agreement with the theoretical values when the hypermet function is used to fit the area (see fig. 3). the lβii measured cross sections show a similar dependence on z as both theoretical data sets (fig. 4). the lβi sn experimental value is lower than the data presented by either puri et al. or krause, and the lβii sn experimental value is much higher than both theoretical values. this behavior might be due to the fitting process, as both spectral lines are very close in energy; the lβii intensity seems to be overestimated while the lβi intensity seems to be underestimated. a similar behavior is observed for the rhodium experimental data, although the differences with the theoretical values are much smaller than those for tin.

the experimental lγi fluorescence cross sections show some differences with the z trend of the theoretical data: in the lower z range, the experimental values are higher than the theoretical ones, while for higher z this difference becomes smaller. the sn values, z = 50, show a different behavior, being lower than both calculated values (see fig. 5).

figure 4: comparison of lβii cross sections.

figure 5: comparison of lγi cross sections.

the lγii experimental values show the general z trend of the values presented by krause and puri et al. the experimental values are sometimes higher or lower than the theoretical ones, but the range of values is similar (see fig. 6).

to determine the uncertainties of the experimental cross sections, the propagation of errors was carried out on eq. (1).
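a minimal sketch of this propagation, assuming the relative uncertainties of the measured intensity, the i₀gε(e) factor and the self-absorption correction t are independent and therefore add in quadrature; the numerical values in the example call are illustrative only, not measured quantities.

```python
import math

def cross_section(I_line, I0Ge, T):
    """experimental L-subshell cross section from eq. (1): sigma = I / (I0 G eps T)."""
    return I_line / (I0Ge * T)

def relative_uncertainty(rel_I, rel_I0Ge, rel_T):
    """quadrature combination of independent relative uncertainties of the factors."""
    return math.sqrt(rel_I**2 + rel_I0Ge**2 + rel_T**2)

if __name__ == "__main__":
    # illustrative numbers only (not measured values)
    I_line, I0Ge, T = 5.0e4, 1.3e2, 0.9
    rel_I, rel_I0Ge, rel_T = 0.06, 0.02, 0.04   # e.g. peak-area, scale-factor, absorption
    s = cross_section(I_line, I0Ge, T)
    ds = s * relative_uncertainty(rel_I, rel_I0Ge, rel_T)
    print(f"sigma = {s:.1f} +/- {ds:.1f}  (relative {ds / s:.1%})")
```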
the uncertainty values are in general around 6–10%, and below 40% in the case of the ll line. the uncertainty associated with the i₀gε(e) factor was estimated as the mean quadratic deviation of the experimental values (≤ 2%). for the factor t(e₀,e_{li}), a propagation of errors was carried out assuming a 3% error in the values of the mass absorption coefficients and a 2% error in the sine of the angles due to sample positioning errors. krause's ω_k values for elements with 45 ≤ z ≤ 50 have an estimated error of 1%. the uncertainties of the peak areas were established as half the difference between the areas obtained using gaussian and hypermet fitting functions; these uncertainties were the main contribution to the experimental errors of the cross sections.

figure 6: comparison of lγii cross sections.

vi. conclusions

in this investigation, the l x-ray fluorescence cross sections of a group of elements with 45 ≤ z ≤ 50 were measured using a synchrotron radiation source with a monoenergetic beam at 10 kev. the polarization properties of the monoenergetic excitation beam and the high resolution of the detector system allowed us to reduce the scattered radiation, thereby obtaining a better signal-to-noise ratio and a better accuracy for the experimental cross sections. the cross sections of the ll, lα, lβi, lβii, lγi and lγii lines were measured considering a more detailed grouping than the usual sets.

table 1 shows the comparison between the experimental fluorescence cross section values and the theoretical values calculated using coefficients from scofield [8, 9], puri et al. [4] and krause [6]. our experimental values are in general in good agreement with the data calculated using scofield's [8, 9] and krause's [6] coefficients. the l cross sections present uncertainties around 6–10%, and the less intense ll peaks show uncertainties that in some cases come close to 40%, with the fitting uncertainty being the most important error source. the use of the hypermet function is very convenient for fitting the lα and lβ peaks (see table 1).

the solid state detector used in our experiments does not have enough energy resolution to resolve each spectral line; a higher-resolution detection system would be desirable in order to analyze each spectral line separately. the coster–kronig coefficients present large fluctuations in this atomic range, and that is the cause of the observed discrepancies.

acknowledgements

this work was carried out under grants provided by secyt u.n.c. (argentina). research partially supported by lnls, national synchrotron light laboratory, brazil.

[1] e v bonzi, r a barrea, experimental l x-ray fluorescence cross sections for elements with 45 ≤ z ≤ 50 at 7 kev by synchrotron radiation photoionization, x-ray spectrom. 34, 253 (2005).

[2] e v bonzi, n m badiger, g b grad, r a barrea, r g figueroa, measurement of l x-ray fluorescence cross sections for elements with 45 ≤ z ≤ 50 using synchrotron radiation at 8 kev, nucl. instrum. meth. b 269, 2084 (2011).

[3] e v bonzi, n m badiger, g b grad, r a barrea, r g figueroa, l x-ray fluorescence cross sections experimentally determined for elements with 45 ≤ z ≤ 50 at 9 kev, appl. radiat. isotopes 70, 632 (2012).
[4] s puri, d mehta, b chand, n singh, p n trehan, l shell fluorescence yields and costerkronig transition probabilities for the elements with 25 ≤ z ≤ 96, x-ray spectrom. 22, 358 (1993). [5] s puri, b chand, d mehta, m l garg, s nirmal, p n trehan, k and l shell x-ray fluorescence cross sections, atom. data nucl. data 61, 289 (1995). [6] m o krause, atomic radiative and radiationless yields for k and l shells, j. phys. chem. ref. data 8, 307 (1979). [7] m o krause, c w nestor, c j sparks, e ricci, x-ray fluorescence cross sections for k and lrays of the elements, oak ridge national laboratory, report 5399 (1978). [8] j h scofield, theoretical photoionization cross sections from 1 to 1500 kev, lawrence livermore national laboratory, report 51326 (1973). [9] j h scofield, relativistic hartree slater values for k and l x-ray emission rates, atom. data nucl. data 14, 121 (1974). [10] c a perez, m radtke, h tolentino, f c vicentin, r t neuenshwander, b brag, h j sanchez, m rubio, m i s bueno, i m raimundo, j r rohwedder, synchrotron radiation x-ray fluorescence at the lnls: beamline instrumentation and experiments, x-ray spectrom. 28, 320 (1999). [11] j m jaklevic, r d giauque, handbook of x-ray spectrometry: methods and techniques, eds. r van grieken, a markowicz, marcel dekker, new york (1993). [12] m c leypy, j plagnard, p stemmler, g ban, l beck, p dhez, si(li) detector efficiency and peak shape calibration in the low energy range using synchrotron radiation, x-ray spectrom. 26, 195 (1997). [13] d v rao, r cesareo, g e gigante, l x-ray fluorescence cross sections of heavy elements excited by 15.20, 16.02, 23.62 and 24.68 kev photons, nucl. instrum. meth. 83, 31 (1993). [14] j h hubbell, s m seltzer, tables of x-ray mass attenuation coefficients and mass-energy absorption coefficients from 1 kev to 20 mev for elements z=1 to 92 and 48 additional substances of dosimetric interest, nistir, report 5632 (1995). [15] md r khan, m karimi, kβ/kα ratios in energy dispersive x-ray emission analysis, x-ray spectrom. 9, 32 (1980). 060001-6 papers in physics, vol. 3, art. 030001 (2011) received: 12 august 2010, accepted: 18 february 2011 edited by: g. c. barker reviewed by: b. blasius, icbm, university of oldenburg, germany. licence: creative commons attribution 3.0 doi: 10.4279/pip.030001 www.papersinphysics.org issn 1852-4249 sir epidemics in monogamous populations with recombination damián h. zanette1∗ we study the propagation of an sir (susceptible–infectious–recovered) disease over an agent population which, at any instant, is fully divided into couples of agents. couples are occasionally allowed to exchange their members. this process of couple recombination can compensate the instantaneous disconnection of the interaction pattern and thus allow for the propagation of the infection. we study the incidence of the disease as a function of its infectivity and of the recombination rate of couples, thus characterizing the interplay between the epidemic dynamics and the evolution of the population’s interaction pattern. i. introduction models of disease propagation are widely used to provide a stylized picture of the basic mechanisms at work during epidemic outbreaks and infection spreading [1]. within interdisciplinary physics, they have the additional interest of being closely related to the mathematical representation of such diverse phenomena as fire propagation, signal transmission in neuronal axons, and oscillatory chemical reactions [2]. 
because this kind of model describes the joint dynamics of large populations of interacting active elements or agents, its most interesting outcome is the emergence of self-organization. the appearance of endemic states, with a stable finite portion of the population actively transmitting an infection, is a typical form of self-organization in epidemiological models [3]. occurrence of self-organized collective behavior has, however, the sine qua non condition that information about the individual state of agents must be exchanged between each other. in turn, this re∗e-mail: zanette@cab.cnea.gov.ar 1 consejo nacional de investigaciones cient́ıficas y técnicas, centro atómico bariloche e instituto balseiro, 8400 bariloche, ŕıo negro, argentina. quires the interaction pattern between agents not to be disconnected. fulfilment of such requirement is usually assumed to be granted. however, it is not difficult to think of simple scenarios where it is not guaranteed. in the specific context of epidemics, for instance, a sexually transmitted infection never propagates in a population where sexual partnership is confined within stable couples or small groups [4]. in this paper, we consider an sir (susceptible– infectious–recovered) epidemiological model [3] in a monogamous population where, at any instant, each agent has exactly one partner or neighbor [4, 5]. the population is thus divided into couples, and is therefore highly disconnected. however, couples can occasionally break up and their members can then be exchanged with those of other broken couples. as was recently demonstrated for sis models [6, 7], this process of couple recombination can compensate to a certain extent the instantaneous lack of connectivity of the population’s interaction pattern, and possibly allow for the propagation of the otherwise confined disease. our main aim here is to characterize this interplay between recombination and propagation for sir epidemics. in the next section, we review the sir model and its mean field dynamics. analytical results are then 030001-1 papers in physics, vol. 3, art. 030001 (2011) / d. h. zanette provided for recombining monogamous populations in the limits of zero and infinitely large recombination rate, while the case of intermediate rates is studied numerically. attention is focused on the disease incidence –namely, the portion of the population that has been infectious sometime during the epidemic process– and its dependence on the disease infectivity and the recombination rates, as well as on the initial number of infectious agents. our results are inscribed in the broader context of epidemics propagation on populations with evolving interaction patterns [4, 5, 8–11]. ii. sir dynamics and mean field description in the sir model, a disease propagates over a population each of whose members can be, at any given time, in one of three epidemiological states: susceptible (s), infectious (i), or recovered (r). susceptible agents become infectious by contagion from infectious neighbors, with probability λ per neighbor per time unit. infectious agents, in turn, become recovered spontaneously, with probability γ per time unit. the disease process s → i → r ends there, since recovered agents cannot be infected again [3]. with a given initial fraction of s and i–agents, the disease first propagates by contagion but later declines due to recovery. the population ends in an absorbing state where the infection has disappeared, and each agent is either recovered or still susceptible. 
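a minimal stochastic sketch of these rules for a well-mixed population, in which each susceptible agent feels the instantaneous infectious fraction through the combination kλ (a chain-binomial discretization, not the scheme used in the original work); the recovery rate is set to γ = 1, as in the normalization adopted below, and the population size and time step are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def sir_incidence(n=100_000, k_lambda=2.0, gamma=1.0, y0=0.01, dt=0.01):
    """chain-binomial (well-mixed) discretization of the SIR rules described above;
    returns the incidence z*, i.e. the final fraction of recovered agents."""
    s = int(round(n * (1.0 - y0)))
    i = n - s
    r = 0
    while i > 0:
        p_inf = 1.0 - np.exp(-k_lambda * (i / n) * dt)   # per-susceptible infection prob.
        p_rec = 1.0 - np.exp(-gamma * dt)                # per-infectious recovery prob.
        new_inf = rng.binomial(s, p_inf)
        new_rec = rng.binomial(i, p_rec)
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
    return r / n

if __name__ == "__main__":
    for kl in (0.8, 1.5, 3.0):
        print(f"k*lambda = {kl}: incidence z* ~ {sir_incidence(k_lambda=kl):.3f}")
```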
in this respect, sir epidemics differs from the sis and sirs models, where –due to the cyclic nature of the disease,– the infection can asymptotically reach an endemic state, with a constant fraction of infectious agents permanently present in the population. another distinctive equilibrium property of sir epidemics is that the final state depends on the initial condition. in other words, the sir model possesses infinitely many equilibria parameterized by the initial states. in a mean field description, it is assumed that each agent is exposed to the average epidemiological state of the whole population. calling x and y the respective fractions of s and i–agents, the mean field evolution of the disease is governed by the equations ẋ = −kλxy, ẏ = kλxy − y, (1) where k is the average number of neighbors per agent. since the population is assumed to remain constant in size, the fraction of r–agents is z = 1 − x − y. in the second equation of eqs. (1), we have assigned the recovery frequency the value γ = 1, thus fixing the time unit equal to γ−1, the average duration of the infectious state. the contagion frequency λ is accordingly normalized: λ/γ → λ. this choice for γ will be maintained throughout the remaining of the paper. the solution to eqs. (1) implies that, from an initial condition without r–agents, the final fraction of s–agents, x∗, is related to the initial fraction of i–agents, y0, as [1] x∗ = 1 − (kλ)−1 log[(1 − y0)/x∗]. (2) note that the final fraction of r–agents, z∗ = 1 − x∗, gives the total fraction of agents who have been infectious sometime during the epidemic process. thus, z∗ directly measures the total incidence of the disease. the incidence z∗ as a function of the infectivity kλ, obtained from eq. (2) through the standard newton–raphson method for several values y0 of the initial fraction of i–agents, is shown in the upper panel of fig. 1. as expected, the disease incidence grows both with the infectivity and with y0. note that, on the one hand, this growth is smooth for finite positive y0. on the other hand, for y0 → 0 (but y0 6= 0) there is a transcritical bifurcation at kλ = 1. for lower infectivities, the disease is not able to propagate and, consequently, its incidence is identically equal to zero. for larger infectivities, even when the initial fraction of i– agents is vanishingly small, the disease propagates and the incidence turns out to be positive. finally, for y0 = 0 no agents are initially infectious, no infection spreads, and the incidence thus vanishes all over parameter space. iii. monogamous populations with couple recombination suppose now that, at any given time, each agent in the population has exactly just one neighbor or, 030001-2 papers in physics, vol. 3, art. 030001 (2011) / d. h. zanette figure 1: sir epidemics incidence (measured by the final fraction of recovered agents z∗) as a function of the infectivity (measured by the product of the mean number of neighbors times the infection probability per time unit per infected neighbor, kλ), for different initial fractions of infectious agents, y0. upper panel: for the mean field equations (1). lower panel: for a static (nonrecombining) monogamous population, described by eqs. (3) with r = 0. in other words, that the whole population is always divided into couples. in reference to sexually transmitted diseases, this pattern of contacts between agents defines a monogamous population [5]. 
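the final-size relation (2) above is transcendental in x∗; a short numerical sketch follows, using a bracketing (brent) root finder rather than the newton–raphson iteration mentioned in the text, simply because the relevant root is known to lie in (0, 1 − y0).

```python
import numpy as np
from scipy.optimize import brentq

def incidence(k_lambda, y0):
    """disease incidence z* = 1 - x*, with x* the root of eq. (2):
       x* = 1 - (k*lambda)^(-1) * log[(1 - y0)/x*]."""
    f = lambda x: x - 1.0 + np.log((1.0 - y0) / x) / k_lambda
    x_star = brentq(f, 1e-12, 1.0 - y0)   # the relevant root lies in (0, 1 - y0)
    return 1.0 - x_star

if __name__ == "__main__":
    for y0 in (1e-6, 0.2, 0.6):
        row = [f"{incidence(kl, y0):.3f}" for kl in (0.5, 1.0, 2.0, 4.0)]
        print(f"y0 = {y0:g}: z* at k*lambda = 0.5, 1, 2, 4 ->", ", ".join(row))
```

for y0 → 0 the output reproduces the transcritical behavior at kλ = 1: essentially zero incidence below it, positive incidence above it.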
if each couple is everlasting, so that neighbors do not change with time, the disease incidence should be heavily limited by the impossibility of propagating too far from the initially infectious agents. at most, some of the initially susceptible agents with infectious neighbors will become themselves infectious, but spontaneous recovery will soon prevail and the disease will disappear. if, on the other hand, the population remains monogamous but neighbors are occasionally allowed to change, any i–agent may transmit the disease several times before recovering. if such changes are frequent enough, the disease could perhaps reach an incidence similar to that predicted by the mean field description, eq. (1) (for k = 1, i.e. with an average of one neighbor per agent). we model neighbor changes by a process of couple recombination where, at each event, two couples (i, j) and (m, n) are chosen at random and their partners are exchanged [6, 7]. the two possible outcomes of recombination, either (i, m) and (j, n) or (i, n) and (j, m), occur with equal probability. to quantify recombination, we define r as the probability per unit time that any given couple becomes involved in such an event. a suitable description of sir epidemics in monogamous populations with recombination is achieved in terms of the fractions of couples of different kinds, mss, msi, mii, mir, mrr, and msr = 1 − msi − mii − mir − mrr. evolution equations for these fractions are obtained by considering the possible transitions between kinds of couples due to recombination and epidemic events [7]. for instance, partner exchange between two couples (s,s) and (i,r) which gives rise to (s,i) and (s,r), contributes positive terms to the time derivative of msi and msr, and negative terms to those of mss and mir, all of them proportional to the product mssmir. meanwhile, for example, contagion can transform an (s,i)–couple into an (i,i)– couple, with negative and positive contributions to the variations of the respective fractions, both proportional to msi. the equations resulting from these arguments read ṁss = rasir, ṁsi = rbsir − (1 + λ)msi, ṁii = rairs + λmsi − 2mii, ṁir = rbirs + 2mii − mir, ṁrr = rarsi + mir, ṁsr = rbrsi + msi. (3) for brevity, we have here denoted the contribution of recombination by means of the symbols aijh ≡ (mij +mih)2/4−mii(mjj +mjh+mhh), (4) 030001-3 papers in physics, vol. 3, art. 030001 (2011) / d. h. zanette and bijh ≡ (2mii + mih)(2mjj + mjh)/2 −mij (mij + mih + mjh + mhh)/2, (5) with i, j, h ∈ {s, i, r}. the remaining terms stand for the epidemic events. in terms of the couple fractions, the fractions of s, i and r–agents are expressed as x = mss + (msi + msr)/2, y = mii + (msi + mir)/2, z = mrr + (msr + mir)/2. (6) assuming that the agents with different epidemiological states are initially distributed at random over the pattern of couples, the initial fraction of each kind of couple is mss(0) = x20, msi(0) = 2x0y0, mii(0) = y20, mir(0) = 2y0z0, mrr(0) = z 2 0 , and msr(0) = 2x0z0, where x0, y0 and z0 are the initial fractions of each kind of agent. it is important to realize that the mean field–like eqs. (3) to (6) are exact for infinitely large populations. in fact, first, pairs of couples are selected at random for recombination. second, any epidemic event that changes the state of an agent modifies the kind of the corresponding couple, but does not affect any other couple. therefore, no correlations are created by either process. 
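for readers who prefer an agent-based picture of the recombination move described above, it is easy to state in code: two couples are drawn at random and their partners are exchanged, the two possible pairings being equally likely. the data layout and names below are mine, not the paper's; this is a sketch of the elementary move only.

```python
import random

def recombine_once(couples, rng=random):
    """exchange partners between two randomly chosen couples [6, 7].

    `couples` is a list of (agent_i, agent_j) tuples; the two possible
    outcomes, (i, m)&(j, n) or (i, n)&(j, m), are taken with equal probability.
    """
    a, b = rng.sample(range(len(couples)), 2)   # two distinct couples
    (i, j), (m, n) = couples[a], couples[b]
    if rng.random() < 0.5:
        couples[a], couples[b] = (i, m), (j, n)
    else:
        couples[a], couples[b] = (i, n), (j, m)

# toy usage: four agents in two couples
couples = [("i", "j"), ("m", "n")]
recombine_once(couples)
print(couples)
```

in a full simulation, each couple would attempt such a move with probability r per unit time, as defined above.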
in the limit without recombination, r = 0, the pattern of couples is static. equations (3) become linear and can be analytically solved. for asymptotically long times, the solution provides –from the third of eqs. (6)– the disease incidence as a function of the initial condition. if no r–agents are present in the initial state, the incidence is z∗ = (1 + λ)−1[1 + λ(2 − y0)]y0. (7) this is plotted in the lower panel of fig. 1 as a function of the infectivity kλ ≡ λ, for various values of the initial fraction of i–agents, y0. when recombination is suppressed, as expected, the incidence is limited even for large infectivities, since disease propagation can only occur to susceptible agents initially connected to infectious neighbors. comparison with the upper panel makes apparent substantial quantitative differences with the mean field description, especially for small initial fractions of i–agents. another situation that can be treated analytically is the limit of infinitely frequent recombination, r → ∞. in this limit, over a sufficiently short time interval, the epidemiological state of all agents is virtually “frozen” while the pattern of couples tests all possible combinations of agent pairs. consequently, at each moment, the fraction of couples of each kind is completely determined by the instantaneous fraction of each kind of agent, namely, mss = x2, msi = 2xy, mii = y2, mir = 2yz, mrr = z2, msr = 2xz. (8) these relations are, of course, the same as quoted above for uncorrelated initial conditions. replacing eqs. (8) into (3) we verify, first, that the operators aijh and bijh vanish identically. the remaining of the equations, corresponding to the contribution of epidemic events, become equivalent to the mean field equations (1). therefore, if the distributions of couples and epidemiological states are initially uncorrelated, the evolution of the fraction of couples of each kind is exactly determined by the mean field description for the fraction each kind of agent, through the relations given in eqs. (8). for intermediate values of the recombination rate, 0 < r < ∞, we expect to obtain incidence levels that interpolate between the results presented in the two panels of fig. 1. however, these cannot be obtained analytically. we thus resort to the numerical solution of eqs. (3). iv. numerical results for recombining couples we solve eqs. (3) by means of a standard fourthorder runge-kutta algorithm. the initial conditions are as in the preceding section, representing no r–agents and a fraction y0 of i–agents. the disease incidence z∗ is estimated from the third equation of eqs. (6), using the long-time numerical solutions for mrr, msr, and mir. in the range of parameters considered here, numerical integration up to time t = 1000 was enough to get a satisfactory approach to asymptotic values. figure 2 shows the incidence as a function of infectivity for three values of the initial fraction of i–agents, y0 → 0, y0 = 0.2 and 0.6, and several values of the recombination rate r. numerically, 030001-4 papers in physics, vol. 3, art. 030001 (2011) / d. h. zanette figure 2: sir epidemics incidence as a function of the infectivity for three initial fractions of infectious agents, y0, and several recombination rates, r. mean field (m. f.) results are also shown. the insert in the upper panel displays the boundary between the phases of no incidence and positive incidence for y0 → 0, in the parameter plane of infectivity vs. recombination rate. 
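for reference, the closed-form incidence of the static limit, eq. (7), against which the numerical curves of fig. 2 can be compared, is evaluated below; it saturates at y0(2 − y0) for large infectivity, since only susceptible agents initially paired with an infectious partner can ever be reached. a minimal sketch with arbitrary parameter values:

```python
def incidence_static(lam, y0):
    """eq. (7): final incidence z* for a non-recombining (r = 0) monogamous
    population with no r-agents in the initial condition."""
    return (1.0 + lam * (2.0 - y0)) * y0 / (1.0 + lam)

# the incidence saturates at y0*(2 - y0) as lam grows: only the partners of the
# initially infectious agents can be infected when couples never break up.
for y0 in (0.2, 0.6):
    print(y0, [round(incidence_static(lam, y0), 3) for lam in (0.5, 1, 2, 5, 20)])
```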
the limit y0 → 0 has been represented by taking y0 = 10−9. within the plot resolution, smaller values of y0 give identical results. mean field (m. f.) results are also shown. as expected from the analytical results presented in the preceding section, positive values of r give rise to incidences between those obtained for a static couple pattern (r = 0) and for the mean field description. note that substantial departure from the limit of static couples is only got for relatively large recombination rates, r > 1, when at least one recombination per couple occurs in the typical time of recovery from the infection. among these results, the most interesting situation is that of a vanishingly small initial fraction of i–agents, y0 → 0. figure 3 shows, in this case, the epidemics incidence as a function of the recombination rate for several fixed infectivities. we recall that, for y0 → 0, the mean field description predicts a transcritical bifurcation between zero and positive incidence at a critical infectivity λ = 1, while in the absence of recombination the incidence is identically zero for all infectivities. our numerical calculations show that, for sufficiently large values of r, the transition is still present, but the critical point depends on the recombination rate. as r grows to infinity, the critical infectivity decreases approaching unity. figure 3: sir epidemics incidence as a function of the recombination rate r for a vanishingly small fraction of infectious agents, y0 → 0, and several infectivities λ. straightforward linearization analysis of eqs. (3) shows that the state of zero incidence becomes unstable above the critical infectivity λc = r + 1 r − 1 . (9) 030001-5 papers in physics, vol. 3, art. 030001 (2011) / d. h. zanette this value is in excellent agreement with the numerical determination of the transition point. note also that eq. (9) predicts a divergent critical infectivity for a recombination rate r = 1. this implies that, for 0 ≤ r ≤ 1, the transition is absent and the disease has no incidence irrespectively of the infectivity level. for y0 → 0, thus, the recombination rate must overcome the critical value rc = 1 to find positive incidence for sufficiently large infectivity. the critical line between zero and positive incidence in the parameter plane of infectivity vs. recombination rate, given by eq. (9), is plotted in the insert of the upper panel of fig. 2. v. conclusions we have studied the dynamics of sir epidemics in a population where, at any time, each agent forms a couple with exactly one neighbor, but neighbors are randomly exchanged at a fixed rate. as it had already been shown for the sis epidemiological model [6,7], this recombination of couples can, to some degree, compensate the high disconnection of the instantaneous interaction pattern, and thus allow for the propagation of the disease over a finite portion of the population. the interest of a separate study of sir epidemics is based on its peculiar dynamical features: in contrast with sis epidemics, it admits infinitely many absorbing equilibrium states. as a consequence, the disease incidence depends not only on the infectivity and the recombination rate, but also on the initial fraction of infectious agents in the population. due to the random nature of recombination, mean field–like arguments provide exact equations for the evolution of couples formed by agents in every possible epidemiological state. 
these equations can be analytically studied in the limits of zero and infinitely large recombination rates. the latter case, in particular, coincides with the standard mean field description of sir epidemics. numerical solutions for intermediate recombination rates smoothly interpolate between the two limits, except when the initial fraction of infectious agents is vanishingly small. for this special situation, if the recombination rate is below one recombination event per couple per time unit (which equals the mean recovery time), the disease does not propagate and its incidence is thus equal to zero. above that critical value, a transition appears as the disease infectivity changes: for small infectivities the incidence is still zero, while it becomes positive for large infectivities. the critical transition point shifts to lower infectivities as the recombination rate grows. it is worth mentioning that a similar transition between a state with no disease and an endemic state with a permanent infection level occurs in sis epidemics with a vanishingly small fraction of infectious agents [6, 7]. for this latter model, however, the transition is present for any positive recombination rate. for sir epidemics, on the other hand, the recombination rate must overcome a critical value for the disease to spread, even at very large infectivities. while both the (monogamous) structure and the (recombination) dynamics of the interaction pattern considered here are too artificial to play a role in the description of real systems, they correspond to significant limits of more realistic situations. first, the monogamous population represents the highest possible lack of connectivity in the interaction pattern (if isolated agents are excluded). second, random couple recombination preserves the instantaneous structure of interactions and does not introduce correlations between the individual epidemiological state of agents. as was already demonstrated for sis epidemics and chaotic synchronization [7], they have the additional advantage of being analytically tractable to a large extent. therefore, this kind of assumption promises to become a useful tool in the study of dynamical processes on evolving networks. acknowledgements financial support from sectyp–uncuyo and anpcyt, argentina, is gratefully acknowledged. [1] r m anderson, r m may, infectious diseases in humans, oxford university press, oxford (1991). [2] a s mikhailov, foundations of synergetics i. distributed active systems, springer, berlin (1990). 030001-6 papers in physics, vol. 3, art. 030001 (2011) / d. h. zanette [3] j d murray, mathematical biology, springer, berlin (2003). [4] k t d eames, m j keeling, modeling dynamic and network heterogeneities in the spread of sexually transmitted diseases, proc. nat. acad. sci. 99, 13330 (2002). [5] k t d eames, m j keeling, monogamous networks and the spread of sexually transmitted diseases, math. biosc. 189, 115 (2004). [6] s bouzat, d h zanette, sexually transmitted infections and the marriage problem, eur. phys. j b 70, 557 (2009). [7] f vazquez, d h zanette, epidemics and chaotic synchronization in recombining monogamous populations, physica d 239, 1922 (2010). [8] t gross, c j dommar d’lima, b blasius, epidemic dynamics in an adaptive network, phys. rev. lett. 96, 208 (2006). [9] t gross, b blasius, adaptive coevolutionary networks: a review, j. r. soc. interface 5, 259 (2008). [10] d h zanette, s risau–gusman, infection spreading in a population with evolving contacts, j. biol. phys. 
34, 135 (2008). [11] s risau–gusman, d h zanette, contact switching as a control strategy for epidemic outbreaks, j. theor. biol. 257, 52 (2009). 030001-7 papers in physics, vol. 7, art. 070004 (2015) received: 17 november 2014, accepted: 2 march 2015 edited by: l. a. pugnaloni reviewed by: g. lumay, university of liège, belgium. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070004 www.papersinphysics.org issn 1852-4249 characterizing flowability of granular materials by onset of jamming in orifice flows paul mort1∗ this paper describes methods to measure flow rate and jamming onset of granules discharged through a flat-bottom cylindrical hopper with a circular orifice. the intrinsic jamming onset for ideal particles (spherical, monodisperse, smooth) is experimentally measured by two independent methods, with good agreement. for non-ideal particles, the normalized jamming onset increases with elongated granule shape, broadened size distribution and increased friction as measured by the drained angle of repose. an empirical model of the jamming onset is introduced to quantify these effects over the range of materials investigated. the jamming onset can be used as a measure of differentiation between relatively free-flowing granules. i. background the motivation for this work is to find a means of characterizing and differentiating flow quality of relatively free-flowing granular materials. on the one hand, industry requires methods that can be performed relatively simply and reproducibly. on the other hand, both industry and academia seek better fundamental understanding of granular rheology via physical mechanisms governing flows; in the case of industry, fundamental understanding should extend to commercially-relevant granular materials. this paper correlates the onset of jamming with granular characteristics, including particle size distribution and particle shape. details are provided on size and shape characterization of commercially-relevant materials. characterization of cohesive powder flow has been relatively well established using shear cells ∗email: mort.pr@pg.com 1 procter & gamble co., 5280 vine street, cincinnati, oh 45217, usa. to quantify incipient bulk flow (i.e., yield loci) of powders and granules. flow functions calculated using yield loci can differentiate between stronglycohesive, mildly-cohesive and free-flowing materials [1]; this is highly relevant to bulk storage and handling of powders. however, shear cell measurements are relatively insensitive in regards to differences among freely-flowing granules. for applications where the quality of granular dynamic flow is of interest, we need a more sensitive methodology, hence the motivation of the current work. various approaches are discussed in the literature, including using rotating drums [2, 3] and impeller-driven flows [4, 5] to measure granular rheology. the current work focuses on the onset of jamming in orifice flows to differentiate the quality of otherwise freelyflowing granules. let us consider jamming in the context of a shear cell analysis. according to continuum modeling, a “cohesionless” material measured using a shear cell (i.e., a yield locus passing through the origin) implies flow through a hopper cone opening of infinitesimal diameter. at this limit, the continuum model fails because the orifice must be at least as 070004-1 papers in physics, vol. 7, art. 070004 (2015) / p. mort figure 1: schematic of a flat-bottom cylindrical hopper device (flodextm). 
large as the particle diameter. further, local packing and jamming effects require the opening to be significantly larger than a single particle [6]. a theoretical analysis of intrinsic cohesion suggests a boundary layer effect of 3-5 particle diameters [7]; as a starting point, this can be taken as a theoretical minimum dimensionless orifice size for an otherwise free-flowing granular material. experimental studies on the physics of jamming in granular flows report jamming transitions on the basis of a dimensionless hopper opening at the onset of the jamming transition [8, 9], where the dimensionless hopper opening (do/d) is defined by the ratio of the opening size (do) relative to a monodisperse particle size (d). the current work is an empirical investigation, seeking to elucidate the effects of both size distribution and shape factors on the jamming probability of free-flowing granules. ii. experimental the experimental device used in this work was a flat-bottom cylindrical hopper with a set of interchangeable disks having a range of discrete orifice sizes; it is available as a commercial flow testing system, flodextm(hanson research, chatsworth, ca, usa), with additional disks machined as refigure 2: interpolation of jamming onset (j) at 25% mass discharged. quired to extend the range of orifice diameters. the range of orifice sizes used with the flodex included: 2.0, 2.5, 3.0, 3.5, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18 and 20 mm. experiments were done with common materials including narrowly-classified glass beads, ottawa sand, and a variety of granular detergent samples. in all cases, these samples were relatively free-flowing according to shear cell measurements. a schematic of the orifice flow instrument is shown in fig. 1. approximately 100 ml of material is used in the test, filling the 5.7 cm diameter stainless-steel hopper to a height of about 4 cm by pouring the granular sample through the loading funnel. after the sample settles, the spring-loaded discharge gate is opened and the sample is allowed to drain through the orifice into a receiving cup below. once the flow stops and remains stopped for 30 seconds, the mass of discharged material is measured. clogging is defined as a persistent jammed state where the orifice remains obscured by the granules at the point of flow stoppage. for each measurement, the mass% discharged is calculated according to the formula: (mass% discharged) = 100 × (mass discharged) / (sample mass). the average of the three mass% discharge calculations is plotted as a function of the dimensionless orifice size (do/d), with the mass% discharged on the ordinate and the dimensionless orifice size on the abscissa. this procedure is repeated using incrementally larger orifice sizes until the hopper discharges without clogging for three consecutive trials. the averaged data are linearly interpolated to find the jamming onset (j), defined here as the value of the 070004-2 papers in physics, vol. 7, art. 070004 (2015) / p. mort dimensionless orifice size at the point of 25 mass% average discharge (fig. 2). for trials that drain without clogging, the drained angle of repose, φd, is calculated according to eq. (1), derived assuming the granular material discharges in the form of an inverted cone with a base diameter equal to the hopper diameter, dh, and truncated where the cones apex protrudes through the orifice with diameter, do. assuming the remaining volume of retained material has cylindrical symmetry (fig. 
3), simple solid geometry and the material's repour (loose) bulk density, ρbulk, are sufficient to relate the measured mass of retained material, mret, to the drained angle of repose:

φd = arctan[ 24 mret / ( π ρbulk ( 2 dh³ − 3 do dh² + do³ ) ) ]   (1)

orifice size is made dimensionless by scaling the orifice size (do) to a characteristic particle size (d). the sauter mean (d32) is used as the characteristic size for this study. note that the sauter mean is weighted toward the surface area of the granules, the surface area being relevant to inter-particle contact and frictional interactions of the flow. studies of cohesive dry powders show a correlation between the sauter mean size and flow properties [10]. size distributions were measured by sieving, fitting each size distribution with a log-normal model to obtain a mass-based geometric mean (d43) and geometric standard deviation (σg) for each sample, and then converting to the sauter mean using the hatch-choate relation [11], eq. (2). examples of size distribution characteristics are shown in fig. 4 for select samples a–d, discussed in detail later in the results section.

ln d32 = ln d43 − ln² σg   (2)

shape characteristics were measured using an automated image analysis system, solids sizer (jm canty, buffalo, ny), using statistical averaging of 10⁴ cross-sectional particle images per sample. an example of particles counted in a freeze-frame captured by the image analysis system is given in fig. 5. note that the particles are in free-fall at the point of image capture, so the cross-sectional image is randomly distributed over the possible states of rotation. the median aspect ratio (ar50) was used as the characteristic shape factor in this study. the granular discharge rate was measured by placing the receiver cup on an electronic balance with data acquisition to record the discharge mass with time. this experiment was done using orifice sizes above the jamming onset. discharge data are provided in fig. 6, using sample d as an example; sample d is discussed in more detail in the results section. the flow rate was determined by taking the time derivative (slope) of the cumulative discharge, and the mass flow rate was converted to a volumetric rate using the repour bulk density.

figure 3: vertical cross section of residual mass in a cylindrical hopper after draining successfully without a clog.

figure 4: examples of particle size distribution analysis for four samples, a–d. the data points are from sieve analysis; the fitted curves are based on a log-normal distribution, weighted to the mass on each sieve.

figure 5: image analysis freeze frame and cross-sectional aspect ratio (ar) analysis: a) grayscale image; b) b&w image resulting from threshold analysis with particles outlined in red, ar = d1/d2.

figure 6: discharge rate data for sample d, measured over a series of orifice diameters (4 mm to 9 mm). for each condition, the flow rate is the time derivative of mass discharged. the flow rate data are plotted according to the beverloo relation in fig. 9.

iii. results and discussion

i. analysis of jamming onset using a multivariate model

results for nearly "ideal" materials, including tightly classified glass beads and washed ottawa sand, are shown in fig. 7. these samples are nearly mono-disperse. the glass beads are nearly spherical, while the sand is slightly irregular in shape. both have relatively smooth surfaces and a low drained friction angle.
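the quantities entering the comparison that follows (the jamming onset j, the drained angle of repose φd and the sauter mean d32) reduce to a few lines of arithmetic. the sketch below codes the 25 mass% interpolation of fig. 2 together with eqs. (1) and (2); all the numerical values are invented and serve only to illustrate the definitions.

```python
import numpy as np

def jamming_onset(do_over_d, mass_pct, threshold=25.0):
    """interpolated jamming onset j: the dimensionless orifice size at which
    the averaged mass% discharged reaches `threshold` (25% in fig. 2).
    `mass_pct` is assumed to increase along with do/d."""
    return float(np.interp(threshold, mass_pct, do_over_d))

def drained_angle_of_repose(m_ret, rho_bulk, d_h, d_o):
    """eq. (1): drained angle of repose (degrees) from the retained mass m_ret,
    repour bulk density rho_bulk, hopper diameter d_h and orifice diameter d_o
    (consistent units assumed, e.g. g, g/cm^3, cm)."""
    denom = np.pi * rho_bulk * (2.0 * d_h**3 - 3.0 * d_o * d_h**2 + d_o**3)
    return np.degrees(np.arctan(24.0 * m_ret / denom))

def sauter_mean(d43, sigma_g):
    """eq. (2), hatch-choate form as used here: ln d32 = ln d43 - ln^2(sigma_g)."""
    return d43 * np.exp(-np.log(sigma_g) ** 2)

# invented numbers, for illustration only
print(jamming_onset([5, 6, 7, 8, 9], [0, 4, 18, 62, 100]))        # ~7.2
print(drained_angle_of_repose(m_ret=30.0, rho_bulk=0.8, d_h=5.7, d_o=0.6))
print(sauter_mean(d43=0.5, sigma_g=1.4))
```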
the jamming onset for the glass beads is in the range of 3 to 5 particle diameters. the washed sand has a slightly higher jamming onset of about 7 diameters. other samfigure 7: cylindrical hopper discharge data, jamming onset calculations and drained angle of repose for glass bead and washed sand samples. ples characterized in this work included commercial granules having similar compositions, but with a range of particle size and shape characteristics, the effect of which resulted in relative jamming onsets up to 30 and drained friction angles in excess of 60 degrees. because the commercial samples have several aspects of variability, a multivariate model was formulated to assess the relative effects of friction, size distribution and shape characteristics. data from the broader set of 23 samples (table 1), including commercial granular materials and manipulations thereof, were used to generate a multivariate power-law model for the jamming onset (j) as a function of particle parameters [eq. (3)]. the model parameters include the shape factor (median aspect ratio, ar50), size distribution breadth (geometric standard deviation, σg) and an excess friction factor defined as xf = max[1, tan(φd)]. the range in breadth of size distribution was from about 1.0 to 2.2; the range of median aspect ratio was from about 1.0 to 1.4. the fitted coefficients (a, b, c) represent exponents in the power-law form of the model. while this equation is purely empirical, its form is logical in the sense that it reduces to an intrinsic jamming intercept (kj) when the other terms are minimized. in other words, extrapolation to kj represents the jamming onset of an idealized sample (mono-disperse, spherical, smooth, low friction). ln(j) = ln(kj) + a ln(ar50) + b ln(σg) + c ln(xf) (3) 070004-4 papers in physics, vol. 7, art. 070004 (2015) / p. mort figure 8: multivariate regression of jamming onset model [eq. (3)]; deviation from the diagonal represents uncertainty in the empirical model. solid data points labeled a-d indicate samples used in flow rate analysis (fig. 9); their size distributions are shown in fig. 4. figure 9: regression [eq. (5)] of flow rate data collected for selected samples (a-d) having a range of jamming onsets. the results of the multivariate model are shown graphically in fig. 8. while the correlation coefficient (∼0.86) indicates that the model is reasonably predictive, the scatter suggests uncertainty, perhaps in the choice of model parameters, their interaction and/or uncertainty in measurements. for example, the model may be over-simplifying the parameter space, ignoring potentially important factors such as more complex shape factors, the breadth of the shape distribution, and interactions between size and shape distributions. the statistical analysis is shown in table 2. the most statistically significant factors are the intrinsic jamming intercept, kj, excess friction factor, xf , and the geometric size distribution, σg. the value of kj is ∼2.9, at the lower end of the range predicted by wier [7]. while the particle shape factor, ar50, is somewhat less significant statistically, its high coefficient (2.38) indicates that particle shape can have a strong impact on jamming, even over the relatively narrow range tested. ii. flow rate measurements and beverloo analysis flow rate data were analyzed using the beverloo equation [12]. it is shown here in volumetric form [eq. 
(4)], where v ′ is the volumetric feed rate, do is the orifice diameter, d is the characteristic particle size (sauter mean), g is the gravitational constant and c and kf are fit parameters; do − kfd represents a reduced orifice size for active flow caused by a boundary layer that scales with doparticle size. for the purpose of regression analysis as a function of relative orifice size (do/d), the beverloo equation can be written in linearized form, eq. (5), where c is solved using the regression slope and kf using the intercept. v ′ = c √ g (do −kfd) 5/2 (4) v ′2/5 d = c2/5g1/5 ( do d −kf ) (5) several flow rate data sets are shown in fig. 9. these data sets (samples a-d) represent similar materials, but with different shape and size distribution characteristics. sample a is a commercial granule; samples b and c are manipulated by classification; and sample d is classified then further 070004-5 papers in physics, vol. 7, art. 070004 (2015) / p. mort table 1: experimental data for relative jamming onset (j) as a function of excess friction factor (xf), geometric size distribution (σg), and median aspect ratio (ar50). sample j xf σg ar50 a 18.6 1.49 1.84 1.36 b 16.2 1.01 1.76 1.35 c 11.8 1.05 1.63 1.35 d 7.9 1.00 1.37 1.26 e 6.8 1.00 1.33 1.30 f 7.1 1.00 1.34 1.24 g 8.9 1.10 1.64 1.25 h 9.6 1.08 1.48 1.32 i 28.9 1.90 1.79 1.41 j 15.1 1.00 2.14 1.26 k 18.3 1.53 1.50 1.32 l 28.2 1.24 1.79 1.34 m 15.6 1.00 1.56 1.41 n 19.1 1.83 1.56 1.41 o 6.5 1.08 1.47 1.28 p 7.3 1.00 1.39 1.27 q 9.1 1.05 1.41 1.26 r 7.6 1.09 1.32 1.27 s 17.6 1.73 1.31 1.26 t 6.8 1.00 1.23 1.21 u 7.1 1.00 1.10 1.20 v 4.3 1.00 1.05 1.10 w 3.5 1.00 1.05 1.15 table 2: parameter estimates and standard errors obtained from multi-variate regression of eq. (3) using the data of table 1. “prob> |t|” is the probability that the true parameter value is zero, against the two-sided alternative that it is not; values less than about 0.05 are typically regarded as highly significant. term estimate std error p> |t| ln(kj) 1.072 0.225 0.0001 a 2.380 1.267 0.0758 b 1.404 0.398 0.0023 c 1.095 0.265 0.0006 rounded in a layering process. the jamming onsets (j) of these samples span a significant range from about 8 to 20. however, the beverloo regressions are remarkably similar. all lie on a common slope (c ∼ 0.57) and extrapolate to a similar zero-flow intercept (kf ∼ 2.64). in other words, the granular flow behavior is consistent for orifice sizes above the jamming onset, but subtle differences in the granular characteristics and frictional properties have a significant effect on the onset of jamming. this suggests that the onset of jamming is a useful measure to differentiate the flow behavior of relatively free-flowing granular materials. consider kf to represent an “intrinsic jamming onset” at which the flow rate goes to zero. note that the extrapolated value for intrinsic jamming (i.e., no flow) is only slightly lower than the powerlaw intercept of the multivariate model (kj ∼ 2.9). recall, the measurement of the jamming onset, j, is based on the interpolated orifice size for an average 25% sample discharge from the test hopper. thus, it is logical that kj, which is based on multivariate analysis of j, may slightly exceed kf , which implies zero flow. on the other hand, the slope of the jamming probability function tends to be quite steep in the range from zero to 25% discharge. so, for the purpose of this paper, we can merge the interpretation of kj and kf —both represent an intrinsic jamming limit. 
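the beverloo regression behind kf is itself a straight-line fit: plotting v′^(2/5)/d against do/d, the slope fixes c and the x-intercept gives kf, as in eq. (5). the sketch below reproduces that arithmetic on synthetic data generated from eq. (4) with the coefficients quoted above (c ≈ 0.57, kf ≈ 2.64); the data, the unit system and the characteristic size are invented for illustration only.

```python
import numpy as np

G = 981.0   # cm/s^2; lengths in cm and volumetric rates in cm^3/s (assumption)

def fit_beverloo(do_over_d, vol_rate, d):
    """linear fit of eq. (5): v'^(2/5)/d = c^(2/5) g^(1/5) (do/d - kf).
    returns the beverloo coefficients (c, kf)."""
    x = np.asarray(do_over_d, float)
    y = np.asarray(vol_rate, float) ** 0.4 / d
    slope, intercept = np.polyfit(x, y, 1)
    c = (slope / G ** 0.2) ** 2.5
    kf = -intercept / slope
    return c, kf

# synthetic data: dimensionless orifice sizes and the volumetric rates that
# eq. (4) itself would give for c = 0.57, kf = 2.64 and a hypothetical d
do_over_d = np.array([10.0, 15.0, 20.0, 25.0])
d = 0.05                                                    # cm, hypothetical
v_rate = 0.57 * np.sqrt(G) * (do_over_d * d - 2.64 * d) ** 2.5
print(fit_beverloo(do_over_d, v_rate, d))   # recovers roughly (0.57, 2.64)
```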
indeed, even though each one was measured independently using different methods, their values are nearly identical. an interpretation of this result is that the “intrinsic jamming onset” is characteristic of orifice flow for an ideal particle (monodisperse, spherical, no excess friction), and that differences between the intrinsic onset and the actual measured jamming onset are due, in large part, to irregularities in the size distribution, shape and frictional roughness of the granules. iv. conclusions the statistical analysis of jamming is a promising approach for characterization of relatively freeflowing granular materials. in this work, we evaluated the effect of dimensionless orifice size on the onset of jamming and flow rate. a multivariate analysis of the jamming onset suggests an intrinsic jamming probability of about 2.9 particle diame070004-6 papers in physics, vol. 7, art. 070004 (2015) / p. mort ters for an idealized sample; irregularities such as size distribution, shape or excess friction support jamming across an increased number of particle diameters. while all three parameters are significant in the regression analysis, the result suggests that shape (aspect ratio) is an especially important contributor to jamming. analysis of the flow rate data using the beverloo equation extrapolates to a similar value for the intrinsic jamming onset (∼2.6 particle diameters), providing good experimental agreement between two independent methods. the experimental methods described herein are relatively simple. continued work covering a wider range of materials can help to build stronger empirical models for the effects of shape, size distribution and other frictional properties on jamming probability. in addition, first principle modeling of dense flows may help to elucidate the underlying mechanisms of jamming and the real effects of non-ideal particle characteristics, and perhaps help to build a more theoretical model. lastly, the reader should temper the results presented with the caveat that all of the samples used herein are relatively free-flowing granular materials. as materials become more cohesive and/or compressible, it is not clear that the flow rate and jamming onset experiments will continue to converge on a common intrinsic jamming onset. acknowledgements the original data for this paper were collected and analyzed as part of a student internship at procter & gamble by derek geiger, university of michigan, with the assistance of mark wandstrat at p&g, and presented at the 2007 annual meeting of the american institute of chemical engineers [13]. this work at p&g followed an earlier collaboration with professor r. p. behringer [14]; the author acknowledges professor behringer’s guidance and insight on using fundamental granular physics to address problems of industrial relevance. [1] j a h de jong, a c hoffmann, h j finkers, properly determine powder flowability to maximize plant output, chem. eng. prog. 95, 25 (1999). [2] g lumay, f boschini, k traina, s bontempi, j-c remy, r cloots, n vandewalle, measuring the flowing properties of powders and grains, powder technol. 224, 19 (2012). [3] a alexander, b chaudhuri, a faqih, f muzzio, c davies, s tomassone, avalanching flow of cohesive powders, powder technol. 164, 13 (2006). [4] m kheiripour langroudi, s turek, a ouazzi, g i tardos, an investigation of frictional and collisional powder flows using a unified constitutive equation, powder technol. 197, 91 (2010). 
[5] m leturia, m benali, s lagarde, i ronga, k saleh, characterization of flow properties of cohesive powders: a comparative study of traditional and new testing methods, powder technol. 253, 406 (2014). [6] a h birks, m s a bradley, r farnish, the conversion of the analytical simple shear model for the jenike failure locus into principle stress space and implication of the model for hopper design, in: handbook of conveying and handling of particulate solids, eds. a levy, h kalman, pag. 95, elsevier (2001). [7] g j weir, the intrinsic cohesion of granular materials, powder technol. 104, 29 (1999). [8] i zuriguel, a garcimartn, d maza, l a pugnaloni, j m pastor, jamming during the discharge of granular matter from a silo, phys. rev. e 71, 051303 (2005). [9] k to, p-y lai, h k pak, jamming of granular flow in a two-dimensional hopper, phys. rev. lett. 86, 71 (2001). [10] d geldart, e c abdullah, a verlinden, characterisation of dry powders, powder technol., 224, 19 (2012). [11] t hatch, s p choate, statistical description of the size properties of non-uniform particulate substances, j. franklin inst., 207, 369 (1929). [12] w a beverloo, h a leniger, j van de velde, the flow of granular solids through orifices, chem eng. sci., 15, 260 (1961). 070004-7 papers in physics, vol. 7, art. 070004 (2015) / p. mort [13] d geiger, m wandstrat, p mort, granular flow through an orifice – effect of granule size and shape distributions, in: proceedings of the aiche annual meeting, salt lake city, ut, usa (2007). [14] p mort, k mckenzie, j wambaugh, r behringer, granular flow through an orifice – effect of granule size and shape distributions, in: proceedings of the world congress of particle technology 5, orlando, fl, usa (2006). 070004-8 papers in physics, vol. 7, art. 070003 (2015) received: 20 november 2014, accepted: 20 march 2015 edited by: c. a. condat, g. j. sibona licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070003 www.papersinphysics.org issn 1852-4249 reaction rate in an evanescent random walkers system miguel a. ré,1, 2∗ natalia c. bustos2 diffusion mediated reaction models are particularly ubiquitous in the description of physical, chemical or biological processes. the random walk schema is a useful tool for formulating these models. recently, evanescent random walk models have received attention in order to include finite lifetime processes. for instance, activated chemical reactions, such as laser photolysis, exhibit a different asymptotic limit when compared with immortal walker models. a diffusion limited reaction model based on a one dimensional continuous time random walk on a lattice with evanescent walkers is presented here. the absorption probability density and the reaction rate are analytically calculated in the laplace domain. a finite absorption rate is considered, a model usually referred to as imperfect trapping. short and long time behaviors are analyzed. i. introduction the dynamics of the diffusion mediated reaction process has been extensively studied for many years due to its relevance in the description of diverse phenomena in physics, chemistry or biology [1–4]. a particularly interesting problem is the calculation of the probability density for the time at which a reaction a + b → c takes place (absorption probability density apd) when the displacement of species a or b (or both) is diffusive. other magnitudes, such as time dependent reaction rates or survival probabilities of the reactives, can be derived from the apd. 
dielectric relaxation [5], capture of ligands after surface diffusion [6] or proteins with active sites deep inside the protein matrix [7] ∗email: mgl.re33@gmail.com 1 departamento de ciencias básicas, ciii, facultad regional córdoba, universidad tecnológica nacional. córdoba, argentina. 2 facultad de matemática, astronomı́a y f́ısica, universidad nacional de córdoba, ciudad universitaria, 5010 córdoba, argentina. are examples of the application of the diffusion mediated reactions schema. a random walk schema provides an excellent tool to model diffusion and has been studied for a long time with different alternatives to include the reaction process. recently, a new kind of random walk models has been addressed in [8,9] to include evanescent or mortal random walkers. in these models, the diffusing particles (reactives) may disappear during their displacement. their disappearance may represent, for example, the decay of a laser activated reactive as in studies of fluorescence quenching by laser photolysis [10]. the calculation of chemical reaction rates from a model of immortal diffusing particles in the presence of a trap may be traced to the original contribution of smoluchowski [11]: a single absorbing sphere surrounded by diffusing particles with an initial uniform concentration. smoluchowski’s model assumes immediate trapping (reaction) upon encounter of the reactives, a great dilution of one of the species (the minority species represented by the sphere) and normal diffusion of the particles with diffusion coefficient d = da + db, being da 070003-1 papers in physics, vol. 7, art. 070003 (2015) / m. a. ré et al. and db the diffusion coefficients of species a and b, respectively. diverse extensions have been proposed to include diffusion in disordered media [5] or to improve the description of short time behavior by considering a finite reaction rate [12, 13]. extensions of smoluchowski’s model to random walks on lattices have been proposed to consider a finite reaction rate [14, 15] or the modulation of reaction through gating controlled by an independent dynamics [16]. in these models, the reactives may separate without reaction in each encounter. we consider here an evanescent continuous time random walk (ctrw) on a one dimensional lattice with transitions to the nearest neighbors. we assume a time independent transitions rate λ for diffusion and η for evanescence. the reaction is included in the model by considering a trap at a fixed position in the lattice. when a walker arrives at this position, it may be trapped with a rate κ or it may escape to a neighbor site with transition rate λ. the main magnitude to be calculated is the absorption probability density (apd) for a walker starting at an arbitrary position on the lattice: the probability density for the time of reaction. the apd is analytically calculated in the laplace representation and from this magnitude the reaction rate and the survival probability are calculated. ii. continuous time random walk we include here some general ctrw results. although most of these results may be found in the literature, we include them here with our particular problem in mind and also to make consistent the notation used in this paper. let us consider an infinite one dimensional lattice as shown in fig. 1. each position in the lattice is identified by an integer number x. we assume that at some instant t = 0; present on the lattice, there is a uniform distribution of noninteracting walkers with concentration c0 at every lattice position. 
each walker is able to perform a ctrw with probability ψ0 (x−x′; t− t′) dt of making a transition x′ → x between t and t + dt, having arrived at x′ at time t′. we shall assume here the waiting time probability density ψ0 (x−x′; t) = [pd δx,x′+1 + pi δx,x′−1] λe−λt. (1) let g0 (x; t | x0) denote the conditional probability density for the arrival time at position x of a walker that started its journey at x0. this probability density is the green’s function for the problem and satisfies the recursive relation g0 (x; t | x0) = δx,x0δ ( t− 0+ ) (2) + ∑ x′ ψ0 (x−x′; t) ? g0 (x′; t | x0) , where ? stands for the time convolution product f (t) ? g (t) = t∫ 0 dt′f (t− t′) g (t′) . equation (2) may be solved by taking laplace transform in the time variable and fourier transform in the lattice coordinate. by means of this procedure, we get the solution in the laplace representation gl0 (x; u | x0) = 1 2rpψ l 0 (u) r (u) [ξ (u)] |x−x0| rx−x0c (3) where the super index l indicates the laplace transform of a function fl (u) = ∫ ∞ 0 dte−ut f (t) , and we have defined the auxiliary symbols rp = √ pdpi, rc = √ pi/pd, ψl0 (u) = λ u + λ , r (u) = √( 1 2rpψ l 0 (u) )2 − 1, ξ (u) = 1 2rpψ l 0 (u) −r (u) . (4) the conditional probability p0 (x; t | x0) of finding the walker at position x at time t given that it started at x0 can be obtained from the green’s function by a convolution product p0 (x; t | x0) = φ0 (t) ? g0 (x; t | x0) , (5) with φ0 (t) = e −λt, (6) 070003-2 papers in physics, vol. 7, art. 070003 (2015) / m. a. ré et al. g g g . . .����6r evanescence η g g . . . gr6. . .g gr g�� piλ pdλ�diffusion trapping limbo κ figure 1: evanescent ctrw on a one dimensional lattice. the walker may evanesce with an evanescent rate η. there is a trap at a particular site of the lattice. when the walker reaches the trap position, it may be trapped with a trapping rate κ. the sojourn probability at any site in the lattice. the conditional probability density for the first passage time, in turn, can be expressed in terms of the green’s function in the laplace representation by siegert’s [17] formula fl0 (x; u | x0) = gl0 (x; u | x0) gl0 (x; u | x) (7) i. evanescent continuous time random walk we include now the possibility of evanescence of a walker at any site in the lattice. we assume that evanescence is a process statistically independent of displacement with a time independent rate η. in the evanescent case, the waiting time density (wtd) modifies to ψe (x−x′; t) = [pd δx,x′+1 + pi δx,x′−1] ×λe−(λ+η)t. (8) the green’s function for the evanescent ctrw (ectrw) satisfies a recursive relation similar to eq. (2), where ψ0 must be replaced by ψe. the solution is directly obtained in the laplace representation in terms of gl0 gle (x; u | x0) = g l 0 (x; u + η | x0) , (9) i.e., in the standard ctrw green’s function in eq. (3); the variable u must be substituted by u + η. if we assume an initially uniform concentration in the ectrw, c0, the concentration at time t is c (t) = c0e −ηt, (10) i.e., it remains uniform but decays exponentially. iii. local trap in an ectrw we represent the reaction process as the trapping of a walker by a trap at a particular position in the lattice, denoted here as x1. in fig. 1, we represent the trapping as a transition of the walker to a limbo state from which it cannot return to the lattice. 
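before adding the trap, it is worth noting that the laplace-domain building blocks collected so far, eqs. (3), (4) and (9), can be evaluated numerically in a few lines, which is convenient for checking limits or for feeding a numerical inversion routine. the sketch below codes them directly; the variable names are mine, and the evanescent walk is obtained simply by shifting u to u + η as in eq. (9).

```python
import math

def green_ctrw(x, u, x0, lam, pd=0.5, pi=0.5):
    """laplace transform of the green's function, eqs. (3)-(4), for a ctrw
    with transition rate lam and jump probabilities pd (right) and pi (left)."""
    rp = math.sqrt(pd * pi)
    rc = math.sqrt(pi / pd)
    psi = lam / (u + lam)                       # psi_0^L(u)
    inv = 1.0 / (2.0 * rp * psi)
    r = math.sqrt(inv**2 - 1.0)                 # r(u)
    xi = inv - r                                # xi(u)
    return xi ** abs(x - x0) * rc ** (x - x0) / (2.0 * rp * psi * r)

def green_evanescent(x, u, x0, lam, eta, pd=0.5, pi=0.5):
    """eq. (9): the evanescent walk is the plain ctrw evaluated at u + eta."""
    return green_ctrw(x, u + eta, x0, lam, pd, pi)

# example: unbiased walk, unit transition rate, small evanescence rate
print(green_ctrw(3, 0.1, 0, lam=1.0))
print(green_evanescent(3, 0.1, 0, lam=1.0, eta=0.05))
```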
when a walker arrives at x1, it may be trapped with a time independent rate κ or it may make a transition to a neighbor site with a transition rate λ continuing with its walk or it may even evanesce with rate η. we assume the three processes to be statistically independent from each other. to take into account the trapping possibility at x1, the wtd at the trap position is modified to ψ1 (x−x1; t) = [pd δx,x1+1 + pi δx,x1−1] ×λe−(λ+η+κ)t. (11) for the remaining sites in the lattice, the wtd is that of eq. (8). therefore, the green’s function for the trapping problem, gt (x; t | x0), satisfies the recursive relation gt (x; t | x0) =δx,x0δ ( t− 0+ ) (12) + ∑ x′ ψi (x−x′; t) gt (x′; t | x0) , with ψi (x−x′; t) =   ψe (x−x′; t) x′ 6= x1 ψ1 (x−x1; t) x′ = x1 (13) by means of the local inhomogeneity method as in [15, 16], we may express gt (x; t | x0) in terms of ge (x; t | x0) in the laplace representation 070003-3 papers in physics, vol. 7, art. 070003 (2015) / m. a. ré et al. glt (x; u | x0) = g l e (x; u | x0) − a b c, (14) where a =gle (x; u | x1) −δx,x1 − ∑ x′ gle (x; u | x ′) ψ1 (x ′ −x1) , b =gle (x1; u | x1) − ∑ x′ gle (x1; u | x ′) ψ1 (x ′ −x1) , c =gle (x1; u | x0) . the green’s function in eq. (14) at x = x1 is of particular interest for the trapping problem. for a walker to be trapped, it must be at x1 and it has to make a transition to the limbo state instead of making a transition to a neighbor site or to evanesce. the time of trapping probability density (apd) is given by the convolution product a (t | x0) = κe−(κ+η+λ)t ? glt (x1; t | x0) . (15) an explicit expression for the apd is obtained in the laplace representation by making use of eqs. (3), (4), (9) and (14) al (u | x0) = 1 1 + 2rp λ κ r (u + η) × [ξ (u + η)] |x1−x0| rx1−x0c . (16) the same expression for the apd is obtained if we assume that the trap (the minority species) is evanescent instead of the walkers (the majority species). iv. survival probability and reaction rate if we consider a walker that starts its journey at x0, the probability that this walker has not been trapped by time t (one particle survival probability) is s1 (t) = 1 − ∫ t 0 dt′a (t′ | x0) . (17) following bendler and shlesinger [5], we assume a finite lattice of size v with n walkers initially uniformly distributed (the probability of finding a walker at a given site is 1/v ), with initial concentration c0 = n/v . the probability that none of the walkers initially on the lattice has been trapped by time t is sn (t) = [ 1 − 1 v ∫ t 0 dt′ ∑ x0 a (t′ | x0) ]n . (18) in the thermodynamic limit (n,v → ∞, n/v → c0), the probability of no reaction (survival probability) at time t is φ (t) = exp [ −c0 ∫ t 0 dt′ ∑ x0 a (t′ | x0) ] . (19) the exponent in eq. (19) is the integral of the time dependent reaction rate, r(t) = ∂t ln (φ (t)) r(t) = c0 ∑ x0 a (t′ | x0) . (20) in fig. 2, we present the reaction rate, r(t), obtained for the model in eq. (1). the plots are presented in dimensionless units. as it can be appreciated in plot a, there is a small influence of bias in r at intermediate times (in units of mean waiting time for diffusion). in plot b, the effect of evanescence is shown. a faster decline in r is observed with increasing evanescence rate, as it should be expected. in fig. 3, we present the graph of the function f (t) = 1 − φ (t) vs. λt (dimensionless units). the function f (t) introduced in [10], and also considered in [15], may be interpreted as the fraction of the original number of minority species (the trap) that have reacted by time t. 
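a direct stochastic simulation of the same process is a useful cross-check on these expressions: at every site the walker faces competing exponential processes, jumps at rate λ, evanescence at rate η and, at x1 only, absorption at rate κ, and the value of eq. (16) at u = 0 is just the total trapping probability. the monte carlo sketch below estimates that probability and the mean reaction time for one starting position; it is an independent illustration, not the authors' calculation.

```python
import random

def simulate_walker(x0, x1, lam, eta, kappa, pd=0.5, rng=random):
    """return the trapping time of one evanescent walker starting at x0
    (trap at x1, trapping rate kappa), or None if it evanesces first."""
    x, t = x0, 0.0
    while True:
        rate = lam + eta + (kappa if x == x1 else 0.0)
        t += rng.expovariate(rate)                 # waiting time at site x
        u = rng.random() * rate
        if u < eta:
            return None                            # walker evanesces
        if x == x1 and u < eta + kappa:
            return t                               # walker is trapped (reacts)
        x += 1 if rng.random() < pd else -1        # jump right or left

times = [simulate_walker(x0=3, x1=0, lam=1.0, eta=0.1, kappa=2.0)
         for _ in range(20000)]
trapped = [t for t in times if t is not None]
print("trapping probability:", len(trapped) / len(times))
print("mean trapping time  :", sum(trapped) / len(trapped))
```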
at short times, the curves are grouped according to the value of the η/λ quotient (we are considering only first order markovian dynamics here), but at long times, this behavior is not observed due to evanescence. 070003-4 papers in physics, vol. 7, art. 070003 (2015) / m. a. ré et al. figure 2: reaction rate as a function of time in dimensionless units. λ is the transition rate between sites in the ctrw. plot a exhibits the influence of bias on the ctrw. plot b is for different values of the quotient η/λ, where η is the evanescence rate. v. discussion and conclusions we have presented a theoretical study of diffusion mediated reactions in an evanescent ctrw on a one dimensional lattice. a finite trapping rate is assumed when a walker reaches the trap position (imperfect trap model in refs. [12, 14, 15]). therefore, in each encounter the reaction will not always occur and the walker may escape from the trap. exact analytical expressions for the reaction rate and the survival probability (probability of no reaction) are obtained in the laplace representation. evanescence modifies the long time behavior of the survival probability: the survival probability does not go to zero at long times as in the usual models. consequently, the minority species fraction that reacts does not reach the asymptotic value of figure 3: f (t), the fraction of minority species that have reacted by time t vs. time in dimensionless units. λ is the transition rate between sites in ctrw. for immortal walkers, the asymptotic limit is 1. in this case, this value is reduced by evanescence. 1 at long times. in this case, the asymptotic value depends on the initial concentration of the majority species and on the ratio η/λ between the evanescence rate and the diffusion rate. finally, we point out that the reaction rate value obtained is not modified if we assume an evanescent trap in the presence of non evanescent walkers. more work along this line, generalizing the present results, is being developed and will be communicated elsewhere. [1] g h weiss, aspects and applications of the random walk, north holland, amsterdam (1994). [2] s a rice, diffusion-controlled reactions in solution, in: comprehensive chemical kinetics, eds. c h bamford, c f h tipper, r g compton, elsevier, amsterdam (1985). [3] n h berg, random walks in biology, princeton university press, princeton (1993). [4] n s goel, n richter-dyn. stochastic models in biology, academic press, new york (1974). [5] j t bendler, m f shlesinger, dielectric relaxation via the montroll-weiss random walk 070003-5 papers in physics, vol. 7, art. 070003 (2015) / m. a. ré et al. of defects, in: the wonderful world of stochastics, eds. m f shlesinger, g h weiss, elsevier, amsterdam (1985). [6] d wang, s gou, d axelrod, reaction rate enhancement by surface diffusion of adsorbates, biophys. chem. 43 117 (1992). [7] w nadler, d l stein, reaction–diffusion description of biological transport processes in general dimension, j. chem. phys. 104 1918 (1996). [8] s b yuste, e abad, k lindenberg, exploration and trapping of mortal random walkers, phys. rev. lett. 110 220603 (2013). [9] e abad, s b yuste, k lindenberg, survival probability of an immobile target in a sea of evanescent diffusive or subdiffusive traps: a fractional equation approach, phys. rev. e 86 061120 (2012). [10] j t chuang, k b eisenthal, studies of excited state charge-transfer interactions with picosecond laser pulses, j. chem. phys. 62 2213 (1975). 
[11] m v smoluchowski, versuch einer mathematischen theorie der koagulationskinetik kolloider lösungen, z. phys. chem. 92 129 (1917). [12] f c collins, g e kimball, diffusion-controlled reaction rates, j. colloid. sci. 4 425 (1949). [13] r m noyes, a treatment of chemical kinetics with special applicability to diffusion controlled reactions, j. chem. phys. 22 1349 (1954). [14] c a condat, defect diffusion and closed-time distributions for ionic channels in cell membranes, phys. rev. a 39 2112 (1989). [15] m a ré, c e budde, diffusion-mediated reactions with a time-dependent absorption rate, phys. rev. e 61, 1110 (2000). [16] m o cáceres, c e budde, m a ré, theory of the absorption probability density of diffusing particles in the presence of a dynamic trap, phys. rev. e 52, 3462 (1995). [17] a j f siegert, on the first passage time probability problem, phys. rev. 81, 617 (1951). 070003-6 papers in physics, vol. 2, art. 020005 (2010) received: 17 july 2010, accepted: 27 september 2010 edited by: d. h. zanette reviewed by: v. m. eguiluz, inst. f́ısica interdisciplinar y sist. complejos, palma de mallorca, spain licence: creative commons attribution 3.0 doi: 10.4279/pip.020005 www.papersinphysics.org issn 1852-4249 stability as a natural selection mechanism on interacting networks juan i. perotti,1, 2∗ orlando v. billoni,1, 2† francisco a. tamarit,1, 2‡ sergio a. cannas1, 2§ biological networks of interacting agents exhibit similar topological properties for a wide range of scales, from cellular to ecological levels, suggesting the existence of a common evolutionary origin. a general evolutionary mechanism based on global stability has been proposed recently [j i perotti, et al., phys. rev. lett. 103, 108701 (2009)]. this mechanism was incorporated into a model of a growing network of interacting agents in which each new agent’s membership in the network is determined by the agent’s effect on the network’s global stability. in this work, we analyze different quantities that characterize the topology of the emerging networks, such as global connectivity, clustering and average nearest neighbors degree, showing that they reproduce scaling behaviors frequently observed in several biological systems. the influence of the stability selection mechanism on the dynamics associated to the resulting network, as well as the interplay between some topological and functional features are also analyzed. i. introduction the concept of networks of interacting agents has proven, in the last decade, to be a powerful tool in the analysis of complex systems (for reviews, see refs. [1-4]). although not new, with the advent of high performance computing, this theoretical construction opened a new door for the statistical physics methodology in the analysis of systems composed by a large number of units that interact in a complicated way. this allowed to get new insights about the dynamical behavior of systems as ∗e-mail: juanpool@gmail.com †e-mail: billoni@famaf.unc.edu.ar ‡e-mail: tamarit@famaf.unc.edu.ar §e-mail: cannas@famaf.unc.edu.ar 1 facultad de matemática, astronomı́a y f́ısica, universidad nacional de córdoba, argentina. 2 instituto de f́ısica enrique gaviola (ifeg-conicet), ciudad universitaria, 5000 córdoba, argentina. complex as biological and social systems. in addition, it constitutes a basic backbone upon which relatively simple models can be constructed in a bottom-up strategy. 
as a modeling tool, the definition of an interaction network for a given system is frequently not unique (see for example the case of protein-protein interaction networks [5-7]), depending on the coarse grain level of the approach. nevertheless, many topological properties appear to be independent of the definition of the network. moreover, some of those properties have emerged in the last years as universal features among systems otherwise considered very different from each other. in particular, the following properties are characteristic of most biological networks. (a) small worldness: all of them exhibit high clustering cc and relatively short path length l, compared with random networks. l is defined as the minimum number of links needed to connect any pair of nodes in the network and cc is defined as the fraction of connections between 020005-1 papers in physics, vol. 2, art. 020005 (2010) / j. i. perotti et al. topological neighbors of any site[1]. (b) scale free degree distribution: the degree distribution p (k) (the probability of a node to be connected to k other ones) presents a broad tail for large values of k. in some cases, the tail can be approached by a power law p (k) ∼ k−γ with degree exponents γ < 3 for a wide range of scales, while in others, a cutoff appears for some maximum degree kmax; in the latter, the degree distribution is generally well described by p (k) ∼ k−γ e−k/kmax [1, 7-12]. in any case, the networks present a nonhomogeneous structure, very different from that expected in a random network. (c) scaling of the clustering coefficient: in many natural networks, it is observed that the clustering coefficient of a node with degree k follows the scaling law cc(k) ∼ k−β , with β taking values close to one. this has been interpreted as an evidence for a modular structure organized in a hierarchical way [13]. (d) disassortative mixing by degree: in most biological networks, highly connected nodes tend to be preferentially connected to nodes with low degree and vice versa [14]. these properties are observed for a wide range of scales, from the microscopic level of genetic, metabolic and proteins networks to the macroscopic level of communities of living beings (ecological networks). such ubiquity suggests the existence of some natural selection process that promotes the development of those particular structures [3]. one possible constraint general enough to act across such a range of scales is the proper stability of the underlying dynamics. growing biological networks involve the coupling of at least two dynamical processes. the first one concerns the addition of new nodes, attached during a slow evolutionary (i.e., species lifetime) process. a second one is the node dynamics which affects and in turn is affected by the growing processes. it is reasonable to expect that the network topologies we finally witness could have emerged out of these coupled processes. consider, for example, the case of an ecological network like a food web, where nodes are species within an ecosystem and edges are consumer-resource relationships between them. new nodes are added during evolutionary time scales, through speciation or migration of new species. then, the network grows through community assembly rules, strongly influenced by the underlying dynamics of species and specific interactions among them [15, 16]. the consequence of adding a new member with a given connectivity affecting a global in/stability, is represented in this case by the aboundance/lack of food [17]. 
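as an aside, the topological properties (a)-(d) listed above are all elementary functions of the adjacency structure, and for any generated network they can be measured with a few lines of networkx. the sketch below computes p(k), cc(k) and the average nearest-neighbor degree; the barabási-albert test graph is only a stand-in, not one of the networks produced by the model discussed below.

```python
import collections
import networkx as nx

g = nx.barabasi_albert_graph(2000, 3, seed=1)   # stand-in scale-free network

deg = dict(g.degree())
clu = nx.clustering(g)
knn = nx.average_neighbor_degree(g)

# group nodes by degree to obtain p(k), cc(k) and knn(k)
by_k = collections.defaultdict(list)
for node, k in deg.items():
    by_k[k].append(node)

n = g.number_of_nodes()
for k in sorted(by_k)[:8]:
    nodes = by_k[k]
    p_k = len(nodes) / n
    cc_k = sum(clu[v] for v in nodes) / len(nodes)
    knn_k = sum(knn[v] for v in nodes) / len(nodes)
    print(f"k={k:3d}  p(k)={p_k:.3f}  cc(k)={cc_k:.3f}  knn(k)={knn_k:.1f}")
```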
notice that each new member may not only result in its own addition to or rejection from the system, but it can also promote avalanches of extinctions amongst existing members. the above ingredients were recently incorporated into a simple model of growing networks under stability constraints [19]. numerical simulations on this model showed that, indeed, complex topology can emerge out of a stability selection pressure. in the present work, we further explore different topological and dynamical properties predicted by the model, whose definition is reviewed in section ii. the results are presented in sections iii. and iv. in section iii., we analyze the topological features that emerge in growing networks under stability constraint. in section iv., we show that this constraint not only induces topological features of the resulting networks but also influences the associated dynamics. a discussion of the results is presented in section v. ii. the model let us consider a system of n interacting agents, whose dynamics is given by a set of differential equations $d\vec{x}/dt = \vec{f}(\vec{x})$, where $\vec{x}$ is an n-component vector describing the relevant state variables of each agent and $\vec{f}$ is an arbitrary nonlinear function. one could imagine that $\vec{x}$ in different systems may represent concentrations of some hormones, the average population densities in a food web, the concentration of chemicals in a biochemical network, the activity of genes in a gene regulation net, etc. we assume that a given agent i interacts only with a limited set of ki < n other agents; thus, fi depends only on the variables belonging to that set. this defines the interaction network. we assume that there are two time scales in the dynamics. let fm be the average frequency of the incoming flux of new agents (migration, mutation, etc.). this defines a characteristic time τm = fm^{-1}. on the long time scale t ≫ τm (much larger than the observation time), new agents arrive to the system and start to interact with some of the previous ones. some of them are incorporated into the system and some are not, so n (and the whole set of differential equations) can change. once a new agent starts to interact with the system, we will assume that the enlarged system evolves towards some stationary state with characteristic relaxation time τrel ≪ τm. then, on the short time scale τrel ≪ t ≪ τm, we can assume that n is constant and that the dynamics has already led the system to a particular stable stationary state $\vec{x}^*$ defined by $\vec{f}(\vec{x}^*) = 0$. following may's ideas [18], we assume that the only attractors of the dynamics are fixed points. nevertheless, the proposed mechanism is expected to work as well for more complex attractors (e.g., limit cycles). the stability of the solution $\vec{x}^*$ is determined by the eigenvalue with maximum real part of the jacobian matrix $a_{i,j} \equiv \left( \partial f_i / \partial x_j \right)_{\vec{x}^*}$ (1). a new agent will be incorporated into the network if its inclusion results in a new stable fixed point, that is, if the values of the interaction matrix ai,j are such that the eigenvalue with maximum real part λm of the enlarged jacobian matrix is negative (λm < 0). assuming that isolated agents reach stable states by themselves after a certain characteristic relaxation time, the diagonal elements ai,i are negative and set to unity in absolute value (ai,i = −1) to further simplify the treatment [18].
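as an aside, this acceptance criterion is straightforward to evaluate numerically. a minimal sketch follows, assuming numpy; the function and variable names are illustrative and not taken from the original code.

import numpy as np

def lambda_max(a):
    """eigenvalue with maximum real part of the interaction matrix a."""
    return float(np.max(np.linalg.eigvals(a).real))

def is_stable(a):
    """stability test of the model: the fixed point defined through the
    jacobian of eq. (1) is stable iff lambda_m < 0."""
    return lambda_max(a) < 0.0

# example: three agents with random couplings in [-b, b] and a_ii = -1
rng = np.random.default_rng(0)
b = 2.0
a = rng.uniform(-b, b, size=(3, 3))
np.fill_diagonal(a, -1.0)          # isolated agents relax by themselves
print(is_stable(a))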
the interaction values, (i.e., the non-diagonal matrix elements ai,j ) will take random values (both positive and negative) taken from some statistical distribution. in this way, we have an unbounded ensemble of systems [18] characterized by a “growing through stability” history. randomness would be self-generated through the addition of new agents processes. each specific set of matrix elements, after addition, defines a particular dynamical system and the subsequent analysis for time scales between successive migrations is purely deterministic. the model is then defined by the following algorithm [19]. at every step, the network can either grow or shrink. in each step, an attempt is made to add a new node to the existing network, starting from a single agent (n = 1). based on the stability criteria already discussed, the attempt can be successful or not. if successful, the agent is accepted, so the existing n × n matrix grows its size by one column and one row. otherwise, the novate agent will have a probability to be deleted together with some other nodes, as further explained below. more specifically, suppose that we have an already created network with n nodes, such that the n × n associated interaction matrix ai,j is stable. then, for the attachment of the (n + 1)th node, we first choose its degree kn+1 randomly between 1 and n with equal probability. then, the new agent interaction with the existing network member i is chosen, such that non-diagonal matrix elements (ai,n+1, an+1,i) (i = 1, . . . , n) are zero with probability 1−kn+1/n and different from zero with probability kn+1/n; to each non–zero matrix element we assign a different real random value uniformly distributed in [−b, b]. b determines the interaction range variability and it is one of the two parameters of the model [20]. then, we calculate numerically λm for the resulting (n + 1) × (n + 1) matrix. if λm < 0, the new node is accepted. if λm > 0, it means that the introduction of the new node destabilized the entire system and we will impose that, the new agent is either eliminated or it remains but produces the extinction of a certain number of previous existing agents. in order to further simplify the numerical treatment, we allow up to q ≤ kn+1 extinctions, taken from the set of kn+1 nodes connected to the new one; q is the other parameter of the model. to choose which nodes are to be eliminated, we first select one with equal probability in the set of kn+1 and remove it. if the resulting n × n matrix is stable, we start a new trial; otherwise, another node among the remaining kn+1 − 1 is chosen and removed, repeating the previous procedure. if after q removals the matrix remains unstable, the new node is removed (we return to the original n × n matrix and start a new trial). the process is repeated until the network reaches a maximum size n = nmax (typically nmax = 200) and restarted m times from n = 1 to obtain statistics of the networks (typically m = 105). iii. topological properties i. connectivity first, we analyzed the average connectivity c(n), defined as the fraction of non-diagonal matrix elements different from zero, averaged over different runs. in fig. 1, we show the typical behavior of 020005-3 papers in physics, vol. 2, art. 020005 (2010) / j. i. perotti et al. c(n) for different values of b (we found that c(n) is completely independent of q). the connectivity presents a power law tail for large values of n. from a fitting of the tail with a power law (see insets in fig. 
1) we obtain the scaling behavior $c(n) \sim \alpha^{-\omega}\, n^{-(1+\varepsilon)}$ (2) for large values of n, where α is the variance of the non-diagonal elements of the stability matrix (α = b²/3 for the uniform distribution) and ω = 0.7 ± 0.1. from the inset of fig. 1, we see that the exponent ε shows a weak dependence on b, taking values in the range (0.1, 0.3). it is interesting to compare eq. (2) with may's stability line for random networks [18], c(n) = (αn)^{-1}. it is easy to see that eq. (2) lies above may's stability line for network sizes up to ∼ 10^6 [21]. this shows that networks growing under a stability constraint develop particular structures whose probability in a completely random ensemble is almost zero. in other words, the associated matrices belong to a subset of the random ensemble with zero measure and are therefore only attainable through a constrained development process. in the next sections, we explore the characteristics of those networks. in fig. 2, we plotted the connectivity for different biological networks across three orders of magnitude in network size, using data collected from the literature. we see that the data are very well fitted by a single power law c(n) ∼ n^{-1.2}, in nice agreement with the average value ε = 0.2 predicted by the present model. it is worth mentioning that the behavior c(n) ∼ n^{-(1+ε)} has also been obtained in a self-organized criticality model of food webs [26]. ii. degree distribution the degree distribution p(k) of the network was analyzed in detail in ref. [19]. we briefly summarize the main results here. in fig. 3, we illustrate the typical behavior of p(k). it presents a power law tail p(k) ∼ k^{-γ} for values of k > 20, with a finite-size drop at k = nmax. the degree exponent γ takes values between 2 and 3 for values of b in the interval b ∈ (1.5, 3.5), and becomes almost independent of q as q increases. the exponent γ can also fall below 2 when the global stability constraint is replaced by a local one. the qualitative structure of p(k) remains when the stability criterion λm < 0 is relaxed to the condition λm < ∆, with ∆ some small positive number. in other words, the power law tail also emerges when the addition of new nodes destabilizes the dynamics, provided that the characteristic time to leave the fixed point, τ = λm^{-1}, is large enough to become comparable to the migration time scale τm [19]. figure 1: connectivity as a function of the network size for q = 3, nmax = 200 and different values of b. the symbols correspond to numerical simulations and the dashed lines to power law fittings of the tails, c(n) = b n^{-(1+ε)}; the insets show the fitted prefactor and the exponent ε as a function of b. figure 2: connectivity as a function of the network size for different biological networks. the straight line is a power law fitting c(n) = a n^{-(1+ε)}, giving an exponent ε = 0.2 ± 0.1 (r² = 0.92). data extracted from: [6, 7] (protein-protein interaction networks); [8] (metabolic networks); [22, 23] (food webs). figure 3: degree distribution p(k) for q = 3, nmax = 200 and different values of b; the dashed lines correspond to power law fittings of the tail p(k) ∼ k^{-γ}. logarithmic binning has been used to smooth the curves. iii. network growth and clustering properties networks grown under the stability constraint also display small-world properties.
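as an aside, the growth rule reviewed in section ii can be condensed into a short script: a candidate node is given a random degree, couplings drawn uniformly from [−b, b], and it is accepted if λm < 0; otherwise up to q of its neighbors are tentatively removed. the following is an illustrative sketch of that rule as summarized in the text, assuming numpy; the names and the small n_max value are ours, not the authors' original code.

import numpy as np

rng = np.random.default_rng(42)

def lambda_max(a):
    """largest real part of the eigenvalues of the interaction matrix."""
    return float(np.max(np.linalg.eigvals(a).real))

def grow_network(n_max=50, b=2.0, q=3):
    """grow an interaction matrix under the stability constraint lambda_m < 0."""
    a = np.array([[-1.0]])                            # start from a single agent
    while a.shape[0] < n_max:
        n = a.shape[0]
        k_new = rng.integers(1, n + 1)                # degree of the candidate node
        neigh = rng.choice(n, size=k_new, replace=False)
        trial = -np.eye(n + 1)                        # diagonal fixed at -1
        trial[:n, :n] = a
        trial[neigh, n] = rng.uniform(-b, b, k_new)   # couplings to/from the new node
        trial[n, neigh] = rng.uniform(-b, b, k_new)
        if lambda_max(trial) < 0:
            a = trial                                 # stable: accept the new node
            continue
        # unstable: try removing up to q of the k_new neighbors ("extinctions")
        removable = list(rng.permutation(neigh))
        removed = []
        while removable and len(removed) < q:
            removed.append(removable.pop())
            reduced = np.delete(np.delete(trial, removed, axis=0), removed, axis=1)
            if lambda_max(reduced) < 0:
                a = reduced
                break
        # if still unstable after q removals, reject the candidate (a is unchanged)
    return a

a = grow_network()
print(a.shape)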
the average clustering coefficient decays with the network size as cc(n) ∼ n^{-0.75} (which is slower than the 1/n decay in a random net), while the average path length l between two nodes increases as l(n) ∼ a ln(n + c) [19]. a similar behavior is observed in the barabási-albert model [1], where the clustering can be approximated by a power law with the same exponent, although the exact scaling is [27] cc(n) ∼ (ln n)²/n (therefore that behavior cannot be excluded in the present model). while this suggests the presence of an underlying preferential attachment mechanism, a detailed analysis has shown that this is not the dominant mechanism [19]. the behavior of cc and l is linked to the selection dynamics ruling which nodes are accepted or rejected. the stability constraint favors nodes with few links, since they modify the stability of the matrix ai,j much less than new nodes with many links (of course this is reflected in the p(k) density). thus, most frequently, the network grows by adding nodes with one or a few links, producing an increase of l and a decrease of cc; sporadically, however, a highly connected node is accepted, decreasing l and increasing cc(n) [19]. those fluctuations lead to a slow diffusive-like growth of the network size, n(t) ∼ t^{1/2} (see fig. 4). figure 4: average network size as a function of the time measured in number of trials for b = 2 and q = 3; the continuous red line corresponds to the power law t^{1/2}. another quantity of interest is the average clustering cc(k) as a function of the degree k. a typical example is shown in fig. 5. we see that cc(k) decreases monotonically with k and displays a power law tail cc(k) ∼ k^{-β} with an exponent β ≈ 0.9, close to one. the exponent appears to be completely independent of b and q. this behavior is indicative of a modular structure with hierarchical organization [13]. notice that this power law decay appears for degrees k > 20, precisely the same range of values for which the degree distribution p(k) displays a power law tail (see subsection ii.). figure 5: average clustering coefficient cc(k) as a function of the degree k for b = 2, q = 3 and different values of nmax; the dashed black line corresponds to the power law k^{-0.9}. iv. mixing by degree patterns to analyze the mixing by degree properties of the networks selected by the stability constraint, we calculated the average degree knn among the nearest neighbors of a node with degree k. in fig. 6, we see that knn decays as a power law knn ∼ k^{-δ} for k > 20, with an exponent δ close to 0.25, a clearly disassortative behavior. this result is also consistent with previous works showing that assortative mixing by degree decreases the stability of a network, i.e., the maximum real part λm of the eigenvalues of random matrices of the type considered here increases faster on assortative networks than on disassortative ones [29]. iv. dynamical properties in the previous section, we analyzed different topological properties that are selected by the stability constraint, i.e., properties associated with the underlying adjacency matrix, regardless of the values of the interaction strengths. we now analyze the characteristics of the dynamics associated with the networks emerging from such a constraint. in other words, we investigate the statistics of the values of the non-null elements aij ≠ 0.
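the quantities examined below (the pair correlation 〈aij aji〉, the fraction of double links 〈η〉 and the fraction of anticorrelated links 〈κ〉) can be extracted from an interaction matrix such as the one returned by the growth sketch above. the helper below is one plausible reading of those definitions, assuming numpy and taking η as the fraction of linked pairs that are doubly linked; it is an illustration, not the authors' code.

import numpy as np

def interaction_statistics(a):
    """pair statistics of the non-null off-diagonal elements of a:
    <a_ij a_ji> over double links, fraction of double links eta,
    and fraction of anticorrelated double links kappa."""
    n = a.shape[0]
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)   # count each pair (i < j) once
    linked = upper & ((a != 0) | (a.T != 0))            # at least one of a_ij, a_ji is non-zero
    double = upper & (a != 0) & (a.T != 0)              # both directions present
    i, j = np.where(double)
    prod = a[i, j] * a[j, i]
    corr = prod.mean() if prod.size else 0.0            # <a_ij a_ji>
    eta = double.sum() / linked.sum() if linked.any() else 0.0
    kappa = (prod < 0).mean() if prod.size else 0.0     # sign(a_ij) = -sign(a_ji)
    return corr, eta, kappa

# usage with the matrix produced by the growth sketch above:
# corr, eta, kappa = interaction_statistics(a)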
first of all, we calculate the probability distribution of values for a single non-null matrix element aij of the final network with size n = nmax. the typical behavior is shown in fig. 7. we see that p(aij) is an even function, almost uniform in the interval [−b, b], with a small cusp around aij = 0. this shows that stability is not enhanced by a particular sign or absolute value of the individual interaction coefficients. figure 6: average nearest-neighbors degree knn(k) of a node with degree k for q = 3 and different values of b and nmax; the dashed line corresponds to the power law k^{-0.25}. figure 7: probability density of the matrix elements aij for b = 2, q = 3 and nmax = 100. it has been shown recently that the presence of anticorrelated links between pairs of nodes (i.e., links between pairs of nodes (i, j) such that sign(aij) = −sign(aji)) significantly enhances the stability of random matrices [31]. in an ecological network, this typically corresponds to a predator-prey or parasite-host interaction. to check for the presence of this type of interaction, we calculated the correlation 〈aij aji〉, where the average is taken over pairs of nodes with a double link (aij ≠ 0 and aji ≠ 0). in fig. 8, we show 〈aij aji〉 as a function of the network size n. we see that this correlation is negative for any value of n and saturates at a value 〈aij aji〉 ≈ −0.65 for large values of n. in the inset of fig. 8, we compare the average fraction of double links 〈η〉 with the corresponding quantity for a completely random network with the same connectivity c(n), that is, a network where all edges are independently distributed with probability p(aij ≠ 0) = c(n). then, the probability of having a link between an arbitrary pair of sites is pd = c(n)(2 − c(n)) and the average degree per node is 〈k〉 = pd n. hence, $\langle \eta \rangle_{ran} = \frac{c^2(n)\, n}{\langle k \rangle} = \frac{c(n)}{2 - c(n)}$ (3). then, for large values of n, we have 〈η〉ran ∼ c(n) ∼ n^{-(1+ε)}. from the inset of fig. 8, we see that 〈η〉 ∼ n^{-0.68} when n ≫ 1 in the present case. the fraction of double links is considerably larger than in a random network. the two results of fig. 8 together show that the present networks indeed have a significantly large number of anticorrelated pair interactions. figure 8: correlation function 〈aij aji〉 between a pair of nodes (i, j) with a double link as a function of the network size, for b = 2 and q = 3. the average was calculated over all pairs of sites with a double link in networks of the same size. the inset shows a comparison between the fraction of double links in the present network, 〈η〉 (green circles), and a random network of the same size and connectivity c(n), 〈η〉ran = c/(2 − c) (continuous line); yellow triangles correspond to a numerical calculation of 〈η〉ran, and the dashed line corresponds to a power law n^{-0.68}. next, we calculated the correlation 〈aij aji〉/〈|aij|〉² between the matrix elements linking a node i and its neighbors j, as a function of its degree ki, where the average is taken only over the double links. from fig. 9, we see that the absolute value of the correlation presents a maximum around ki = 25 and tends to zero as the degree increases. the inset of fig. 9 shows that the average fraction of anticorrelated links 〈κ〉 (i.e., # anticorrelated links / total # double links) tends to 1/2 as the degree increases. we can conclude from these results that the interaction strengths between the hubs and their neighbors are almost uncorrelated. this suggests that the influence of hubs in the stabilization of the dynamics is mainly associated with their topological role (e.g., reduction of the average length l) rather than with the nature of their associated interactions. figure 9: correlation function 〈aij aji〉/〈|aij|〉² between the matrix elements linking a node i and its neighbors j as a function of its degree ki, for b = 2, q = 3 and different values of nmax; the inset shows the average fraction of anticorrelated links 〈κ〉 as a function of ki. v. discussion the recent advances in research on network theory in biological systems have called for a deeper understanding of the relationship between network structure and function, based on evolutionary grounds [3]. in this work, we have shown that a key factor explaining the emergence of many of the complex topological features commonly observed in biological networks could be just the stability of the underlying dynamics. stability can then be considered as an effective fitness acting in all biological situations. the results presented in fig. 2 for the connectivity of real biological networks at different network size scales support this conclusion. in addition, the present approach (although based on a very simple model) allows us to draw some conclusions about the interplay between network structure and function that could be of general applicability. the present results suggest that hubs mainly play a topological role of linking modules (disassortativity, low clustering, uncorrelated links), while poorly connected nodes inside modules enhance stability through the presence of many anticorrelated interactions. the stabilizing effects of some of the topological and functional network features analyzed here have been previously addressed separately (small world [32], disassortative mixing [29, 30], anticorrelated interactions [31]). however, the present analysis suggests that the simultaneous observation of all of them is highly unlikely to be the result of a purely random process. such a delicate balance of specific topological and functional features would only be attainable through a slow, evolutionary stability selection process. in particular, the above scenario agrees very well with the observed structures in cellular networks. for instance, the scaling behavior of cc(k), displayed in fig. 5, has been observed in metabolic [28] and protein [6, 7] networks. disassortative mixing by degree is another ubiquitous property of those systems, and indeed a behavior very similar to that shown in fig. 6 has been observed in certain protein-protein interaction networks [7]. also, the available data for the degree distribution in all those cases are consistent with a power law behavior with an exponent γ between 2 and 2.5 [1]. the agreement with the whole set of properties predicted by the model suggests that stability could be a key evolutionary factor in the development of cellular networks. the situation is a bit different in the case of ecological networks, where the predictions of the model do not completely agree with the observations, especially those related to food webs. on the one hand, food webs usually also display disassortative mixing by degree, modularity and relatively low small-worldness [33] (rather low values of clustering, compared with other biological networks), in agreement with the present predictions.
regarding the scaling behavior of c(n) [24], this is the topic of an old debate in ecology (see dunne’s review in ref. [23] for a summary of the debate). while in general it is expected a power law behavior, the value of the exponent (and the associated interpretations) is controversial, due to the large dispersion of the available data, the rather small range of network sizes available and, in some cases, the low resolution of the data [25]. the consistency of the scaling shown in fig. 2 for a broad range of size scales suggests that the ecological debate should be reconsidered in a broader context of evolutionary growth under stability constraints. on the other hand, the degree distribution of food webs is not always a power law, but it frequently exhibits an exponential cutoff at some maximum characteristic degree kmax [9, 10]. such variance between food webs and other biological networks is probably related to the way ecosystems assemble and evolve compared with other systems. while the hypothesis of the present model are general enough to apply in principle to any biological system, that difference suggests that stability is not enough to explain the observed structures in food webs, but further constraints should be included to account for them. for instance, at least two different (although closely related) constraints are known that can generate a cutoff in the degree distribution: aging and limited capacity of the nodes [34]. in the former, nodes can become inactive with 020005-8 papers in physics, vol. 2, art. 020005 (2010) / j. i. perotti et al. some probability through time (in the sense that they stop interacting with new agents), while in the latter they systematically pay a “cost” every time a new link is established with them, so that they become inactive when some maximum degree is reached. one can easily imagine different situations in which mechanisms of that type become important in the evolution of ecological webs, either by limitations in the available resources or by dynamical changes in the diet of species due to external perturbations (for instance, there are many factors that constrain a predator’s diet; see ref. [9] and references therein for a related discussion). mechanisms of these kind can be easily incorporated into the model, serving as a base for the description of more complex behaviors in particular systems like food webs. finally, it would be interesting to analyze the relationship between dynamical stability in evolving complex networks and synchronization, a topic about which closely related results have been recently published [35]. acknowledgements this work was supported by conicet, universidad nacional de córdoba, and foncyt grant pict-2005 33305 (argentina). we thank useful discussions with p. gleiser and d. r. chialvo. we acknowledge useful comments and criticisms of the referee. [1] r albert, a-l barabási, statistical mechanics of complex networks, rev. mod. phys. 74, 47 (2002). [2] m e j newman, the structure and function of complex networks, siam review 45, 167 (2003). [3] s r proulx, d e l promislow, p c phillips, network thinking in ecology and evolution, trends ecol. evol. 20, 345 (2005). [4] s boccaletti, v latora, y moreno, m chavez, d u hwang, complex networks: structure and dynamics, phys. rep. 424, 175 (2006). [5] n a alves, a s martinez, inferring topological features of proteins from amino acid residue networks, physica a 375, 336 (2007). 
[6] s h yook, z n oltvai, a-l barabási, functional and topological characterization of protein interaction networks, proteomics 4, 928 (2004). [7] v colizza, a flammini, a maritan, a vespignani, characterization and modeling of protein–protein interaction networks, physica a 352, 1 (2005). [8] h jeong, b tombor, r albert, z n oltvai, a-l barabási, the large-scale organization of metabolic networks, nature 407, 651 (2000). [9] j m montoya, s l pimm, r v solé, ecological networks and their fragility, nature 442, 259 (2006). [10] j a dunne, r j williams, n d martinez, food web structure and network theory: the role of connectance and size, p. natl. acad. sci. usa 99, 12917 (2002). [11] o sporns, d r chialvo, m kaiser, c c hilgetag, organization, development and function of complex brain networks, trends cogn. sci. 8, 418 (2004). [12] v m eguiluz, d r chialvo, g a cecchi, m baliki, a v apkarian, scale-free brain functional networks, phys. rev. lett. 94, 018102 (2005). [13] e ravasz, a-l barabási, hierarchical organization in complex networks, phys. rev. e 67, 026112 (2003). [14] m e j newman, mixing patterns in networks, phys. rev. e 67, 026126 (2003). [15] e weiher, p keddy, ecological assembly rules, cambridge u. press, cambridge (1999). [16] s l pimm, the balance of nature, the university of chicago press, chicago (1991). [17] in particular cases, like food-webs, there could be other coupled process almost probably as important as the two considered 020005-9 papers in physics, vol. 2, art. 020005 (2010) / j. i. perotti et al. here, such as the dynamical change in previously existing links weights. for instance, if the abundace of one species is reduced, this could trigger the adaptation of predators to feed on other existing species, thus creating new links not directly related to the incoming species. the present scheme should be considered as a minimal model about stability as an evolutionary constraint. [18] r m may, will a large complex system be stable?, nature 238, 413 (1972). [19] j i perotti, o v billoni, f a tamarit, d r chialvo, s a cannas, emergent selforganized complex network topology out of stability constraints, phys. rev. lett. 103, 108701 (2009). [20] some of the numerical calculations performed in section iii. were repeated replacing the uniform distribution for the non diagonal elements by a gaussian one with the same variance. the results were indistinguishable. [21] an extrapolation of the present results suggest that the probability of growing goes to zero above some maximum network size. preliminary estimations give a value ≈ 105 for such maximum size and therefore all the obatined networks would be above may’s stability line. this topic is the subject of present investigations and the corresponding results will be published in the near future. [22] c melia, j bascompte, food web cohesion, ecology 85, 352 (2004). [23] j a dunne, food webs, in: encyclopedia of complexity and systems science, ed. r. a. meyers, pag. 3661, springer, new york (2009). [24] the quantity usually considered in ecology is the connectance, defined as the number of links in the web divided by n2. it is proportional to the connectivity c(n), with a proportionality factor ≈ 1/2. [25] for instance, a power law fitting of the food web data of fig. 2 alone gives a smaller exponent with a relative error of about 50% and correlation coefficient r2 = 0.5. [26] r v solé, d alonso, a mckane, physica a 286, 337 (2000). [27] k klemm, v m eguiluz, growing scale-free networks with small-world behavior, phys. 
rev. e 65, 057102 (2002). [28] e ravasz, a l somera, d a mongru, z n oltvai, a-l barabási, hierarchical organization of modularity in metabolic networks, science 297, 1551 (2002). [29] m brede, s sinha, assortative mixing by degree makes a network more unstable, arxiv:cond-mat/0507710 (2005). [30] s sinha, s sinha, robust emergent activity in dynamical networks, phys. rev. e 74, 066117 (2006). [31] s allesina, m pascual, network structure, predator–prey modules, and stability in large food webs, theor. ecol. 1, 55 (2008). [32] s sinha, complexity vs. stability in small– world networks, physica a 346, 147 (2005). [33] j a dunne, the network structure of food webs, in: ecological networks: linking structure to dynamics in food webs, eds. m. pascual, j a dunne, pag. 27, oxford university press, oxford (2006). [34] l a n amaral, a scala, m barthélémy, h e stanley, p. natl. acad. sci. usa 97, 11149 (2000). [35] t nishikawa, a motter, network synchronization landscape reveals compensatory structures, quantization, and the positive effect of negative interactions, p. natl. acad. sci. usa 107, 10342 (2010). 020005-10 papers in physics, vol. 7, art. 070006 (2015) received: 20 november 2014, accepted: 1 april 2015 edited by: c. a. condat, g. j. sibona licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070006 www.papersinphysics.org issn 1852-4249 noise versus chaos in a causal fisher-shannon plane osvaldo a. rosso,1, 2∗ felipe olivares,3 angelo plastino4 we revisit the fisher-shannon representation plane h×f, evaluated using the bandt and pompe recipe to assign a probability distribution to a time series. several stochastic dynamical (noises with f−k, k ≥ 0, power spectrum) and chaotic processes (27 chaotic maps) are analyzed so as to illustrate the approach. our main achievement is uncovering the informational properties of the planar location. i. introduction temporal sequences of measurements (or observations), that is, time-series (ts), are the basic elements for investigating natural phenomena. from ts, one should judiciously extract information on dynamical systems. those ts arising from chaotic systems share with those generated by stochastic processes several properties that make them very similar: (1) a wide-band power spectrum (ps), (2) a delta-like autocorrelation function, (3) irregular behavior of the measured signals, etc. now, irregular and apparently unpredictable behavior is often observed in natural ts, which makes interesting the establishment of whether the underlying dynamical process is of either deterministic or stochastic character in order to i) model the associated phenomenon and ii) determine which are the relevant ∗email: oarosso@gmail.com 1 insitituto tecnológico de buenos aires, av. eduardo madero 399, c1106acd ciudad autónoma de buenos aires, argentina. 2 instituto de f́ısica, universidade federal de alagoas, maceió, alagoas, brazil. 3 departamento de f́ısica, facultad de ciencias exactas, universidad nacional de la plata, la plata, argentina. 4 instituto de f́ısica, iflp-cct, universidad nacional de la plata, la plata, argentina. quantifiers. chaotic systems display “sensitivity to initial conditions” and lead to non-periodic motion (chaotic time series). long-term unpredictability arises despite the deterministic character of the trajectories (two neighboring points in the phase space move away exponentially rapidly). let x1(t) and x2(t) be two such points, located within a ball of radius r at time t. 
further, assume that these two points cannot be resolved within the ball due to poor instrumental resolution. at some later time t′, the distance between the points will typically grow to |x1(t′) −x2(t′)| ≈ |x1(t) −x2(t)| exp(λ |t′− t|), with λ > 0 for a chaotic dynamics, λ the largest lyapunov exponent. when this distance at time t′ exceeds r, the points become experimentally distinguishable. this implies that instability reveals some information about the phase space population that was not available at earlier times [1]. one can then think of chaos as an information source. the associated rate of generated information can be cast in precise fashion via the kolmogorov-sinai’s entropy [2, 3]. one question often emerges: is the system chaotic (low-dimensional deterministic) or stochastic? if one is able to show that the system is dominated by low-dimensional deterministic chaos, then only few (nonlinear and collective) modes are required to describe the pertinent dynamics [4]. if 070006-1 papers in physics, vol. 7, art. 070006 (2015) / o. a. rosso et al. not, then the complex behavior could be modeled by a system dominated by a very large number of excited modes which are in general better described by stochastic or statistical approaches. several methodologies for evaluation of lyapunov exponents and kolmogorov-sinai entropies for time-series’ analysis have been proposed (see ref. [5]), but their applicability involves taking into account constraints (stationarity, time series length, parameters values election for the methodology, etc.) which in general make the ensuing results non-conclusive. thus, one wishes for new tools able to distinguish chaos (determinism) from noise (stochastic) and this leads to our present interest in the computation of quantifiers based on information theory, for instance, “entropy”, “statistical complexity”, “fisher information”, etc. these quantifiers can be used to detect determinism in time series [6–11]. different information theory based measures (normalized shannon entropy, statistical complexity, fisher information) allow for a better distinction between deterministic chaotic and stochastic dynamics whenever “causal” information is incorporated via the bandt and pompe’s (bp) methodology [12]. for a review of bp’s methodology and its applications to physics, biomedical and econophysic signals, see [13]. here we revisit, for the purposes previously detailed, the so-called causality fisher–shannon entropy plane, h×f [14], which allows to quantify the global versus local characteristic of the time series generated by the dynamical process under study. the two functionals h and f are evaluated using the bandt and pompe permutation approach. several stochastic dynamics (noises with f−k, k ≥ 0, power spectrum) and chaotic processes (27 chaotic maps) are analyzed so as to illustrate the methodology. we will encounter that significant information is provided by the planar location. ii. shannon entropy and fisher information measure given a continuous probability distribution function (pdf) f(x) with x ∈ ∆ ⊂ r and ∫ ∆ f(x) dx = 1, its associated shannon entropy s [15] is s[f] = − ∫ ∆ f ln(f) dx , (1) a measure of “global character” that is not too sensitive to strong changes in the distribution taking place on a small-sized region. such is not the case with fisher’s information measure (fim) f [16,17], which constitutes a measure of the gradient content of the distribution f(x), thus being quite sensitive even to tiny localized perturbations. 
it reads $F[f] = \int_\Delta \frac{1}{f(x)} \left[ \frac{df(x)}{dx} \right]^2 dx = 4 \int_\Delta \left[ \frac{d\psi(x)}{dx} \right]^2 dx$ (2). fim can be variously interpreted as a measure of the ability to estimate a parameter, as the amount of information that can be extracted from a set of measurements, and also as a measure of the state of disorder of a system or phenomenon [17]. in the previous definition of fim (eq. (2)), the division by f(x) is not convenient if f(x) → 0 at certain x-values. we avoid this if we work with real probability amplitudes f(x) = ψ²(x) [16, 17], which gives a simpler form (no divisors) and shows that fim simply measures the gradient content in ψ(x). the gradient operator significantly influences the contribution of minute local f-variations to fim's value. accordingly, this quantifier is called a "local" one [17]. let now $P = \{p_i;\ i = 1, \cdots, N\}$ be a discrete probability distribution, with N the number of possible states of the system under study. the concomitant problem of information loss due to discretization has been thoroughly studied and, in particular, it entails the loss of fim's shift-invariance, which is of no importance for our present purposes [10, 11]. in the discrete case, we define a "normalized" shannon entropy as $H[P] = \frac{S[P]}{S_{max}} = \frac{1}{S_{max}} \left\{ - \sum_{i=1}^{N} p_i \ln(p_i) \right\}$ (3), where the denominator $S_{max} = S[P_e] = \ln N$ is that attained by a uniform probability distribution $P_e = \{p_i = 1/N,\ \forall i = 1, \cdots, N\}$. for the fim, we take the expression in terms of real probability amplitudes as the starting point; then a discrete normalized fim convenient for our present purposes is given by $F[P] = F_0 \sum_{i=1}^{N-1} \left[ (p_{i+1})^{1/2} - (p_i)^{1/2} \right]^2$ (4). it has been extensively discussed that this discretization is the best behaved in a discrete environment [18]. here, the normalization constant $F_0$ reads $F_0 = 1$ if $p_{i^*} = 1$ for $i^* = 1$ or $i^* = N$ and $p_i = 0\ \forall i \neq i^*$, and $F_0 = 1/2$ otherwise (5). if our system lies in a very ordered state, which occurs when almost all the $p_i$ values are zero, we have a normalized shannon entropy h ∼ 0 and a normalized fisher's information measure f ∼ 1. on the other hand, when the system under study is represented by a very disordered state, that is, when all the $p_i$ values oscillate around the same value, we obtain h ∼ 1 while f ∼ 0. one can state that the general fim behavior of the present discrete version (eq. (4)) is opposite to that of the shannon entropy, except for periodic motions [10, 11]. the local sensitivity of fim for discrete pdfs is reflected in the fact that the specific i-ordering of the discrete values $p_i$ must be seriously taken into account in evaluating the sum in eq. (4). this point was extensively discussed by us in previous works [10, 11]. the summands can be regarded as a kind of "distance" between two contiguous probabilities. thus, a different ordering of the pertinent summands would lead to a different fim value, hence its local nature. in the present work, we follow the lexicographic order described by lehmer [22] in the generation of the bandt-pompe pdf. iii. description of our chaotic and stochastic systems here we study both chaotic and stochastic systems, selected as illustrative examples of different classes of signals, namely, (a) 27 chaotic dynamic maps [9, 19] and (b) truly stochastic processes, noises with f^{-k} power spectrum [9]. i. chaotic maps in the present work, we consider 27 chaotic maps described by j. c. sprott in the appendix of his book [19].
these chaotic maps are grouped as a) noninvertible maps: (1) logistic map; (2) sine map; (3) tent map; (4) linear congruential generator; (5) cubic map; (6) ricker's population model; (7) gauss map; (8) cusp map; (9) pinchers map; (10) spence map; (11) sine-circle map; b) dissipative maps: (12) hénon map; (13) lozi map; (14) delayed logistic map; (15) tinkerbell map; (16) burgers' map; (17) holmes cubic map; (18) dissipative standard map; (19) ikeda map; (20) sinai map; (21) discrete predator-prey map; c) conservative maps: (22) chirikov standard map; (23) hénon area-preserving quadratic map; (24) arnold's cat map; (25) gingerbreadman map; (26) chaotic web map; (27) lorenz three-dimensional chaotic map. even though the present list of chaotic maps is not exhaustive, it can be taken as representative of common chaotic systems [19]. ii. noises with f^{-k} power spectrum the corresponding time series are generated as follows [20]: 1) using the mersenne twister generator [21] through the matlab rand function, we generate pseudo-random numbers $y_i^0$ in the interval (−0.5, 0.5) with (a) an almost flat power spectrum (ps), (b) a uniform pdf, and (c) zero mean value. 2) then, the fast fourier transform (fft) $y_i^1$ is obtained and multiplied by $f^{-k/2}$, yielding $y_i^2$. 3) now, $y_i^2$ is symmetrized so as to obtain a real function. the pertinent inverse fft is now at our disposal, after discarding the small imaginary components produced by the numerical approximations. the resulting time series η(k) exhibits the desired power spectrum and, by construction, is representative of non-gaussian noises. iv. results and discussion for all chaotic maps, we took (see section i.) the same initial conditions and parameter values detailed by sprott. the corresponding initial values are given in the basin of attraction or near the attractor for the dissipative systems, or in the chaotic sea for the conservative systems [19]. for each map's ts, we discarded the first 10^5 iterations and, after that, n = 10^7 iterations-data were generated. figure 1: localization in the causality fisher-shannon plane of the 27 chaotic maps considered in the present work. the bandt-pompe pdf was evaluated following the lexicographic order [22] and considering d = 6 (pattern length), τ = 1 (time lag) and time series length n = 10^7 data (initial conditions given by sprott [19]). the inside numbers represent the corresponding chaotic map enumerated at the beginning of section i. the letters "x" and "y" indicate the map coordinate generating the time series, for those maps whose planar representations are clearly distinguishable. the open circle-dash line represents the planar localization (average values over ten realizations with different seeds) for the stochastic processes: noises with f^{-k} power spectrum. stochastic dynamics represented by time series of noises with f^{-k} power spectrum (0 ≤ k ≤ 3.5 and ∆k = 0.25) were considered. for each value of k, ten series with different seeds and total length n = 10^6 data were generated (see section ii.), and their corresponding average values were reported for uncorrelated (k = 0) and correlated (k > 0) noises.
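to make the recipe concrete, the following sketch (assuming numpy; the helper names and the small series length are ours) builds a bandt-pompe pdf with embedding dimension d and time lag τ = 1, evaluates the normalized shannon entropy h of eq. (3) and the discrete fisher measure f of eqs. (4)-(5), and applies them to a short f^{-k} noise generated along the lines of section ii. it is a minimal reading of the published recipe, not the authors' code; in particular, the lexicographic indexing of the ordinal patterns is one common convention.

import numpy as np
from itertools import permutations

def bandt_pompe_pdf(ts, d=6, tau=1):
    """ordinal-pattern (bandt-pompe) probability distribution of a 1d series,
    with embedding dimension d and time lag tau; patterns indexed lexicographically."""
    index = {p: i for i, p in enumerate(permutations(range(d)))}
    counts = np.zeros(len(index))
    for t in range(len(ts) - (d - 1) * tau):
        window = ts[t:t + d * tau:tau]
        counts[index[tuple(np.argsort(window, kind="stable"))]] += 1
    return counts / counts.sum()

def shannon_h(p):
    """normalized shannon entropy, eq. (3)."""
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum() / np.log(len(p)))

def fisher_f(p):
    """discrete normalized fisher information measure, eqs. (4)-(5)."""
    f0 = 1.0 if (p[0] == 1.0 or p[-1] == 1.0) else 0.5   # edge cases of eq. (5)
    return float(f0 * np.sum((np.sqrt(p[1:]) - np.sqrt(p[:-1])) ** 2))

def fk_noise(n, k, rng):
    """noise with f^{-k} power spectrum: filter uniform white noise in fourier
    space (an rfft shortcut to the symmetrization step described in the text)."""
    spec = np.fft.rfft(rng.uniform(-0.5, 0.5, n))
    freq = np.fft.rfftfreq(n)
    freq[0] = freq[1]                  # avoid dividing by zero at f = 0
    spec *= freq ** (-k / 2.0)
    return np.fft.irfft(spec, n)

rng = np.random.default_rng(0)
x = fk_noise(2 ** 14, k=2.0, rng=rng)  # short series, for illustration only
p = bandt_pompe_pdf(x, d=4)            # d = 4 keeps the example fast
print(shannon_h(p), fisher_f(p))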
the bp-pdf was evaluated for each ts of n data, stochastic and chaotic, following the lexicographic pattern-order proposed by lehmer [22], with pattern-lengths d = 6 and time lag τ = 1. their corresponding localization in the causality fisher-shannon plane are shown in fig. 1. one can use any of these ts for evaluating the dynamical system’s invariants (like correlation dimension, lyapunov exponents, etc.), by appealing to a time lag reconstruction [19]. here we analyzed ts generated by each one of chaotic maps’ coordinates when the corresponding map is bior multi-dimensional. due to the fact that the bp-pdf is not a dynamical invariant (neither are other quantifiers derived by information theory), some variation could be expected in the quantifiers’ values computed with this pdf, whenever one or other of the ts generated by these multidimensional coordinate systems. from fig. 1, we clearly see that the chaotic maps under study are localized mainly at entropic region lying between 0.35 and 0.9, and reach fim values from 0.4 to almost 1. a second group of chaotic maps, constituted by: the gauss map (7), linear congruential generator (4), dissipative standard map (18), sinai map (20) and arnold’s cat map, is localized near the right-lower corner of the h×f plane, that is in the range 0.95 ≤ h ≤ 1.0 and 0 ≤ f ≤ 0.3. their localization could be understood if one takes into account that when a 2d-graphical representation of them (i.e., a graph xn ×xn+1 for one dimensional maps, or xn ×yn for two dimensional maps) it tends to fulfill the space, resembling the behavior of stochastic dynamics. however, they are chaotic and present a clear structure when the dynamics are represented in higher dimensional plane. noises with f−k power spectrum (with 0 ≤ k ≤ 5) exhibit a wide range of entropic values (0.1 ≤ h ≤ 1) and fim values lying between 0 ≤ f ≤ 0.5. a smooth transition in the planar location is observed in the passage from uncorrelated noise (k = 0 with h ∼ 1 and f ∼ 0) to correlated one (k > 0). the correlation degree grows as the k value increases. from fig. 1 we gather that, for stochastic time series with increasing correlation-degree, the associated entropic values h decrease, while fisher’s values f increase. taking into account that other stochastic processes, like fbm and fgn (not shown), present a quite close behavior to the k-noise analyzed here (see ref. [11]), we can think that the open circle-dash line represents a division of the plane; above this line all the chaotic maps are localized. it is also interesting to note that, qualitatively, the same results are obtained when the evaluations where made with pattern length d = 4 and d = 5, as well as, differ070006-4 papers in physics, vol. 7, art. 070006 (2015) / o. a. rosso et al. ent fisher information measure discretization are used. summing up, we have presented an extensive series of numerical simulations/computations and have contrasted the characterizations of deterministic chaotic and noisy-stochastic dynamics, as represented by time series of finite length. surprisingly enough, one just has to look at the different planar locations of our two dynamical regimes. the planar location is able to tell us whether we deal with chaotic or stochastic time series. acknowledgements o. a. rosso and a. plastino were supported by consejo nacional de investigaciones cient́ıficas y técnicas (conicet), argentina. o. a. rosso acknowledges support as a fapeal fellow, brazil. f. 
olivares is supported by departamento de f́ısica, facultad de ciencias exactas, universidad nacional de la plata, argentina. [1] h d i abarbanel, analysis of observed chaotic data, springer-verlag, new york (1996). [2] a n kolmogorov, a new metric invariant for transitive dynamical systems and automorphisms in lebesgue sapces, dokl. akad. nauk. (ussr) 119, 861 (1959). [3] y g sinai, on the concept of entropy for a dynamical system, dokl. akad. nauk. (ussr) 124, 768 (1959). [4] a r osborne, a provenzale, finite correlation dimension for stochastic systems with powerlaw spectra, physica d 35, 357 (1989). [5] h kantz, t scheiber, nonlinear time series analysis, cambridge university press, cambridge, uk (2002). [6] o a rosso, h a larrondo, m t mart́ın, a plastino, m a fuentes, distinguishing noise from chaos, phys. rev. lett. 99, 154102 (2007). [7] o a rosso, l c carpi, p m saco, m gómez ravetti, a plastino, h a larrondo, causality and the entropy-complexity plane: robustness and missing ordinal patters, physica a 391, 42 (2012). [8] o a rosso, l c carpi, p m saco, m gómez ravetti, h a larrondo, a plastino, the amigó paradigm of forbidden/missing patterns: a detailed analysis, eur. phys. j. b 85, 419 (2012). [9] o a rosso, f olivares, l zunino, l de micco, a l l aquino, a plastino, h a larrondo, characterization of chaotic maps using the permutation bandt–pompe probability distribution, eur. phys. j. b 86, 116 (2013). [10] f olivares, a plastino, o a rosso, ambiguities in bandt–pompe’s methodology for local entropic quantifiers, physica a, 391, 2518 (2012). [11] f olivares, a plastino, o a rosso, contrasting chaos with noise via local versus global information quantifiers, phys. lett a 376, 1577 (2012). [12] c bandt, b pompe, permutation entropy: a natural complexity measure for time series, phys. rev. lett. 88, 174102 (2002). [13] m zanin, l zunino, o a rosso, d papo, permutation entropy and its main biomedical and econophysics applications: a review, entropy 14, 1553 (2012). [14] c vignat, j f bercher, analysis of signals in the fisher-shannon information plane, phys. lett. a 312, 27 (2003). [15] c shannon, w weaver, the mathematical theory of communication, university of illinois press, champaign, usa (1949). [16] r a fisher, on the mathematical foundations of theoretical statistics, philos. trans. r. soc. lond. ser. a 222, 309 (1922). [17] b r frieden, science from fisher information: a unification, cambridge university press, cambridge, uk (2004). [18] p sánchez-moreno, r j yánẽz, j s dehesa, discrete densities and fisher information, in: proceedings of the 14th international conference on difference equations and applications, eds. m. bohner, et al., pag. 291, 070006-5 papers in physics, vol. 7, art. 070006 (2015) / o. a. rosso et al. uğurbahçeşehir university publishing company, istanbul, turkey (2009). [19] j c sprott, chaos and time series analysis, oxford university press, new york, usa (2003). [20] h a larrondo, matab program: noisefk.m (http://www.mathworks.com/matlabcentral/ fileexchange/35381) (2012). [21] m matsumoto, t nishimura, mersenne twister: a 623-dimensionally uniform pseudorandom number gererator, acm t. model. comput. s. 8, 3 (1998). [22] http://www.keithschwarz.com/interesting/ code/factoradic-permutation/factoradic permutation.hh.html 070006-6 papers in physics, vol. 5, art. 050009 (2013) received: 30 october 2013, accepted: 4 november 2013 edited by: s. a. 
grigera licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050009 www.papersinphysics.org issn 1852-4249 reply to the commentary on “graphite and its hidden superconductivity” p. esquinazi1∗ i appreciate very much the time our colleague forgan took to read my manuscript and write his comment on the exposed physics and interpretation of experimental results. it is surely not easy to find somebody that provides such a detailed report. there is no doubt that the subject of the manuscript remains highly controversial and most of the community does not simply believe in the existence of high or low temperature superconductivity in non-intercalated graphite. after working for nearly 13 years with this material, i had the opportunity to deal with all kind of transport, magnetic and band structure data and revise part of the interesting history of this material. my personal experience related to the defect-induced magnetic order in graphite discovered more than 10 years ago (a phenomenon that one finds nowadays in a large number of compounds) and the (over)skepticism the whole community had at that time showed me (once more) that in natural sciences one should not always accept the opinion of the majority. before replying to forgan’s comment, i would like to tell you a short story. six years ago and after independent colleagues proved using different experimental methods that one can have magnetic order in graphite due to defects or non-magnetic ∗e-mail: esquin@physik.uni-leipzig.de 1 division of superconductivity and magnetism, institute for experimental physics ii, fakultät für physik und geowissenschaften, universität leipzig, linnéstrasse 5, d04103 leipzig, germany. ad-atoms like hydrogen, a speculative explanation on the magnetic data published in the year 2000 [1] (59), i decided that it was time to check whether the claim of superconductivity in graphite at extraordinarily high temperature could be real or not. the first unexpected hint came by chance, when i took one hopg sample from advanced ceramics company and asked the responsible of a dual beam microscope to take a look at the interior with the low-energy transmission electron microscope option that machine had. what we saw at that time were well defined quasi two-dimensional interfaces inside the samples. i had no idea about the origin of those interfaces and whether they could influence the transport properties or not. years later i found that these interfaces had been recognized before [2] (8) but nobody apparently payed attention on their possible influence on measurable properties. it was the systematic change in the absolute value of the resistivity as well as in its temperature dependence with the thickness of that kind of hopg samples [3,4] (7,10) that helped me clarify some inconsistencies found in the graphite literature and provided us with a hint of where superconductivity, if at all, could be hidden. in the reply, i will try to describe and emphasize important details of the experimental evidence that were not, apparently, taken into account or simply just overseen in forgan’s comment. i reply to forgan’s comment in the order of appearance, copying at the beginning of each issue part of the corresponding paragraph to help the reader. with 050009-1 papers in physics, vol. 5, art. 050009 (2013) / p. esquinazi the same purpose, i have included citations at the end of the reply. 
in case the citation was already in the manuscript i wrote it with its corresponding number in parentheses, at least the first time i cite it. (1) “this bulk property and many others, such as the de haas van alphen effect in large samples [1] have been understood in general terms [2] as a consequence of a semi-metallic band-structure [3] since 1960.” reply: i believe the graphite story is an example that shows that sometimes a “democratic” experimental fact and its possible “democratic interpretation” do not assure correctness. in this case, such a “democratic consent” can have a rather negative influence in the development of science, more dangerous than a wrong interpretation or an experiment carried out incorrectly. the cited de haas van alphen as well as the schubnikov de haas (sdh) oscillations in the magnetoresistance of graphite were taken as evidence for the existence of a finite fermi surface and understood in terms of an anisotropic 3d band structure with coupling constants between c-atoms obtained from fitting experimental data. those data, e.g., those sdh oscillations, were assumed to represent the ideal defect-free graphite structure. however, systematically done measurements on thin graphite flakes [5] (11), as well as the influence of irradiation in the sdh oscillations [6] (9), already indicate that the oscillations found in some bulk samples do not correspond to ideal graphite. graphite samples, independently of their size but without interfaces, do not show any oscillations in the magnetoresistance (even in its derivative) at low temperatures. for such samples, these sdh oscillations can be observed after producing defects (i.e., after increasing the carrier density in some regions of the sample) [6] or by applying a large enough electric field [7]. (2) “i now turn to the various sections of the paper. in section ii, there is an account of strong magnetoresistance effects. similar effects have also been observed in bismuth [4] and have a very interesting explanation [4] in terms of the semi-metallic properties of graphite and bismuth, so there is no need to propose a superconducting explanation for this.” reply: one can start arguing against the explanation of the (magnetic field driven) metalinsulator-transition (mit) proposed in that paper [8] (26) in forgan’s report taking into account the large (about 6) number of free parameters and the inconsistency of the proposed explanation with the nearly linear field dependence of the resistance observed at low temperatures in ordered graphite. on the other hand, one could argue that those critical points are of rather technical nature that do not touch the main idea of the proposed explanation. however, there is a more serious problem with the “interesting explanation” based on the assumed electronic band structure of bernal graphite: as i wrote in my review, the mit is not observed when the graphite sample has no interfaces. that is to say, the “democratically observed” metallic-like temperature dependence of graphite samples is simply not intrinsic of ideal graphite [3, 4, 9] (7,10,12). therefore, the explanation used in [8] to understand apparent “intrinsic properties” of bernal ideal graphite is actually not applicable. the mit as well as the metallic-like temperature dependence of graphite samples is related to the existence of interfaces, or other lattice defects, and this matches the forthcoming explanations of other effects in the review. 
if one does not realize or accept this fact discussed at the beginning of the review, one loses an important part of evidence. regarding the mit found also in bismuth in that paper [8]: it is important to remark that in that work [8], no information about the internal structure of the measured bi sample was given and whether interfaces were there or not. this is important because, as graphite, interfaces in bi samples can have superconducting properties at temperatures even above 10 k [10–14] (-,-,90,-,91), although pure bi bulk is not a superconductor. this interface effect in both semimetals seems to be more than a simple coincidence. (3) “in section iii, a tiny hysteresis in magnetoresistance is described. two comments are relevant here: the author notes that the sign of the hysteresis is opposite to that expected for a superconductor and limits himself to stating that the data provide “striking hints that granular superconductivity is at work in some regions of these samples”. this is hardly definitive proof.” reply: certainly, a definitive proof for the existence of granular superconductivity only through an anomalous magnetoresistance hysteresis loop is not. however, if we put all the pieces of the puzzle together, the proposed interpretation does not 050009-2 papers in physics, vol. 5, art. 050009 (2013) / p. esquinazi appear like a simple coincidence. the evidence of an anomalous hysteresis in the magnetoresistance measured in graphite flakes and its enhancement with constrictions leave not so many possibilities of interpretation. note that the hysteresis appears in the temperature region where the resistance shows a metallic-like temperature dependence. coincidentally, it is in this temperature region where we observe the field driven mit. this experiment can be easily reproduced in all graphite flakes that show such a maximum in the resistance vs. temperature, if one has a measurement system with a 10−5 relative error in the measurement of the resistance and good temperature stability. a hysteresis in the magnetoresistance is evidence for magnetic entities that remain pinned or for example, magnetic anisotropy observed in ferromagnets. typical ferromagnetic hysteresis loops in the magnetoresistance have been observed in graphite flakes [15] (46) as well as in bulk graphite after proton irradiation [16] (22). these facts do not speak for an origin of the anomalous hysteresis observed in [17, 18] (38,42) in terms of ferromagnetism. similar anomalous hysteresis in the magnetoresistance were observed in granular superconductors [19–21] (39,40,41) and explained in terms of josephson-coupled superconducting grains [19]. (4) “ section iv is headed “direct evidence for josephson behaviour”. this quotes data ... and the fact that magnetic fields could increase, decrease or have no effect on the voltages observed, also cast great doubt on the josephson interpretation.” reply: indeed, the currents used here were small simply because high currents shift the observed transitions systematically to lower temperatures [22] (45) and this fact should be taken as evidence in the direction of superconductivity, actually. it is also correct that no measurement of a strictly zero resistance state has been shown in [22] just because in such measurements and due to the finite sensitivity one cannot measure zero resistance, this is actually obvious. however, the estimate of the minimum measured resistance given in the comment is not quite correct. 
depending on the sample and at low enough temperature, the voltage noise around zero voltage measured at 1 µa is ±5 nv, i.e., ±5 mΩ for samples with a resistance at high temperature of the order of 100 Ω. it should be clear that, depending on the josephson coupling and on the characteristics of the superconducting patches at the interfaces, thermal fluctuations may affect the zero average value. to verify that a zero-resistance state is possible, one needs to show that currents persist for a sufficiently long time, by measuring, for example, the magnetic moment of a ring in which a superconducting current flows, as has been done recently using graphite flakes embedded in alkanes [23] (94). hopefully, new experiments in this direction will clarify the situation. such sharp transitions in the measured voltage vs. temperature do not appear to be possible simply from bad contacts or anisotropic current distributions, in particular when the whole i−v behavior is compatible with the one expected from josephson-coupled regions within the interfaces. an anisotropic current distribution can indeed exist in the graphite lamellae, especially when the contacts are localized at the corners of the lamellae, i.e., in a van der pauw configuration, as has been clearly stated in the review and in [22] (45). in this case, the simple model used to fit the i−v curves takes the anisotropic arrangement explicitly into account. the negative resistance behaviour was observed only in that case but not for the usual linear electrode arrangement [22], as expected. it seems to be more than a simple coincidence that the same equation, with only the critical josephson current as a free parameter, is sufficient to interpret the i−v curves measured in very different configurations. finally, one can convince oneself of the relationship between interfaces and the observed transitions simply by measuring a lamella without interfaces but with the same anisotropy as graphite and with similar contacts. forgan correctly pointed out the striking magnetic field effects on the i−v characteristics and on v(t) at constant current. the effect of a magnetic field on the i−v characteristics is as expected only for large (thick) samples and in the same field region (a few koe) where the field suppresses the metallic-like behaviour, i.e., the mit. at fields higher than several tesla, one observes a reentrance in the i−v curves at low currents, i.e., the resistance starts to decrease with field [22]. this effect in the resistance had already been reported in 2003 [24] (48) and its interpretation remains open. again, this effect is related to the existence of the interfaces and it does not appear to be a simple artefact of an anisotropic current distribution or of contact problems. (5) “in response to [6], a colleague repeated their measurements as an undergraduate project [7]. their clear conclusion was that if the correct diamagnetic background slope (that obtained at large fields) is subtracted, then the hysteresis corresponds to a tiny ferromagnetic component. however, if a slightly different background is chosen, the hysteresis loops look somewhat like the response of a granular superconductor.” reply: the arbitrariness of the diamagnetic background subtraction has been taken into account in the description of the results in [25, 26] (58,29) and, indeed, the appearance of the hysteresis can be changed by subtracting different backgrounds.
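the quoted noise floor follows from a one-line estimate; as a sketch of the arithmetic, using only the numbers given above (bias current, voltage noise and the high-temperature resistance):

```latex
% resistance resolution set by the voltage noise \delta v = 5\,\mathrm{nV} at i = 1\,\mu\mathrm{A}
r_{\min} \simeq \frac{\delta v}{i}
        = \frac{5\times10^{-9}\,\mathrm{V}}{1\times10^{-6}\,\mathrm{A}}
        = 5\,\mathrm{m\Omega},
\qquad
\frac{r_{\min}}{r(\text{high }t)} \sim \frac{5\,\mathrm{m\Omega}}{100\,\Omega} = 5\times10^{-5}.
```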
although i do not know the results of the undergraduate project, i would like to emphasize a few details and the differences between the results in [25, 26] and those one expects from a simple ferromagnetic response. there are two methods one can use, without the need for any background subtraction, to obtain the true magnetic response of the sample. the first one is to measure the remanent magnetic moment (i.e., at zero field) after cycling the sample to a maximum field strength, see fig. 3 in [25] or fig. 10 in [26]. in general, for a ferromagnetic sample, the measurements of minor loops at low fields yield a remanence m_r that increases following the rayleigh law, i.e., m_r ∝ h_max^n, with h_max the maximum applied field strength and n ∼ 1...1.5, see [27]. the measurements done on graphite powders and hopg samples with interfaces show, however, a remanence that is nearly zero within error up to a temperature-dependent critical field h_c1^j(t) [25, 26]. one could still argue that a critical field may also appear through magnetic domain pinning or the magnetic anisotropy response of any ferromagnetic inclusions present in the graphite powder. in this case, one can use a second method and measure the difference between the field-cooled (fc) and zero-field-cooled (zfc) temperature-dependent magnetic moments m(t). a finite positive difference m_fc − m_zfc can be taken as due to pinning of magnetic entities, for example superconducting vortices or magnetic domains in ferromagnets. let us assume that the hysteresis seen in [25] is due to a ferromagnetic response with a saturation field h_s ∼ 1 koe. in this case, the difference m_fc − m_zfc would increase with field, reaching a maximum at a field h ≲ h_s, but then it should decrease sharply to zero at higher fields. this experiment can be easily carried out using, for example, a real ferromagnet such as micro- or nanoparticles of magnetite, which, depending on size and sample preparation, can show saturation fields of this order. however, this is not the behaviour reported in [25] (see fig. 6 in the supporting information of that article). this difference, after reaching a minimum or plateau at ∼ 1 koe, increases again steadily up to the maximum applied field of 7 t. it seems difficult to find a ferromagnetic material whose hysteresis in field does not saturate but keeps increasing steadily up to fields of the order of, or higher than, 7 t. we have repeated this kind of experiment on different graphite powders from the same source as used in [25]; this behaviour is well reproducible and does not appear compatible with a ferromagnetic response. measurements with small ferromagnetic particles embedded in disordered carbon show a clearly different behavior from that reported in [25, 26], as expected. regarding forgan’s comment, and i quote, “there are many possible reasons (both real and due to experimental artifacts) why measurements on a sample taken on heating and cooling might disagree”, one can test the squid system and convince oneself of its limits and sensitivity using different samples, such as hopg samples without interfaces [26], amorphous carbon powder, or ferromagnetic particles in a non-magnetic matrix, etc., and check whether a similar behavior of m_fc − m_zfc is observed using exactly the same squid sequence. the results we obtain from all those samples provide us with the necessary confidence. let us assume now that the undergraduate students cited by forgan indeed measured a hysteresis of ferromagnetic nature.
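for reference, the two benchmark behaviours of a simple ferromagnet against which the data of [25, 26] are being compared can be written compactly (a sketch only; the exponent range and the saturation field are the ones quoted above):

```latex
% minor-loop remanence of a ferromagnet (rayleigh regime), see [27]
m_r \propto h_{\max}^{\,n}, \qquad n \sim 1\ldots1.5 ;
% expected fc-zfc irreversibility for a ferromagnet with saturation field h_s \sim 1\,\mathrm{kOe}:
% finite below h_s, collapsing above it, in contrast to the data of [25]
m_{\mathrm{fc}}(h)-m_{\mathrm{zfc}}(h)
\begin{cases}
> 0, & h \lesssim h_s,\\
\to 0, & h \gg h_s .
\end{cases}
```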
if it is due to magnetic impurities, then this can first be proved through elemental analysis; moreover, the behaviour of the remanence and of the difference m_fc − m_zfc must differ from that of a granular superconductor. but what if this ferromagnetic response is not due to magnetic impurities but to graphite itself, caused, for example, by hydrogenation through the water treatment? note that hydrogen may trigger magnetic order in graphite, as shown recently through measurements using three different experimental methods, namely xmcd [28], magnetization and the anisotropic magnetoresistance [29] (56). the hysteresis curves of ferromagnetic graphite reported in those works, as well as in several other independent reports, show, however, saturation fields of the order of a few koe, without any opening of the hysteresis at higher fields, and a small remanence, in contrast to the nearly square hysteresis loops found in [25]. thus, the usual ferromagnetic behaviour of graphite does not seem to be compatible with the observations reported in [25, 26]. nevertheless, if the ferromagnetic response is due to the treatment the undergraduate students applied to the graphite powder, and if it vanishes after pressing the powder (see fig. 4(c) in [25]), that means it may come from interfaces between grains or from the near-surface region of the grains. in this case, it should be taken seriously and more time should be spent on its characterization. one should not take for granted that magnetic order in any graphite or carbon-based sample is due to impurities and not something intrinsic. there is enough evidence for defect-induced magnetism in graphite, due to vacancies as well as to hydrogen (for a short review see [30] and refs. therein). it is even possible that some interfaces in graphite show magnetic order, or that both phenomena, superconductivity and magnetism, appear together and a mixture of both signals is observed. i would like to cite one sentence written in the conclusion of the work in [31] (21), where the flat band at the interfaces between bernal and rhombohedral graphite structures is proposed as the origin of the high-temperature superconductivity: “in general, flat bands are susceptible to instabilities with respect to some other ordered states; for example, a magnetic state could also be possible.” evidently, if both phenomena occur at the interfaces or at the surface of graphite, the situation will become more interesting but also more difficult from the experimental point of view. unless one can prove that the ferromagnetic hysteresis is due to impurities, one should not dismiss this kind of evidence offhandedly. (6) “we see in [6] that the hysteresis at 300 k is essentially the same as that at 5 k. we bear in mind that by assumption the superconductivity is confined to an atomic layer, and that the higher the tc of a superconductor the shorter the coherence length. these two together ensure that thermal fluctuations (which are already very noticeable at t ∼ 100 k in cuprate materials) would be huge for any room temperature graphite superconductivity [9]. thermal fluctuations would greatly reduce vortex pinning and magnetic irreversibility at room temperature, contrary to what is observed.” reply: a possible answer to this comment might be that the superconductivity at graphite interfaces has not only a ten times larger critical temperature but also ten times larger activation energies than cuprate materials.
in fact, from the time relaxation measurements in [25, 26], not only ten times larger critical josephson fields (within the interpretation given in those papers) but also a larger pinning potential barrier than in cuprates have been estimated. in this case, thermal fluctuations should not substantially affect the sample response between 5 k and 300 k. it is rather premature to speculate based on the usual s-wave (or d-wave) pairing equations. if it is true that the magnetic field makes this interface superconductivity robust, then the equations used for conventional and cuprate superconductors will not be applicable. forgan speculates that the coherence length should be very small. should it really be small? the usual estimate of the coherence length is based, in general, on the ginzburg-landau theory. whether this is applicable in the case of superconductivity at graphite interfaces remains to be seen. we note that a direct measurement of the coherence length is not possible. the use of the proximity effect or of the change of the critical temperature with sample size are possible experimental methods one can use to estimate the coherence length if the upper critical field cannot be measured. if one uses the variable-sample-size method on the interfaces found in graphite, one indeed observes a systematic decrease of the josephson critical temperature, measured at a fixed current, upon decreasing the width of the interfaces, i.e., the thickness of the tem lamellae; this behaviour was measured recently in more than eight samples of the same hopg batch (same interface density) [32]. this size dependence may also provide a hint to understanding the differences in the behavior of large and small samples, i.e., in squid or transport measurements. it is still too early to answer whether recent theories [31, 33] (21,93) can explain quantitatively the observed behaviour. note that, according to the latest theoretical work [33], the nature of the granular superconductivity that may exist at interfaces is not as simple as in usual josephson-coupled localised superconducting grains in a normal matrix. the size of the effective region that influences the observed josephson behaviour may be larger than the intrinsic coherence length. (7) “i cannot give an overriding simple explanation for all the different results reported in esquinazi’s paper, but neither can the author.” reply: this is obviously true (for both) and i fully agree. taking into account that the physics of interfaces in graphite is a subject that is only now starting, and that nobody has yet tried to produce such interfaces systematically, it is natural that we need time. taking the example of the cuprates and the time needed to clarify some, not all, open issues, nobody should be surprised that we do not have “an overriding simple (why simple?) explanation” at the moment. [1] y kopelevich, p esquinazi, j torres, s moehlecke, ferromagnetic- and superconducting-like behavior of graphite, j. low temp. phys. 119, 691 (2000). [2] m inagaki, new carbons: control of structure and functions, elsevier (2000). [3] j barzola-quiquia, j l yao, p rödiger, k schindler, p esquinazi, sample size effects on the transport properties of mesoscopic graphite samples, phys. status solidi a 205, 2924 (2008). [4] n garcía, p esquinazi, j barzola-quiquia, s dusari, evidence for semiconducting behavior with a narrow band gap of bernal graphite, new j. phys. 14, 053015 (2012).
[5] y ohashi, k yamamoto, t kubo, shubnikov-de haas effect of very thin graphite crystals, in: carbon’01, an international conference on carbon, p. 568, the american carbon society, lexington, ky, united states (2001). [6] a arndt, d spoddig, p esquinazi, j barzola-quiquia, s dusari, t butz, electric carrier concentration in graphite: dependence of electrical resistivity and magnetoresistance on defect concentration, phys. rev. b 80, 195402 (2009). [7] a ballestar, j barzola-quiquia, s dusari, p esquinazi, r r da silva, y kopelevich, electric field induced superconductivity in multigraphene, arxiv:1202.3327 (2012). [8] x du, s w tsai, d l maslov, a f hebard, metal-insulator-like behavior in semimetallic bismuth and graphite, phys. rev. lett. 94, 166601 (2005). [9] b c camargo, y kopelevich, s b hubbard, a usher, w böhlmann, p esquinazi, effect of structural disorder on the quantum oscillations in graphite, unpublished (2013). in this work the authors show that in certain high-grade hopg samples (spi) the density of interfaces is much lower than in, for example, advanced ceramics hopg zya samples. in these hopg samples basically no sdh oscillations are found, and the temperature dependence of the resistance shows a semiconducting behavior with saturation at low temperatures. [10] d v gitsu, a f grozav, v g kistol, l i leporda, f m muntyanu, experimental observation of a superconducting phase with tc ≃ 8.5 k in large-angle bismuth bicrystals, jetp lett. 55, 403 (1992). [11] f m muntyanu, l i leporda, restructuring of the energy spectrum in large angle bismuth bicrystals, phys. solid state 37, 298 (1995). [12] f muntyanu, a gilewski, k nenkov, j warchulska, a zaleski, experimental magnetization evidence for two superconducting phases in bi bicrystals with large crystallite disorientation angle, phys. rev. b 73, 132507 (2006). [13] f m muntyanu, a gilewski, k nenkov, a j zaleski, v chistol, fermi-surface rearrangement in bi bicrystals with twisting superconducting crystallite interfaces, phys. rev. b 76, 014532 (2007). [14] f muntyanu, a gilewski, k nenkov, a zaleski, v chistol, superconducting crystallite interfaces with tc up to 21 k in bi and bisb bicrystals of inclination type, solid state commun. 147, 183 (2008). [15] j barzola-quiquia, p esquinazi, ferromagnetic- and superconducting-like behavior of the electrical resistance of an inhomogeneous graphite flake, j. supercond. nov. magn. 23, 451 (2010). [16] p esquinazi, j barzola-quiquia, d spemann, m rothermel, h ohldag, n garcía, a setzer, t butz, magnetic order in graphite: experimental evidence, intrinsic and extrinsic difficulties, j. magn. magn. mat. 322, 1156 (2010). [17] p esquinazi, n garcía, j barzola-quiquia, p rödiger, k schindler, j l yao, m ziese, indications for intrinsic superconductivity in highly oriented pyrolytic graphite, phys. rev. b 78, 134516 (2008). [18] s dusari, j barzola-quiquia, p esquinazi, superconducting behavior of interfaces in graphite: transport measurements of microconstrictions, j. supercond. nov. magn. 24, 401 (2011). [19] l ji, m s rzchowski, n anand, m tinkham, magnetic-field-dependent surface resistance and two-level critical-state model for granular superconductors, phys. rev. b 47, 470 (1993). [20] y kopelevich, c dos santos, s moehlecke, a machado, current-induced superconductor-insulator transition in granular high-tc superconductors, arxiv:0108311 (2001).
[21] i felner, e galstyan, b lorenz, d cao, y s wang, y y xue, c w chu, magnetoresistance hysteresis and critical current density in granular rusr2gd2−xcexcu2o10−δ, phys. rev. b 67, 134506 (2003). [22] a ballestar, j barzola-quiquia, t scheike, p esquinazi, evidence of josephson-coupled superconducting regions at the interfaces of highly oriented pyrolytic graphite, new j. phys. 15, 023024 (2013). [23] y kawashima, possible room temperature superconductivity in conductors obtained by bringing alkanes into contact with a graphite surface, aip advances 3, 052132 (2013). [24] y kopelevich, j h s torres, r r da silva, f mrowka, h kempa, p esquinazi, reentrant metallic behavior of graphite in the quantum limit, phys. rev. lett. 90, 156402 (2003). [25] t scheike, w böhlmann, p esquinazi, j barzola-quiquia, a ballestar, a setzer, can doping graphite trigger room temperature superconductivity? evidence for granular high-temperature superconductivity in water-treated graphite powder, adv. mater. 24, 5826 (2012). [26] t scheike, p esquinazi, a setzer, w böhlmann, granular superconductivity at room temperature in bulk highly oriented pyrolytic graphite samples, carbon 59, 140 (2013). [27] s kobayashi, s takahashi, t shishido, y kamada, h kikuchi, low-field magnetic characterization of ferromagnets using a minor-loop scaling law, j. appl. phys. 107, 023908 (2010). [28] h ohldag, p esquinazi, e arenholz, d spemann, m rothermel, a setzer, t butz, the role of hydrogen in room-temperature ferromagnetism at graphite surfaces, new j. phys. 12, 123012 (2010). [29] j barzola-quiquia, w böhlmann, p esquinazi, a schadewitz, a ballestar, s dusari, l schultze-nobre, b kersting, enhancement of the ferromagnetic order of graphite after sulphuric acid treatment, appl. phys. lett. 98, 192511 (2011). [30] p esquinazi, w hergert, d spemann, a setzer, a ernst, defect-induced magnetism in solids, ieee transactions on magnetics 49, 4668 (2013). [31] n b kopnin, m ijäs, a harju, t t heikkilä, high-temperature surface superconductivity in rhombohedral graphite, phys. rev. b 87, 140503 (2013). [32] a ballestar, superconductivity at graphite interfaces, ph.d. thesis, university of leipzig (unpublished). [33] w a muñoz, l covaci, f peeters, tight-binding description of intrinsic superconducting correlations in multilayer graphene, phys. rev. b 87, 134509 (2013). papers in physics, vol. 10, art. 100003 (2018) received: 8 january 2018, accepted: 11 january 2018 edited by: a. martí licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.100003 www.papersinphysics.org issn 1852-4249 commentary on “modeling temperature manipulation in a circular model of birdsong production” y. s. zhang1∗ ∗e-mail: yz9@princeton.edu 1 princeton neuroscience institute, princeton, 08544 new jersey, usa. the paper titled “modeling temperature manipulations in a circular model of birdsong production” by dima et al. [1] suggests that the birdsong system has a circular architecture in which an initiating area in the brainstem provides inputs to both the downstream respiratory area and the upstream vocal control area; the latter subsequently sends input to the respiratory area as well. a consequence of the proposed architecture is that the birdsong syllables are generated by neural commands at two different timescales corresponding to the two inputs. the model is successful in explaining the syllable stretching and breaking phenomena observed when an upstream vocal area is cooled down.
in this commentary, i make some remarks on the findings of this paper and discuss how this work fits into the current knowledge about birdsong generation. the avian vocal production system has long been understood as a feedforward network mainly implemented by a descending motor pathway. in this picture, a forebrain area, hvc (high vocal center, used as a proper name), generates the precisely timed sequential actions of this complex vocal behavior. the projection neurons in hvc fire sparsely at specific time points of the song. these activities activate the downstream ra (the robust nucleus of the arcopallium), where the motor patterns are generated by the activation of a subset of the ra neurons. the ra neurons then activate the hypoglossal nerve (nxiits), which controls the syringeal muscles, and the respiratory group [2]. it has also been established that, in addition to the descending pathway, there is an ascending pathway from the brainstem to hvc mediated by the nucleus uva, where the feedback from respiration is fed into hvc. the descending and ascending pathways form a loop [3]. the circular model proposed by dima et al. [1] is compatible with the known architecture of the avian vocal system. furthermore, the authors hypothesized that a slower timescale, independent of the moment-to-moment timing control from the hvc dynamics, exists in this architecture. in their model, this timescale is determined by the pulse activities of an initiating area (ia). this activity usually signals the start of a long syllable or of a bout of brief pulses. the ia is thought to be in the brainstem and to directly provide input to the expiratory related area (er). simultaneously, the ia also sends information to hvc via the ascending pathway and initiates the hvc activities. with this additional degree of freedom in their model, they can simulate the respiratory patterns of all types of syllables present in canary songs. the setup is further supported by the hvc cooling effects on canary respiratory patterns during singing. whereas this study provides further understanding of the dynamics of the avian vocal production system, it depends on a number of assumptions that currently lack strong empirical evidence. one major assumption is the direct input from the hypothetical ia to the er. anatomically, the dorsal medial nucleus of the intercollicular complex (dm) seems to project to both the respiratory group and uva. however, whether it behaves as in the proposed model has not been reported. second, the assumption that the syllable type determines the necessity of direct input seems a bit arbitrary. why would the model undergo a qualitative change between different syllable types? third, the syllable types depend on very specific parameter settings of the ia activity. if we look at the spectrogram of the canary song shown in fig. 1 of ref. [1], the transformation from one type of syllable to another is rather gradual. however, the model-predicted ia activity exhibits quite distinct behaviors. as the model depends on several parameters, such a prediction may be the result of overfitting. one possible way to interpret the ia activity is that it may represent the sensory feedback from the respiratory system. the biomechanics of respiration imposes a strong constraint on the song dynamics [4].
it is conceivable that the slow timescale originates from respiration. a future expansion of the circular model, perhaps to a two-loop model with the respiratory system included, may help in understanding the origin and integration of the multiple timescales in the song production system. in conclusion, this study provides convincing evidence for the existence of multiple timescales in the avian song production system. on the one hand, it is compatible with the popular models that emphasize the role of hvc in timing control. on the other hand, it proposes alternative generators of song timing. [1] g c dima, m a goldin, g b mindlin, modeling temperature manipulations in a circular model of birdsong production, pap. phys. 10, 100002 (2018). [2] m s fee, a a kozhevnikov, r h hahnloser, neural mechanisms of vocal sequence generation in the songbird, ann. n.y. acad. sci. 1016, 1 (2004). [3] r c ashmore, j m wild, m f schmidt, brainstem and forebrain contributions to the generation of learned motor behaviors for song, j. neurosci. 25, 37 (2005). [4] m f schmidt, f goller, breathtaking songs: coordinating the neural circuits for breathing and singing, physiology 31, 6 (2016). papers in physics, vol. 5, art. 050006 (2013) received: 14 june 2013, accepted: 9 july 2013 edited by: g. b. mindlin licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050006 www.papersinphysics.org issn 1852-4249 a neuronal device for the control of multi-step computations ariel zylberberg,1–4∗ luciano paz,1,2† pieter r. roelfsema,4–6‡ stanislas dehaene,7–10§ mariano sigman2¶ we describe the operation of a neuronal device which embodies the computational principles of the “paper-and-pencil” machine envisioned by alan turing. the network is based on principles of cortical organization. we develop a plausible solution to implement pointers and investigate how neuronal circuits may instantiate the basic operations involved in assigning a value to a variable (i.e., x = 5), in determining whether two variables have the same value, and in retrieving the value of a given variable so that it is accessible to other nodes of the network. we exemplify the collective function of the network in simplified arithmetic and problem-solving (blocks-world) tasks. ∗e-mail: arielz@df.uba.ar †e-mail: lpaz@df.uba.ar ‡e-mail: p.roelfsema@nin.knaw.nl §e-mail: stanislas.dehaene@gmail.com ¶e-mail: sigman@df.uba.ar 1 these authors contributed equally to this work 2 laboratory of integrative neuroscience, physics department, fceyn uba and ifiba, conicet; pabellón 1, ciudad universitaria, 1428 buenos aires, argentina. 3 laboratory of applied artificial intelligence, computer science department, fceyn uba; pabellón 1, ciudad universitaria, 1428 buenos aires, argentina. 4 department of vision and cognition, netherlands institute for neuroscience, an institute of the royal netherlands academy of arts and sciences, meibergdreef 47, 1105 ba amsterdam, the netherlands. 5 department of integrative neurophysiology, center for neurogenomics and cognitive research, vu university, amsterdam, the netherlands. 6 psychiatry department, academic medical center, amsterdam, the netherlands. 7 collège de france, f-75005 paris, france. 8 inserm, u992, cognitive neuroimaging unit, f-91191 gif/yvette, france. 9 cea, dsv/i2bm, neurospin center, f-91191 gif/yvette, france. 10 university paris-sud, cognitive neuroimaging unit, f-91191 gif/yvette, france. i. introduction consider the task of finding a route on a map.
you are likely to start by searching for the initial and final destinations, identifying possible routes between them, and then selecting the one you think is shortest or most appropriate. this simple example highlights how almost any task we perform is organized as a sequence of processes involving operations which we identify as “atomic” (here search, memory and decision-making). in contrast with the thorough knowledge of the neurophysiology underlying these atomic operations [1–3], neuroscience is only starting to shed light on how they are organized into programs [4–6]. partly due to the difficulty of implementing compound tasks in animal models, sequential decision-making has mostly been addressed by the domains of artificial intelligence, cognitive science and psychology [7, 8]. our goal is to go beyond the available neurophysiological data to show how the brain might sequentialize operations to compose multi-step cognitive programs. we suppose the existence of elementary operations akin to ullman’s [9] routines, although not limited to the visual domain. of special relevance to our report is the body of work that has grown out of the seminal work of john anderson [8]. building on the notion of “production systems” [10], anderson and colleagues developed act-r as a framework for human cognition [8]. it consists of a general mechanism for selecting productions fueled by sensory, motor, goal and memory modules. the act-r framework emphasizes the chained nature of human cognition: at any moment in the execution of a task, information placed in buffers acts as data for the central production system, which feeds back to these same modules. despite vast recent progress in our understanding of decision formation in simple perceptual tasks [3], it remains unresolved how the operations required by cognitive architectures may be implemented at the level of single neurons. we address some of the challenges posed by the translation of cognitive architectures to neurons: how neuronal circuits might implement a single operation, how multiple operations are arranged in a sequence, and how the output of one operation is made available to the next. ii. fundamental assumptions and neuronal implementation i. the basis for single operations insights into the machinery for simple sensory-motor associations come from studies of monkey electrophysiology. studies of oculomotor decisions —focused primarily on area lip [12]— have shown that neurons in this area reflect the accumulation of evidence leading to a decision [3]. in a well-studied paradigm, monkeys were trained to discriminate the direction of motion of a patch of moving dots and to report it with an eye-movement response to the target located in the direction of motion [13]. neurons in lip that respond with high levels of activity during memory saccades to specific portions of space are recorded during the task. these neurons show ramping firing rates with a slope that depends on the difficulty of the task, which is controlled by the proportion of dots that move coherently in one direction [14]. in a reaction-time version of the task [13], in which monkeys are free to make a saccade at any time after the onset of the motion stimulus, the ramping activity continues until a fixed level of activity is reached, signaling the impending saccade (fig. 1 a).
crucially, the level of this “threshold” does not depend on the difficulty of the task or on the time to respond. the emerging picture is that ramping neurons in lip integrate sensory evidence and trigger a response when activity reaches a threshold. this finding provided strong support for accumulation or race models of decision making, which had previously been postulated to explain error rates and reaction times in simple tasks, and which match nicely with decision-theoretical notions of optimality [15]. while experimental studies have mainly characterized the feedforward flow of information from sensory to motor areas, evidence accumulation is also modulated by contextual and task-related information, including prior knowledge about the likelihood and payoff of each alternative [16, 17]. interestingly, a common currency —the spiking activity of neurons in motor intention areas— may underlie these seemingly unequal influences on decision formation. ii. sequencing of multiple operations brain circuits can integrate sensory evidence over hundreds of milliseconds. this illustrates how the brain decides based on unreliable evidence, averaging over time to clean up the noise. yet the duration of a single accumulation process is constrained by its analog character, a problem pointed out early on by von neumann in his book the computer and the brain [18]: “in the course of long calculations not only do errors add up but also those committed early in the calculation are amplified by the latter parts of it...”. modern computers avoid the problem of noise amplification by employing digital instead of analog computation. we have suggested that the brain may deal with the amplification of noise by serially chaining individual integration stages, where the changes made by one ramp-to-threshold process represent the initial stage for the next one (fig. 1 b) [11, 19]. [figure 1: diffusion processes in single and multi-step tasks. (a) in simple sensory-motor tasks, response selection is mediated by the accumulation of sensory evidence to a response threshold. (b) in tasks involving multiple steps, there is a parallel competition between a subset of “production rules” implemented by pools of neurons that accumulate evidence up to a threshold. the selected production can have overt effects (motor actions) as well as covert effects in modifying the state of the memory, after which a new cycle begins. adapted with permission from ref. [11].] evidence for the ramp-to-threshold dynamics has been derived from tasks in which the decision reflects a commitment to a motor plan [3, 20]. as others have [21], we posit that the ramp-to-threshold process for action selection is not restricted to motor actions, but may also be a mechanism to select internal actions like the decision of where to attend, what to store in memory, or what question to ask next. therefore, the activation of a circuit based on sensory and mnemonic evidence is mediated by the accumulation of evidence in areas of the brain which can activate that specific circuit, and which compete with other internal and external actions.
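to make the accumulation-to-threshold picture concrete, the following minimal sketch simulates a race between competing accumulators driven by noisy evidence; it is an illustrative abstraction, not the authors' simulation, and the drift, noise and threshold values are arbitrary assumptions:

```python
import numpy as np

def race_to_threshold(drifts, threshold=1.0, noise_sd=0.1, dt=1e-3, max_t=5.0, seed=0):
    """Race model: each accumulator integrates its mean drift plus Gaussian noise;
    the first one to reach `threshold` determines the choice and the decision time."""
    rng = np.random.default_rng(seed)
    x = np.zeros(len(drifts))              # accumulated evidence for each alternative
    t = 0.0
    while t < max_t:
        x += np.asarray(drifts) * dt + noise_sd * np.sqrt(dt) * rng.standard_normal(len(drifts))
        t += dt
        if np.any(x >= threshold):         # fixed bound, independent of task difficulty
            return int(np.argmax(x)), t
    return None, max_t                     # no commitment within max_t

# a hard discrimination: two alternatives with similar drifts -> slower, noisier choices
winner, rt = race_to_threshold(drifts=[0.55, 0.45])
print(winner, rt)
```

lowering the drift difference (the analogue of motion coherence) lengthens the decision time while the bound itself stays fixed, which is the qualitative behaviour described above for lip neurons.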
within a single step, the computation proceeds in parallel and settles into a choice. seriality is the consequence of the competitive selection of internal and external actions, which transforms noisy and parallel steps of evidence accumulation into a sequence of discrete changes in the state of the network. these discrete steps clean up the noise and enable the logical flow of the computation. following the terminology of symbolic cognitive architectures [7, 8], the “ramping” neurons which select the operation to do next are referred to as “production” neurons. the competition between productions is driven by inputs from sensory and memory neurons and by the spontaneous activity in the production network. as in single-step decisions [13], the race between productions concludes when the neurons encoding one production reach a decision threshold. the neurons which detect the crossing of a threshold by a production also mediate its effects, which are to activate or deactivate other brain circuits. the activated circuits can be motor, but are not restricted to motor functions; they can produce different effects like changes in the state of working memory (deciding what to remember), the activation and broadcasting of information that was in a “latent” state (like sensory traces and synaptic memories [22, 23]), or the activation of peripheral processors capable of performing specific functions (like changing the focus of attention, segregating a figure from its background, or tracing a curve). iii. pointers versatility and flexibility are shared computational virtues of the human brain and of the turing machine. the simple example of addition (at the root of turing’s conception) illustrates well what sort of architecture this requires. one can picture the addition of two numbers x and y as a displacement of y steps from the initial position x. this simple representation of addition as a walk along the number line captures the core connection between movement in space and mathematical operations. it also highlights the need for operations that use variables which temporarily bind to values in order to achieve flexibility. in this section we describe how neuronal circuits may instantiate the basic operations involved in assigning a value to a variable (i.e., x = 5), in determining whether two variables have the same value, and in retrieving the value of a given variable. this is in a way a proof of concept, i.e., a way to construct these operations with neurons. we are of course in no position to claim that this instantiation is unique (it is certainly not). however, we have tried to ground it in important principles of neurophysiology, and we believe that this construction raises important aspects and limitations which may generalize to other neuronal constructions of variable assignment, comparison and retrieval mechanisms. here we introduce the concept of pointers: individual neurons or pools of neurons which can temporarily point to other circuits. when a pointer is active, it facilitates the activation of the circuit to which it is temporarily bound, a binding which is dynamically set during the course of a computation. a pointer “points” to a cortical territory (for instance, to v1). this cortical mantle represents a space of values that a given variable may assume. the cortex is organized in spatial maps representing features (space, orientation, color, 1-d line...)
and a pointer can temporarily bind to one of these possible sets of values in such a way that the activation of the pointer corresponds to the activation of the value; the pointer hence functions as a variable (i.e., x = 3). there are many proposed physiological mechanisms to temporarily bind neuronal circuits [24–26]. a broad class of mechanisms relies on sustained reverberant neuronal activity [26]. a different class relies on small modifications of synaptic efficacy, which constitute silent memories in latent connections [23]. here we opt for the second alternative, first because it has a great metabolic advantage, allowing many memories to be held at very low cost, and more importantly because it separates the processes of variable assignment and variable retrieval. as we describe below in detail, in this architecture the current state of the variable is not broadcast to other areas until it is specifically queried. to implement the binding with neuron-like elements, we follow the classic assumption that when two groups of neurons are active at the same time, a subset of the connections between them is strengthened (fig. 2). the strengthening of the synapses is bidirectional, and it is responsible for the binding between neuronal populations. to avoid saturation of the connections and to allow new associations to form, the strength of these connections decays exponentially within a few hundred milliseconds. specifically, if the connection strength between a pair of populations is w_base, then when both populations are active the connection strength increases exponentially, with a time constant τ_rise, towards a maximum connection strength w_max. when one or both of the populations become inactive, the connection strength decays exponentially back to w_base with a time constant τ_decay. the mechanism described above generates a silent coupling between a pointer and the value to which it points. how is this value recovered? in other words, how can other elements of the program know the value of the variable x? the expression of a value stored in silent synapses is achieved by simultaneously activating the pointer circuit and forcing the domain of the variable into a winner-take-all (wta) mode (fig. 2). the wta mode —set by having neurons with self-excitation and cross-inhibition [27]— ensures that only one value is retrieved at a time. the pointer neurons make stronger connections with the neurons to which they are bound than with the other neurons. when the network is set in a wta mode, these connections bias the competition so as to retrieve the value previously associated with the pointer (fig. 2). in other words, activation of the pointer by itself is not sufficient to drive the neuron (or neurons) representing the value to which it points, but it can bias the activation of a specific neuron when co-activated with a tonic activation of the entire network. this architecture is flexible and economical. value neurons only fire when they are set or retrieved. memory capacity is constrained by the number of connections and not by the number of neurons. but it also has a caveat. given that only one variable can be bound to a specific domain at any one time, multiple bindings must be addressed serially. as we show later, this can be accomplished by the sequential firing of production neurons.
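the binding-by-coactivation scheme with exponentially rising and decaying weights, and retrieval by a winner-take-all query, can be sketched in a few lines; this is a schematic abstraction rather than the authors' network, and w_base, w_max and the time constants are illustrative values only:

```python
import numpy as np

class Pointer:
    """Transient binding of one pointer to a value domain via plastic weights: the weight
    to a value unit rises toward w_max while pointer and value are co-active and decays
    back to w_base otherwise (time constants in the same arbitrary units as dt)."""
    def __init__(self, n_values, w_base=0.1, w_max=1.0, tau_rise=0.05, tau_decay=0.3):
        self.w = np.full(n_values, w_base)
        self.w_base, self.w_max = w_base, w_max
        self.tau_rise, self.tau_decay = tau_rise, tau_decay

    def step(self, pointer_active, value_active, dt=0.01):
        # value_active: boolean array over the value domain (e.g., the numbers-network)
        grow = pointer_active & value_active
        self.w[grow] += (self.w_max - self.w[grow]) * dt / self.tau_rise
        self.w[~grow] += (self.w_base - self.w[~grow]) * dt / self.tau_decay

    def retrieve(self, tonic_drive=0.5):
        # query: activate the pointer and force the domain into winner-take-all mode;
        # the strengthened connection biases the competition toward the bound value
        return int(np.argmax(tonic_drive + self.w))

# bind the pointer to value 2 by co-activation, then retrieve it
count = Pointer(n_values=10)
value_active = np.zeros(10, dtype=bool)
value_active[2] = True
for _ in range(50):                       # co-activation period (x = 2)
    count.step(pointer_active=True, value_active=value_active)
print(count.retrieve())                   # -> 2
```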
a. compare the value of two variables if two variables x and y bind to instances in the same domain, it is possible to determine whether the two variables are bound to the same instance, i.e., whether x = y. [figure 2: instantiation of variables through plastic synapses. when two neurons become active, the connection between them is rapidly strengthened, forming a transient association which decays within a few hundred milliseconds. the value of a pointer can be retrieved by setting the pointer’s domain in a winner-take-all mode and activating the pointer, which biases the wta competition through its strengthened connections.] the mechanics of this process is very similar to retrieving the value of a variable. the pointer neurons x and y (in our framework, a pointer can also be a population of neurons that functions as a single pointer) are co-activated. the equality of the assignments of x and y can be identified by a coincidence detector. specifically, this is solved by adjusting the excitability in the value domain in such a way that the simultaneous input onto a single value neuron exceeds the threshold but the input from a single pointer does not. this proposed mechanism is very similar to the circuits in the brain stem which —based on coincidence detection along delay lines— encode interaural time differences [28]. this shows the concrete plausibility of generating such dynamic threshold mechanisms that act as coincidence detectors. b. assign the value of one variable to another similarly, to assign the value of x to the variable y (y ← x), the value of the variable that is to be copied needs to be retrieved as indicated previously, by activating the variable x and forcing a wta competition in the variable’s domain. then, the node coding for variable y must be activated, which will lead to a reinforcement of the connections between y and the retrieved value, instantiating the new association.
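continuing the sketch introduced above (it reuses the hypothetical Pointer class), equality testing and copying between two pointers bound to the same domain can be mimicked with a coincidence-style threshold and a re-binding step; the threshold value is an assumption tied to the illustrative weights used there:

```python
import numpy as np

def values_equal(px, py, threshold=1.5):
    """Coincidence-detector analogue: the summed input from the two co-activated pointers
    onto a single value unit exceeds the threshold only if both point to that value."""
    return bool(np.any(px.w + py.w >= threshold))

def assign(py, px, steps=50):
    """y <- x: retrieve x's value with a wta query, then co-activate y with that value."""
    v = px.retrieve()
    value_active = np.zeros(len(py.w), dtype=bool)
    value_active[v] = True
    for _ in range(steps):
        py.step(pointer_active=True, value_active=value_active)
    return v
```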
iii. concrete implementation of neuronal programs in the previous sections we sketched a set of principles by which brain circuits can control multi-step operations and store temporary information in memory buffers to share it between different operations. here we demonstrate, as a proof of concept, a neuronal implementation of such circuits in two simple tasks. the first one is a simple arithmetic counter, where the network has to count from an initial number n_ini to a final number n_end, a task that can be seen as the emblematic operation of a turing device. the second example is a blocks-world problem, a paradigmatic example of multi-step decision making in artificial intelligence [29]. the aim of the first task is to illustrate how the different elements sketched above act in concert to implement neuronal programs. the motivation to implement the blocks world is to link these ideas to developed notions of visual routines [9, 30, 31]. i. arithmetic counter we designed the network to be generic in the sense of being able to solve any instance of the problem, i.e., any instantiation of n_ini and n_end. we decided to implement a counter since it constitutes essentially a while loop and hence a basic intermediate description of most flexible computations. in the network, each node is meaningful, and all parameters were set by hand. of course, understanding how these parameters are adjusted through a learning process is a difficult and important question, but this is left for future work. each number is represented by a pool of neurons selective to the corresponding numerosity value [32]. a potential area for the neurons belonging to the numerosity domain is the intraparietal sulcus (ips) [33], where neurons coding for numerosity have been found in both humans and monkeys [34]. in the model, number neurons interact through random lateral inhibitory connections and self-excitation. this allows, as described above, a broad distribution of number neurons [32] to be collapsed onto a pool representing a single number, in a retrieval process during a step of the program. we assume that the network has learned a notion of number proximity and continuity. this was implemented via a transition-network that has asymmetrical connections with the number-network. a given neuron representing the number n excites the population of transition neurons n → n + 1, which in turn excites the neuron that represents the number n + 1. again, we do not delve into how this is learned; we assume it is a consolidated mechanism. the numbers-network can be in different modes: it can be quiescent, such that no number is active, or it can be in a winner-take-all mode with only one unit in the active state. our network implementation of the counter makes use of two variables. the count variable stores the current count and changes dynamically as the program progresses, after being initialized to n_ini. the end variable stores the number at which the counting has to stop and is initially set to n_end. the network behaves basically as a while loop, increasing the value of the count variable while its value differs from that of the end variable. to increment the count, we modeled a transition-network with units that have asymmetrical connections with the numbers-network. for example, the “1 → 2” node receives excitatory input from the unit coding for number 1 and in turn excites number 2. this network stores knowledge about successor functions, and in order to become active it requires additional input from the production system. as mentioned above, here we do not address how such a structure is learned in number-representing neurons. learning to count is an effortful process during childhood [35] by which children learn transition rules between numbers. we postulate that these relations are encoded in structures which resemble horizontal connections in the cortex [36–38]. in the same way that horizontal connections incorporate transition probabilities of oriented elements in a slow learning process [39, 40], resulting in a gestalt, a psychological sense of “natural” continuity, we argue that horizontal connections between numerosity neurons can endow the same sense of transition probability and natural continuity in the space of numbers. the successor function can thus be seen as homologous to a matrix of horizontal connections in the array of number neurons. in a way, our description postulates that a certain number of operations are embedded in each domain cortex (orientation-selective neurons in v1 for curve tracing, number-selective neurons in ips for arithmetic...). these can be seen as “compiled” routines which are instantiated by local horizontal connections capable of performing operations such as collinear continuity, or “add one”.
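the successor machinery just described amounts to an asymmetric connectivity that, when gated by the production system, maps the currently active number onto its successor; a minimal sketch under the same caveat (illustrative connectivity, not the paper's parameters):

```python
import numpy as np

n_numbers = 10
# transition unit i listens to number i and, when gated, excites number i+1
w_num_to_trans = np.eye(n_numbers)[:n_numbers - 1]        # (n-1) x n
w_trans_to_num = np.eye(n_numbers, k=1)[:n_numbers - 1]   # (n-1) x n, shifted by one

def successor(active_number, gate_on=True):
    numbers = np.zeros(n_numbers)
    numbers[active_number] = 1.0
    transitions = (w_num_to_trans @ numbers) * float(gate_on)   # gated by the production system
    drive = w_trans_to_num.T @ transitions                      # asymmetric input to the numbers
    return int(np.argmax(drive)) if drive.any() else active_number

print(successor(2))   # -> 3
```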
the program can control which of these operations becomes active at any given step by gating the set of horizontal connections, a process we have referred to as “addressing” the cortex [41]. just as an example, when older children learn to automatically count in steps of three (1, 4, 7, 10, 13...), we postulate that they have instantiated a new routine (through a slow learning process) capable of establishing the transition matrix n → n + 3. the repertoire of compiled functions is dynamic and can change with learning [42]. counting requires a sequence of operations which includes changes in the current count, retrieval of successor functions and numeric comparisons. the successive steps of the counting routine are governed by the firing of production neurons. the order in which the productions fire is controlled by the content of the memory (fig. 3). we emphasize that, while the production selection process proceeds in parallel —as each production neuron constantly evaluates its input— the selected production strongly inhibits the other production neurons, and therefore the evidence accumulated at one step is for the most part lost after a production is selected (in the absence of external noise, as in the present simulations, only the production with the largest input has higher-than-baseline activity). in fig. 3 we simulate a network that has to count from numbers 2 to 6. [figure 3: sketch of the network implementing an arithmetic counter. (a) the network is divided into five sub-networks: productions, memory, pointers (or variables), numbers, and transition-networks. the order in which the productions fire is controlled by the state of the memory network, which is itself controlled by the production system. (b) dynamics of a subset of neurons in the network. all units are binary except for the production neurons (violet), which gradually accumulate evidence to a threshold.] once the initial and final numbers have been bound to the count and end variables respectively, the network cycles through six productions. the first production that is selected is the preparenext production, whose role is to retrieve the value that results from adding 1 to the current count. to this end, this production retrieves the current value of the count variable and excites the neurons of the transition-network, such that the node receiving an input from the retrieved value of count becomes active (i.e., if 2 is active in the numbers-network, then 2 → 3 becomes active in the transition-network). to ensure that the retrieved value is remembered for the next step (the actual change of the current count), neurons in the transition-network are endowed with recurrent excitation, and therefore these neurons remain active until explicitly inhibited. the same production also activates a node in the memory-network which excites the incrementcount production, which is therefore selected next. the role of the next production (incrementcount) is to actually update the current count, changing the binding of the count variable. the incrementcount production inhibits all neurons in the number network, turning it to the quiescent state. once the network is quiescent, lateral inhibition between number nodes is released and the asymmetrical inputs from the transition-network can activate the number to which they project. at the same time, the incrementcount production activates the count variable, which is then bound to the currently active node in the numbers-network. notice that this two-step process between the preparenext and incrementcount productions basically re-assigns the value of the count variable from its initial value n to a new value n + 1. using a single production to replace these two (as tried in earlier versions of the simulations) required activating the number and transition neurons at the same time, which led to fast and uncontrolled transitions in the numbers-network. to increase the current count in a controlled manner, we settled for the two-production solution. the incrementcount production also activates a memory unit that biases the competition at the next stage in favor of the clearnext production. this production shuts down the activity in the transition-network, strongly inhibiting its neurons to compensate for their recurrent excitation. shutting down the activity of these neurons is required at the next step of the routine to avoid changes in the current count when the count variable is retrieved to be compared with the end variable. after clearnext, the retrieveend production fires, which retrieves the value of the end variable in order to strengthen the connections between the end variable and the value to which it is bound. this step is required since the strength of the plastic connections decays rapidly, and therefore the instantiation of the variables will be lost if not used or reactivated periodically. finally, the checkequal production is selected to determine whether the count and end variables are equal. if both variables are equal, a node in the memory network is activated which is detected by the halt production to indicate that the task has been completed; otherwise, the production that is selected next is the preparenext production and the production cycle is repeated. in fig. 3b, we show the dynamics of a subset of neurons for a network that has to count from 2 to 6. with this example we have shown how even a seemingly very easy task such as counting (which can be encoded in up to two lines in virtually any programming language) seems to require a complex set of procedures to coordinate and stabilize all computations when they are performed by neuronal circuits with slow building of activity and temporal decay.
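the production cycle just described can be summarized as a small state machine; the sketch below mirrors the six productions at a purely symbolic level, leaving out the neural dynamics, the memory units and the decaying bindings (an illustrative abstraction, not the simulated network):

```python
def neuronal_counter(n_ini, n_end):
    """Symbolic walk through the production cycle of the counter described above;
    `count` and `end` stand for the two pointer variables, `next_value` for the
    recurrently active transition-network node."""
    count, end = n_ini, n_end               # initial bindings of the two variables
    next_value = None                       # transition-network buffer
    production, trace = "preparenext", []
    while production != "halt":
        trace.append(production)
        if production == "preparenext":       # retrieve count, activate transition n -> n+1
            next_value = count + 1
            production = "incrementcount"
        elif production == "incrementcount":  # quiesce the numbers, re-bind count to n+1
            count = next_value
            production = "clearnext"
        elif production == "clearnext":       # inhibit the transition-network
            next_value = None
            production = "retrieveend"
        elif production == "retrieveend":     # refresh the decaying binding of end
            production = "checkequal"
        elif production == "checkequal":      # coincidence check between count and end
            production = "halt" if count == end else "preparenext"
    return count, trace

final, trace = neuronal_counter(2, 6)
print(final, len(trace))   # final count and number of production firings
```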
ii. a world of blocks a natural extension of the numerotopic domain used in the above example is to incorporate problems in which the actor must interact with its environment, and in which sensory and motor productions ought to be coordinated. the visual system performs a great variety of computations. it can encode a large set of visual features in a parallel feed-forward manner, forming its base representation [9, 30, 43, 44], and temporarily store these features in a distributed manner [45–47]. a matrix of lateral connections gated by top-down processing can further detect conjunctions of these features for object recognition. in analogy with motor routines, the visual system relies on the serial chaining of elemental operations [9, 31] to gain computational flexibility. there are many proposals as to which operations are elemental [31], but, as we have discussed above, this list may be fuzzy since the set of elementary operations may be changed by learning [41]. in this framework, atomic operations are those that are encoded in value domains.
here, and for the purpose of implementing a neural circuit capable of solving the blocks world, we will focus on a simplified group of three elemental operations: visual search: the capacity of the system to identify the location of a given feature. visual cuing: the capacity of the system to highlight the features that are present at a given location. shift the processing focus: a method guided by attention to focus the processing of visual features, or other computations, at a given location. [figure 4: simplified model of the visual system used in the blocks-world simulations. the upper portion shows the different layers and their connections. the early visual area is formed by a first sensory layer of neurons that receive stimulation from the outer world and a second attentional layer with bidirectional connections to the higher color- and position-tuned areas. the grayed neurons are the ones with higher activity. the lower panels show an example of the cuing and search operations. in the latter, a color-tuned neuron is stimulated and drives an activity increase in the early visual layer, which in turn boosts the activity in the position layer, concluding the search. the right panel shows the analogous cuing operation.] here we use a simplified representation of the visual system based on previous studies [48–50]. we assume a hierarchy of two layers of neurons. the first one is tuned to conjunctions of colors and locations in the visual field. the second one has two distinct groups of populations, one with neurons that have large receptive fields and encode color irrespective of location, and another which encodes location independently of color (fig. 4). the model assumes that neurons in the first layer tuned to a particular color and retinotopic location are reciprocally connected to second-layer units that encode the same color or location. this architecture performs visual search for a color in a way that resembles the variable assignment described above, through a conjunction mechanism between maps encoding different features. [figure 5: sketch of the network that solves the blocks-world problem. the sensory early visual system receives input from the bw configuration and excites the first attentional layer. the latter is connected to color- and position-specific areas. the arrows indicate connections between layers; the individual connections may be excitatory or inhibitory. the connections with an inverted triangle head indicate that only excitatory connections exist.] in the model, the color cortex encodes each color redundantly for all positions, forming a set of spatial vectors (one for each color). of course, all these spatial maps selective for a given color can be intermingled in the cortex, as is also the case with orientation columns, which sample all orientations while filling space with small receptive fields. if, for instance, a red square is presented in position three, the neuron selective for red (henceforth referred to as belonging to the red map) and with a receptive field at position three will fire. this activation, in turn, propagates to the spatial neurons (which are insensitive to color).
thus, if four squares of different colors are presented in positions 1 to 4, the spatial neurons in these positions will fire at the same rate. to search for the spatial position of a red block, the activity of neurons coding for red in the color map must be enhanced. the enhanced activity propagates to the early visual areas, which code for conjunctions of color and space, and in turn to the spatial map, highlighting the position where the searched color is present. spatial selection is triggered by an attention layer which selects the production "attend to red", addressing the sensory cortex so that only locations containing red features are routed to the spatial neurons. the color of a block at one location can be retrieved by an almost identical mechanism. in this case, the production system sets the attentional network to a given position in space and, through the conjunction mechanism (because connections are reciprocal), only the color in the selected position is retrieved. this is a simple device for searching based on the propagation of attentional signals, which has been used before in several models (e.g., ref. [31]) (fig. 4).

to bridge these ideas, which are well grounded in the visual literature [30, 31, 43], with notions of planning and sequential mental programs, we use this model to implement a solver for a simple set of blocks-world problems. the blocks-world framework is a paradigmatic artificial intelligence problem that consists of a series of colored blocks piled up on a large surface in several columns. the goal is to arrange the blocks according to their color in a given goal configuration. the player can only move one block from the top of any column and place it at the top of another, or on the surface that supports all the blocks. we choose to construct a solver for a restricted blocks-world problem where the surface can only hold 3 columns of blocks and the goal configuration is to arrange them all into one column, which we call the target column (this restricted problem is equivalent to the tower of london game [51]). we implement a network with a set of memory and production neurons, analogous to the counter circuit described above, which coordinates a set of visual and motor productions (fig. 5). the interaction between the memory layer and the production system triggers the execution of elemental visual processes, motor actions and changes in the memory configuration in order to solve any given instance of the problem.

to solve this problem, an algorithm needs to be able to find whether a block is in the correct position. for this, it requires, first, a "retrieve color at a given location" function. normally, the location that is intended to be cued is the one that is being attended. we implement the attended location as a variable population that we call the processing focus, or pf, inspired by ullman's work [9], so "retrieve color" amounts to cuing the color at the pf's location (some works refer to the pf as a deictic pointer and suggest that it could also be stored by keeping the gaze, or even a finger, at the relevant location [52]). second, the algorithm must compare the colors at different locations. this can be done by binding the relevant location colors to separate variables and then comparing them in the way described in section a.
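as a purely symbolic illustration (outside any neural substrate), the two primitives just mentioned, retrieving the color at the attended location and comparing colors bound to variables, could be combined as follows to locate the first block of the target column that does not match the goal. the list representation and the helper name are hypothetical, not the authors' implementation.

```python
# symbolic paraphrase of the "check the target column" step (hypothetical names)
def first_mismatch(target_column, goal_column):
    """scan the target column against the goal from the bottom up.

    target_column / goal_column: lists of color names, index 0 = bottom block.
    returns the height of the first difference, or None if the column matches
    the goal up to its current height.
    """
    for height, goal_color in enumerate(goal_column):
        if height >= len(target_column):
            return height                 # column is shorter than the goal: build here
        pf = height                       # move the processing focus to this location
        color_here = target_column[pf]    # "retrieve color" at the attended location
        if color_here != goal_color:      # compare the two bound color variables
            return height                 # blocks above this point must be removed
    return None

print(first_mismatch(["red", "blue"], ["red", "green", "blue"]))  # -> 1
```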
figure 6: mean firing rate of a subset of neuronal populations involved in the resolution of an instance of bw. the rate is normalized between 0 (white) and 1 (black). the horizontal axis represents the elapsed time in arbitrary units. at the bottom, we show a subset of intermediate bw configurations, aligned to the execution time of the motor commands which led to these configurations.

as the goal is to pile all the blocks in the correct order in a given target column, a possible first step towards the goal is to compare the target column with the goal configuration from the bottom to the top. this can be done by chaining several movements of the processing focus with color retrievals and subsequent comparisons. once the first difference is found, the upper blocks of the target column must be moved away in order to replace the differently colored block with the correct one. this process is carried out using several motor productions. once the target column is free to receive the correctly colored block, that color must be searched for in the remaining columns. this is done as described earlier in this section. once found, the pf can be moved there in order to check whether there are blocks above it. if there are, motor productions must be chained in order to free the desired block and move it to the target column. after this is done, the program can loop back to comparing the target column with the goal configuration and iteratively solve the problem. our neuronal implementation chains the productions in a similar way to the one described above and elicits a complex activity pattern (fig. 6). a detailed explanation of the implementation can be found in the supplementary material [53].

iv. conclusions

here we presented ideas aimed at bridging the gap between the neurophysiology of simple decisions and the control of sequential operations. our framework proposes a specific set of mechanisms by which multi-step computations can be controlled by neural circuits. action selection is determined by a parallel competition amongst competing neurons which slowly accumulate sensory and mnemonic evidence until a threshold is reached. actions are conceived in a broad sense, as they can result in the activation of motor circuits or of other brain circuits not directly involved in a movement of the eyes or the limbs. thresholding the action of the productions results in discrete changes to a meta-stable network. these discrete steps clean up the noise and enable a logical flow of the computation. comprehending the electrophysiological mechanisms of seriality is hindered by the intrinsic difficulty of training non-human primates in complex multi-step tasks. the ideas presented in this report may serve to guide the experimental search for the mechanisms required to perform tasks involving multiple steps. neurons integrating evidence towards a threshold should be observed even in the absence of an overt response, for instance in the frontal eye fields of awake monkeys for the control of attention. memory neurons should show fast transitions between metastable states, on average every ∼ 100-250 msec, compatible with the mean time between successive productions in act-r [8]. as mentioned, we do not address how the productions, and the order in which they are executed, are learned.
there is a vast literature, for instance in reinforcement learning [54–56] describing how to learn the sequence of actions required to solve a task. deahene & changeaux [57] showed how a neuronal network can solve a task similar to the bw that we modelled here, but where the order in which productions fire was controlled by the distance from the game state to the goal. instead, our aim here was to investigate how the algorithm (the pseudo-code) may be implemented in neuronal circuits —once it has already been learned— from a small set of generic principles. the operation of the proposed neuronal device in a simple arithmetic task and in a neuronal network capable of solving any instance of a restricted blocks-world domain illustrates the plausibility of our framework for the control of computations involving multiple steps. acknowledgements az was supported by a fellowship of the peruilh foundation, faculty of engineering, universidad de buenos aires. lp was supported by a fellowship of the national research council of argentina (conicet). prr was supported by a netherlands organization for scientific research (nwo)-vici grant, a grant of the nwo excellence program brain and cognition, and a human frontier science program grant. [1] m platt, p glimcher, neural correlates of decision variables in parietal cortex, nature 400, 233 (1999). [2] j d schall, neural basis of deciding, choosing and acting, nat. rev. neurosci. 2, 33 (2001). [3] j i gold, m n shadlen, the neural basis of decision making, annu. rev. neurosci. 30, 535 (2007). [4] p r roelfsema, p s khayat, h spekreijse, subtask sequencing in the primary visual cortex., p. natl. acad. sci. usa 100, 5467 (2003). [5] r romo, e salinas, flutter discrimination: neural codes, perception, memory and decision making, nat. rev. neurosci. 4, 203 (2003). [6] s i moro, m tolboom, p s khayat, p r roelfsema, neuronal activity in the visual cortex reveals the temporal order of cognitive operations, j. neurosci. 30, 16293 (2010). [7] a newell, unified theories of cognition, harvard university press, cambridge, massachusetts (1990). [8] j r anderson, c j lebiere, the atomic components of thought, lawrence erlbaum, mahwah, new jersey (1998). 050006-11 papers in physics, vol. 5, art. 050006 (2013) / a. zylberberg et al. [9] s ullman, visual routines, cognition 18, 97 (1984). [10] a newell, productions systems: models of control structures, in: visual information processing, ed. w g chase, pag. 463, academic press, new york (1973). [11] s dehaene, m sigman, from a single decision to a multi-step algorithm, curr. opin. neurobio. 22, 937 (2012). [12] j gottlieb, p balan, attention as a decision in information space, trends cogn. sci. 14, 240 (2010). [13] j d roitman, m n shadlen, response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task, j. neurosci. 22, 9475 (2002). [14] m n shadlen, w t newsome, motion perception: seeing and deciding, p. natl. acad. sci. usa 93, 628 (1996). [15] y huang, a friesen, t hanks, m shadlen, r rao, how prior probability influences decision making: a unifying probabilistic model, in: advances in neural information processing systems 25, eds. p bartlett, f c n pereira, c j ca l burges, l bottou, k q weinberger, pag. 1277, lake tahoe, nevada (2012). [16] l p sugrue, g s corrado, w t newsome, matching behavior and the representation of value in the parietal cortex, science 304, 1782 (2004). 
[17] j d wallis, k c anderson, e k miller, single neurons in prefrontal cortex encode abstract rules, nature 411, 953 (2001). [18] j von neumann, the computer and the brain, yale university press, new haven, connecticut (1958). [19] a zylberberg, s dehaene, p r roelfsema, m sigman, the human turing machine: a neural framework for mental programs, trends cogn. sci. 15, 293 (2011). [20] g maimon, j a assad, a cognitive signal for the proactive timing of action in macaque lip, nat. neurosci. 9, 948 (2006). [21] m n shadlen, r kiani, t d hanks, a k churchland, neurobiology of decision making an intentional framework, in: better than conscious?, eds. c engel, w singer, pag. 71, mit press, massachusetts (2008). [22] a zylberberg, s dehaene, g b mindlin, m sigman, neurophysiological bases of exponential sensory decay and top-down memory retrieval: a model, front. comput. neurosci. 3, 4 (2009). [23] g mongillo, o barak, m tsodyks, synaptic theory of working memory, science 319, 1543 (2008). [24] r c o’reilly, biologically based computational models of high-level cognition, science 314, 91 (2006). [25] l shastri, v ajjanagadde, et al., from simple associations to systematic reasoning: a connectionist representation of rules, variables and dynamic bindings using temporal synchrony, behav. brain sci. 16, 417 (1993). [26] r hahnloser, r j douglas, m mahowald, k hepp, feedback interactions between neuronal pointers and maps for attentional processing, nat. neurosci. 2, 746 (1999). [27] x j wang, introduction to computational neuroscience, technical report volen center for complex systems, brandeis university, waltham, massachusetts (2006). [28] c carr, m konishi, a circuit for detection of interaural time differences in the brain stem of the barn owl, j. neurosci. 10, 3227 (1990). [29] j slaney, s thiébaux, blocks world revisited, artif. intell. 125, 119 (2001). [30] p r roelfsema, v a lamme, h spekreijse, the implementation of visual routines, vision res. 40, 1385 (2000). [31] p r roelfsema, elemental operations in vision, trends cogn. sci. 9, 226 (2005). 050006-12 papers in physics, vol. 5, art. 050006 (2013) / a. zylberberg et al. [32] s dehaene, j p changeux, development of elementary numerical abilities: a neuronal model, j. cognitive neurosci. 5, 390 (1993). [33] m piazza, v izard, p pinel, d le bihan, s dehaene, tuning curves for approximate numerosity in the human intraparietal sulcus, neuron 44, 547 (2004). [34] a nieder, s dehaene, representation of number in the brain, annu. rev. neurosci. 32, 185 (2009). [35] c lebiere, the dynamics of cognition: an act-r model of cognitive arithmetic, doctoral dissertation. carnegie mellon university, pittsburgh, pennsylvania (1998). [36] d y ts’o, c d gilbert, t n wiesel, relationships between horizontal interactions and functional architecture in cat striate cortex as revealed by cross-correlation analysis, j. neurosci. 6, 1160 (1986). [37] b a mcguire, c d gilbert, p k rivlin, t n wiesel, targets of horizontal connections in macaque primary visual cortex, j. comp. neurol. 305, 370 (1991). [38] c d gilbert, y daniel, t n wiesel, lateral interactions in visual cortex, in: from pigments to perception, eds. a valberg, b b lee, pag. 239, plenun press, new york (1991). [39] m sigman, g a cecchi, c d gilbert, m o magnasco, on a common circle: natural scenes and gestalt rules, p. natl. acad. sci. usa 98, 1935 (2001). [40] c d gilbert, m sigman, r e crist, the neural basis of perceptual learning, neuron 31, 681 (2001). 
[41] c d gilbert, m sigman, brain states: topdown influences in sensory processing, neuron 54, 677 (2007). [42] m k kapadia, m ito, c d gilbert, g westheimer, improvement in visual sensitivity by changes in local context: parallel studies in human observers and in v1 of alert monkeys, neuron 15, 843 (1995). [43] v a lamme, p r roelfsema, the distinct modes of vision offered by feedforward and recurrent processing., trends neurosci. 23, 571 (2000). [44] s thorpe, d fize, c marlot, speed of processing in the human visual system, nature 381, 520 (1996). [45] d j felleman, d c van essen, distributed hierarchical processing in the primate cerebral cortex, cereb. cortex 1, 1 (1991). [46] g sperling, the information available in brief visual presentations, psychol. monogr. gen. a. 74, 1 (1960). [47] m graziano, m sigman, the dynamics of sensory buffers: geometric spatial and experience-dependent shaping of iconic memory, j. vision 8, 1 (2008). [48] f hamker, the role of feedback connections in task-driven visual search, in: connectionist models in cognitive neuroscience, eds. d heinke et al., pag. 252, springer–verlag, london (1999). [49] f h hamker, a dynamic model of how feature cues guide spatial attention, vision res. 44, 501 (2004). [50] d heinke, g w humphreys, attention, spatial representation, and visual neglect: simulating emergent attention and spatial memory in the selective attention for identification model (saim)., psychol. rev. 110, 29 (2003). [51] t shallice, specific impairments of planning, phil. trans. r. soc. lond. b 298, 199 (1982). [52] d h ballard, m m hayhoe, p k pook, r p n rao, deictic codes for the embodiment of cognition, behav. brain sci. 20, 723 (1997). [53] a zylberberg, l paz, p r roelfsema, s dehaene, m sigman, supplementary material to this paper, available at www.papersinphysics.org (2013). [54] r s sutton, a g barto, reinforcement learning: an introduction, mit press, massachusetts (1998). 050006-13 papers in physics, vol. 5, art. 050006 (2013) / a. zylberberg et al. [55] p r roelfsema, a van ooyen, t watanabe, perceptual learning rules based on reinforcers and attention., trends cogn. sci. 14, 64 (2010). [56] j o rombouts, s m bohte, p r roelfsema, neurally plausible reinforcement learning of working memory tasks, in: advances in neural information processing systems 25, eds. p bartlett, f c n pereira, c j ca l burges, l bottou, k q weinberger, pag. 1880, lake tahoe, nevada (2012). [57] s dehaene, j p changeux, a hierarchical neuronal network for planning behavior, p. natl. acad. sci. usa 94, 13293 (1997). 050006-14 papers in physics, vol. 6, art. 060013 (2014) received: 9 october 2014, accepted: 13 november 2014 edited by: g. c. barker licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060013 www.papersinphysics.org issn 1852-4249 sequential evacuation strategy for multiple rooms toward the same means of egress d. r. parisi,1, 2∗ p. a. negri2, 3† this paper examines different evacuation strategies for systems where several rooms evacuate through the same means of egress, using microscopic pedestrian simulation. as a case study, a medium-rise office building is considered. it was found that the standard strategy, whereby the simultaneous evacuation of all levels is performed, can be improved by a sequential evacuation, beginning with the lowest floor and continuing successively with each one of the upper floors after a certain delay. 
the importance of the present research is that it provides the basis for the design and implementation of new evacuation strategies and alarm systems that could significantly improve the evacuation of multiple rooms through a common means of escape. i. introduction a quick and safe evacuation of a building when threats or hazards are present, whether natural or man-made, is of enormous interest in the field of safety design. any improvement in this sense would increase evacuation safety, and a greater number of lives could be better protected when fast and efficient total egress is required. evacuation from real pedestrian facilities can have different degrees of complexity due to the particular layout, functionality, means of escape, occupation and evacuation plans. during the last two decades, modeling and simulation of pedestrian ∗e-mail: dparisi@itba.edu.ar †e-mail: pnegri@uade.edu.ar 1 instituto tecnológico de buenos aires, 25 de mayo 444, 1002 ciudad autónoma de buenos aires, argentina. 2 consejo nacional de investigaciones cient́ıficas y técnicas, av. rivadavia 1917, 1033 ciudad autónoma de buenos aires, argentina. 3 universidad argentina de la empresa, lima 754, 1073 ciudad autónoma de buenos aires, argentina. movements have developed into a new approach to the study of this kind of system. basic research on evacuation dynamics has started with the simplest problem of evacuation from a room through a single door. this “building block” problem of pedestrian evacuation has extensively been studied in the bibliography, for example, experimetally [1, 2], or by using the social force model [3–5], and cellular automata models [6–8], among many others. as a next step, we propose investigating the egress from multiple rooms toward a single means of egress, such as a hallway or corridor. examples of this configuration are schools and universities where several classrooms open into a single hallway, cinema complexes, museums, office buildings, and the evacuation of different building floors via the same staircase. the key variable in this kind of system is the timing (simultaneity) at which the different occupants of individual rooms go toward the common means of egress. clearly, this means of egress has a certain capacity that can be rapidly exceeded if all rooms are evacuated simultaneously and thus, the total evacuation time can be suboptimal. so, it is valid to ask in what order the different 060013-1 papers in physics, vol. 6, art. 060013 (2014) / d. r. parisi et al. rooms should be evacuated. the answer to this question is not obvious. depending on the synchronization and order in which the individual rooms are evacuated, the hallway can be saturated in different sectors, which could hinder the exit from some rooms and thus, the corresponding flow rate of people will be limited by the degree of saturation of the hallway. this is because density is a limitation for speed. the relationship between density and velocity in a crowd is called “fundamental diagram of pedestrian traffic” [9–14]. therefore, the performance of the egress from each room will depend on the density of people in the hallway, which is difficult to predict from analytical methods. this type of analysis is limited to simple cases such as simultaneous evacuation of all rooms, assuming a maximum degree of saturation on the stairs. an example of an analytical resolution for this simple case can be seen in ref. [12], on chapter 3-14, where the egress from a multistory building is studied. 
from now on, we will analyze a 2d version of this particular case: an office building with 7 floors being evacuated through the same staircase, which is just an example of the general problem of several rooms evacuating through a common means of egress.

i. description of the evacuation process

the evacuation process comprises two periods: e1, the reaction time, indicating the time period between the onset of a threat or incident and the instant when the occupants of the building begin to evacuate; and e2, the evacuation time itself, measured from the beginning of the egress, when the first person starts to exit, until the last person is able to leave the building. e1 can be subdivided into: the time to detect the danger, to report to the building manager, the decision-making of the person responsible for starting the evacuation, and the time it takes to activate the alarm. these times are of variable duration depending on the usage given to the building, the day and time of the event, the occupants' training, the proper functioning of the alarm system, etc. because period e1 takes place before the alarm system is triggered, it must be separated from period e2. the duration of e1 is the same for the whole building. in consequence, only the evacuation process itself, described as period e2, is considered in the present study. the total time of a real complete evacuation will necessarily be longer, depending on the duration of e1.

ii. hypotheses

this subsection defines the scope and conditions that are assumed for the system.

1. the study only considers period e2 (the evacuation process itself) described in subsection i.i above.
2. all floors have the same priority for evacuation. the case in which there is a fire on some intermediate floor is not considered.
3. the main aspect to be analyzed is the movement of people who follow the evacuation plan. other aspects of safety such as types of doors, materials, electrical installation, ventilation system, storage of toxic products, etc., are not included in the present analysis.
4. after the alarm is triggered on each floor, the egress begins under conditions similar to those of a fire drill, namely:
• people walk under normal conditions, without running.
• if high densities are produced, people wait without pushing.
• exits are free and the doors are wide open.
• the evacuation plan is properly signaled.
• people start to evacuate when the alarm is activated on their own floor, following the evacuation signals.
• there is good visibility.

ii. simulations

i. the model

the physical model implemented is the one described in [15], which is a modification of the social force model (sfm) [3]. this modification allows a better approximation to the fundamental diagram of ref. [12], commonly used in the design of pedestrian facilities. the sfm is a continuous-space and force-based model that describes the dynamics considering the forces exerted on each particle ($p_i$). its newton equation reads

$m_i \mathbf{a}_i = \mathbf{f}_{di} + \mathbf{f}_{si} + \mathbf{f}_{ci}$,   (1)

where $\mathbf{a}_i$ is the acceleration of particle $p_i$. the equations are solved using standard molecular dynamics techniques. the three forces are the "driving force" ($\mathbf{f}_{di}$), the "social force" ($\mathbf{f}_{si}$) and the "contact force" ($\mathbf{f}_{ci}$). the corresponding expressions are as follows:

$\mathbf{f}_{di} = m_i \, (v_{di}\,\mathbf{e}_i - \mathbf{v}_i)/\tau$,   (2)

where $m_i$ is the particle mass, and $\mathbf{v}_i$ and $v_{di}$ are the actual velocity and the desired velocity magnitude, respectively.
$\mathbf{e}_i$ is the unit vector pointing to the desired target (particles inside the corridors or rooms have their targets located at the closest position on the line of the exit door), and $\tau$ is a constant related to the time needed for the particle to achieve $v_d$.

$\mathbf{f}_{si} = \sum_{j=1,\, j \neq i}^{n_p} a \, \exp\!\left(-\epsilon_{ij}/b\right) \mathbf{e}^{n}_{ij}$,   (3)

with $n_p$ being the total number of pedestrians in the system, $a$ and $b$ constants that determine the strength and range of the social interaction, $\mathbf{e}^{n}_{ij}$ the unit vector pointing from particle $p_j$ to $p_i$ (this direction is the "normal" direction between two particles), and $\epsilon_{ij}$ defined as

$\epsilon_{ij} = r_{ij} - (r_i + r_j)$,   (4)

where $r_{ij}$ is the distance between the centers of $p_i$ and $p_j$, and $r_i$, $r_j$ are the corresponding particle radii.

$\mathbf{f}_{ci} = \sum_{j=1,\, j \neq i}^{n_p} \left[ (-\epsilon_{ij}\, k_n)\, \mathbf{e}^{n}_{ij} + (v^{t}_{ij}\, \epsilon_{ij}\, k_t)\, \mathbf{e}^{t}_{ij} \right] g(\epsilon_{ij})$,   (5)

where the tangential unit vector ($\mathbf{e}^{t}_{ij}$) indicates the corresponding perpendicular direction, $k_n$ and $k_t$ are the normal and tangential elastic restorative constants, $v^{t}_{ij}$ is the tangential projection of the relative velocity seen from $p_j$ ($\mathbf{v}_{ij} = \mathbf{v}_i - \mathbf{v}_j$), and the function $g(\epsilon_{ij})$ is $g = 1$ if $\epsilon_{ij} < 0$ and $g = 0$ otherwise.

because this version of the sfm does not provide any self-stopping mechanism for the particles, it cannot reproduce the fundamental diagram of pedestrian traffic, as shown in ref. [15]. in consequence, the modification consists in providing virtual pedestrians with a way to stop pushing other pedestrians. this is achieved by incorporating a semicircular respect area close to and ahead of the particle ($p_i$). while any other pedestrian is inside this area, the desired velocity of pedestrian $p_i$ is set equal to zero ($v_{di} = 0$). for further details and benefits of this modification to the sfm, we refer the reader to ref. [15].

the kind of model used allows one to define the pedestrian characteristics individually. following the standard pedestrian dynamics bibliography (see, for example, [3–5, 15]), we considered independent and uniformly distributed values within the ranges: pedestrian mass $m \in$ [70 kg, 90 kg]; shoulder width $d \in$ [48 cm, 56 cm]; desired velocity $v_d \in$ [1.1 m/s, 1.5 m/s]; and the constant values are: $\tau$ = 0.5 s, $a$ = 2000 n, $b$ = 0.08 m, $k_n = 1.2 \times 10^5$ n/m, $k_t = 2.4 \times 10^5$ kg/m/s. beyond the microscopic model, pedestrian behavior simply consists in moving toward the exit of the room and then toward the exit of the hallway, following the evacuation plan. from the simulations, all the positions and velocities of the virtual pedestrians were recorded every 0.1 second. from these data, it is possible to calculate several outputs; in the present work we focused on evacuation times.

ii. definition of the system under study

as a case study, we have chosen that of a medium-rise office building with n = 7, n being the number of floors. this system was studied analytically in chapter 3-14 of ref. [12], only for the case of simultaneous evacuation of all floors. the building has two fire escapes in a symmetric architecture. at each level, there are 300 occupants. exploiting the symmetric configuration, we will only consider the egress of 150 persons toward one of the stairs. thus, on each floor, 150 people are initially placed along the central corridor, which is 1.2 m wide and 45 m long. in total, 1050 pedestrians are considered for simulating the system.

figure 1: schematic of the two-dimensional system to be simulated. each black dot indicates one person.
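as an illustrative translation of the model of eqs. (1)–(5) and the respect-area rule, the sketch below evaluates the force on one pedestrian with the parameter values quoted above. it is only a sketch under our own assumptions: the respect area is reduced to a boolean flag per pedestrian, no neighbor lists or integration scheme are included, and the variable names are ours, not the authors'.

```python
import numpy as np

# constants quoted in the text (si units)
TAU, A, B = 0.5, 2000.0, 0.08
KN, KT = 1.2e5, 2.4e5

def sfm_force(i, pos, vel, radius, mass, v_des, target, blocked):
    """total force on pedestrian i, eqs. (1)-(5); blocked[i] stands in for the
    semicircular respect area: if it is occupied, the desired speed is zero."""
    e_i = target[i] - pos[i]
    e_i = e_i / np.linalg.norm(e_i)                  # unit vector toward the exit
    vd = 0.0 if blocked[i] else v_des[i]
    f = mass[i] * (vd * e_i - vel[i]) / TAU          # driving force, eq. (2)
    for j in range(len(pos)):
        if j == i:
            continue
        rij_vec = pos[i] - pos[j]
        rij = np.linalg.norm(rij_vec)
        en = rij_vec / rij                           # "normal" unit vector, j -> i
        eps = rij - (radius[i] + radius[j])          # eq. (4)
        f += A * np.exp(-eps / B) * en               # social force, eq. (3)
        if eps < 0.0:                                # g = 1 only when the disks overlap
            et = np.array([-en[1], en[0]])           # tangential direction
            vt = np.dot(vel[i] - vel[j], et)         # relative tangential velocity
            f += -eps * KN * en + vt * eps * KT * et # contact force, eq. (5)
    return f
```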
for the sake of simplicity, we define a two-dimensional version of the building, where the central corridors of all the floors and the staircase are considered to be on the same plane, as shown in fig. 1. the central corridors can be identified with the "rooms" of the general problem described in section i., and the staircase is the common means of egress. the effective width of the stairway is 1.4 m. the central corridors of consecutive floors are separated by 10.66 m. this separation arises from adding the horizontal distance of the steps and the landings between floors in the 3d system [12]. thus, the distance between two floors in the 2d version of the problem has the same length as the horizontal distance that a person would walk between two floors along the stairway in the 3d building.

iii. evacuation strategies

the objective of proposing a strategy in which different floors start their evacuation at different times is to investigate whether this method allows an improvement over the standard procedure, which is the simultaneous evacuation of all floors. the parameters to be varied in the study are the following:

a) the order in which the different levels are evacuated. in this sense, we study two procedures: a.1) "bottom-up", indicating that the evacuation begins on the lowest (1st) floor and then proceeds in order to the floors immediately above; a.2) "top-down", indicating that the evacuation begins on the top floor (7th, in this case) and continues to the next lower floor, until the 1st floor is finally evacuated.

b) the time delay dt between the start of the evacuation of two consecutive floors. this could be implemented in a real system through a segmented alarm system for each floor, which triggers the start of the evacuation independently for the corresponding floor.

the initial time, when the first fire alarm is triggered in the building, is defined as $t_0$. the instant $t^{f}_{0\,\{bu,td,se\}}$ indicates the time when the alarm is activated on floor $f$; the subindices $\{bu, td, se\}$ are set if the time belongs to the bottom-up, top-down, or simultaneous evacuation strategy, respectively. the bottom-up strategy establishes that the 1st floor is evacuated first: $t^{1}_{0\,bu} = t_0$. then the alarm on the 2nd floor is triggered after dt seconds, $t^{2}_{0\,bu} = t^{1}_{0\,bu} + dt$, and so on in ascending order up to the 7th floor. in general, the time when the alarm is triggered on floor $f$ can be calculated as

$t^{f}_{0\,bu} = t_0 + dt \times (f - 1)$.   (6)

the top-down strategy begins the building evacuation on the top floor (7th, in this case): $t^{7}_{0\,td} = t_0$. after a time dt, the evacuation of the floor immediately below starts, and so on until the evacuation of the 1st floor:

$t^{f}_{0\,td} = t_0 + dt \times (n - f)$.   (7)

simultaneous evacuation is the special case in which dt = 0 and thus all the floor alarms are triggered at the same time:

$t^{f}_{0\,se} = t_0 |_{f=1,2,\ldots,7}$.   (8)

iii. results

this section presents the results of simulations made by varying the strategy and the time delay between the beginning of the evacuation of the different levels. each configuration was simulated five times and, thus, the mean values and standard deviations are reported. this is consistent with reality, because if a drill is repeated in the same building, the total evacuation times will not be exactly the same.

i. metrics definition

here we define the metrics that will be used to quantify the efficiency of the evacuation process of the system under study.
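before defining the metrics, note that the three activation schedules of eqs. (6)–(8) reduce to a single line per strategy; a minimal helper with hypothetical names (dt in seconds) might read:

```python
def alarm_times(n_floors=7, dt=30.0, strategy="bottom-up", t0=0.0):
    """alarm activation time for each floor, following eqs. (6)-(8)."""
    if strategy == "bottom-up":                       # eq. (6)
        return [t0 + dt * (f - 1) for f in range(1, n_floors + 1)]
    if strategy == "top-down":                        # eq. (7)
        return [t0 + dt * (n_floors - f) for f in range(1, n_floors + 1)]
    return [t0] * n_floors                            # simultaneous case, eq. (8)

print(alarm_times(dt=30.0, strategy="bottom-up"))
# -> [0.0, 30.0, 60.0, 90.0, 120.0, 150.0, 180.0]
```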
the total evacuation time (tet) is the time, measured from $t_0$, at which everyone in the building (150 × 7 = 1050 persons) has reached the exit located on the ground floor (see fig. 1), which means that the building is completely evacuated. the f-th floor evacuation time ($fet_f$) refers to the time elapsed from the start of the evacuation of floor $f$ until its 150 occupants reach the staircase. it must be noted that this evacuation time does not consider the time elapsed between the access to the staircase and the general exit from the building, nor does it take as starting time the time at which the evacuation of some other level, or of the building in general, begins; it only considers the beginning of the evacuation of the current floor. the average floor evacuation time (fet) is the average of the seven $fet_f$. from these definitions, it follows that tet > $fet_f$ for any floor (even the lowest one).

ii. simultaneous evacuation strategy

in general, the standard methodology consists in evacuating all the floors, which have the same priority, at the same time. under these conditions, the capacity of the stairs saturates quickly, and so all floors have a slow evacuation. figure 2 shows a snapshot from one simulation of this strategy. here, the profile of the queues at each level can be observed. the differences in the length of the queues are due to differences in the temporal evolution of the density in front of each door.

figure 2: snapshot taken at 73 seconds after the start of the simultaneous evacuation, where queues of different lengths can be observed on each floor.

in this evacuation scheme, the first level that can be emptied is the 1st floor (105 ± 6 s) and the last one is the 6th floor (259 ± 3 s). the total evacuation time (tet) of the building for this configuration is 316 ± 8 s, and the mean floor evacuation time (fet) is 195 ± 55 s. for reference, the independent evacuation of a single floor toward the stairs was also simulated. it was found that the evacuation time of only one level toward the empty stair is 65 ± 4 s.

iii. bottom-up strategy

figure 3(a) shows the evacuation times for different time delays dt following the bottom-up strategy. it can be seen that the total evacuation time (tet) remains constant for time delays (dt) of up to 30 seconds. therefore, tet is the same as for the simultaneous evacuation strategy (dt = 0 s) in this range. it is worth noting that 30 seconds is approximately one half of the time needed to evacuate a floor if the staircase were empty. furthermore, the mean floor evacuation time (fet) declines as dt increases, reaching the asymptotic value of 65 seconds, which is the evacuation time of a single floor considering the empty stairway.

figure 3: tet and fet, obtained from simulations for different phase shifts (dt) following sequential evacuation: (a) bottom-up strategy, (b) top-down strategy. the symbols and error bars indicate one standard deviation.

as expected, if the levels are evacuated one at a time, with a time delay greater than the evacuation time of one floor, the system is at the limit of decoupled or independent levels. in these cases, tet increases linearly with dt.
since tet is the same for dt < 30 s and fet is significantly improved (it is reduced by half) for dt = 30 s, this phase shift can be taken as the best value, for this strategy, to evacuate this particular building. this result is surprising because the tet of the building is not affected by systematic delays (dt) at the start of the evacuation of each floor if dt ≤ 30 s, which reaches up to 180 seconds for the floor that further delays the start of the evacuation. more details can be obtained by looking at the discharge curves corresponding to one realization of the building egress simulation. the evacuation of the first 140 pedestrians (93%) of each floor is analyzed by plotting the occupation as a function of time in fig. 4 for three time delays between the relevant range dt �[0, 30]. for dt = 0 [fig. 4(a)] there is an initial transient of about 10 seconds in which every floor can be evacuated toward a free part of the staircase before reaching the congestion due to the evacuation of lower levels. after that, it can be seen that the egress time of different floors has important variations, the lower floors (1st and 2nd) being the ones that evacuate quicker and intermediate floors such as 5th and 6th the ones that take longer to evacuate. after an intermediate situation for dt = 15 s [fig. 4(b)] we can observe the population profiles for the optimum phase shift of dt = 30 in fig. 4(c). there, it can be seen that the first 140 occupants of different floors evacuate uniformly and very little perturbation from one to another is observed. in the curves shown in fig. 4, the derivative of the population curve is the flow rate, meaning that low slopes (almost horizontal parts of the curve such as the one observed in fig. 4(a) for the 5th floor between 40 and 100 s) can be identified with lower velocities and higher waiting time for the evacuating people. because of the fundamental diagram, we know that lower velocities indicate higher densities. in consequence, we can say that the greater the slope of the population curves, the greater the comfort of the evacuation (more velocity, less waiting time, less density). therefore, it is clear that the situation displayed in fig. 4(c) is much more comfortable than the one in fig. 4(a). in short, for the bottom-up strategy, the time delay dt = 30 s minimizes the perturbation among 060013-6 papers in physics, vol. 6, art. 060013 (2014) / d. r. parisi et al. (a) (b) (c) figure 4: time evolution of the number of pedestrians in each floor up to 3 m before the exit to the staircase. (a) for the simultaneous evacuation (dt = 0); (b) for delay of dt = 15 s and (c) for dt = 30 s. evacuating pedestrians from successive levels; it reduces fet to one half of the simultaneous strategy (dt = 0 s); it maintains the total evacuation time (tet ) at the minimum and, overall, it exploits the maximum capacity of the staircase maintaining each pedestrian’s evacuation time at a minimum. this result is highly beneficial for the general system and for each floor, because it can avoid situations generating impatience due to waiting for gaining access to the staircase. iv. top-down strategy figure 3(b) shows the variation of tet and fet, as a function of the time delay dt, for the top-down strategy. it must be noted that tet increases monotonously for all dt, which is sufficient to rule out this evacuation scheme. in addition, for dt < 15 s, fet also increased, peaking at dt = 15 s. 
it can be said that for the system studied, the top-down strategy with a time delay of dt = 15 s leads to the worst case scenario. for 15 s < dt < 45 s, there is a change of regime in which fet decreases and tet stabilizes. for values of dt > 45 s, fet reaches the limit of independent evacuation of a single floor (see section iii.ii.). and the tet of the building increases linearly due to the increasing delays between the start of the evacuation of the different floors. in summary, the top-down strategy does not present any improvement with respect to the standard strategy of simultaneous evacuation of all floors (dt = 0). iv. conclusions in this paper, we studied the evacuation of several pedestrian reservoirs (“rooms”) toward the same means of egress (“hallway”). in particular, we focused on an example, namely, a multistory building in which different floors are evacuated toward the staircase. we studied various strategies using computer simulations of people’s movement. a new methodology, consisting in the sequential evacuation of the different floors (after a time delay dt) is proposed and compared to the commonly used strategy in which all the floors begin to evacuate simultaneously. for the system under consideration, the present study shows that if a strategy of sequential evacuation of levels begins with the evacuation of the 1st floor and, after a delay of 30 seconds (in this particular case, 30 s is approximately one half of the time needed to evacuate only one floor if the staircase were empty), it follows with the evacuation of the 2nd floor and so on (bottom-up strategy), the quality of the overall evacuation process improves. from the standpoint of the evacuation of the building, tet is the same as that for the reference state. however, if fet is considered, there is a significant improvement since it falls to about half. this will make each person more comfortable during an evacuation, reducing the waiting time and thus, the probability of causing anxiety that may bring undesirable consequences. 060013-7 papers in physics, vol. 6, art. 060013 (2014) / d. r. parisi et al. so, one important general conclusion is that a sequential bottom-up strategy with a certain phase shift can improve the quality of the evacuation of a building of medium height. on the other hand, the simulations show that the sequential top-down strategy is unwise for any time delay (dt). in particular, for the system studied, the value dt = 15 s leads to a very poor evacuation since the tet is greater than that of the reference, and it maximizes fet (which is also higher than the reference value at dt = 0). in consequence, the present study reveals that this would be a bad strategy that should be avoided. the perspectives for future work are to generalize this study to buildings with an arbitrary number of floors (tall buildings), seeking new strategies. we also intend to analyze strategies where some intermediate floor must be evacuated first (e.g., in case of a fire) and then the rest of the floors. the results of the present research could form the basis for developing new and innovative alarm systems and evacuation strategies aimed at enhancing the comfort and security conditions for people who must evacuate from pedestrian facilities, such us multistory buildings, schools, universities, and other systems in which several “rooms” share a common means of escape. acknowledgements this work was financially supported by grant pict2011 1238 (anpcyt, argentina). 
[1] t kretz, a grnebohm, m schreckenberg, experimental study of pedestrian flow through a bottleneck, j. stat. mech. p10014 (2006). [2] a seyfried, o passon, b steffen, m boltes, t rupprecht, w klingsch, new insights into pedestrian flow through bottlenecks, transport. sci. 43, 395 (2009). [3] d helbing, i farkas, t vicsek, simulating dynamical features of escape panic, nature 407, 487 (2000). [4] d r parisi, c dorso, microscopic dynamics of pedestrian evacuation, physica a 354, 608 (2005). [5] d r parisi, c dorso, morphological and dynamical aspects of the room evacuation process, physica a 385, 343 (2007). [6] a kirchner, a schadschneider, simulation of evacuation processes using a bionics-inspired cellular automaton model for pedestrian dynamics, physica a 312, 260 (2002). [7] c burstedde, k klauck, a schadschneider, j zittartz, simulation of pedestrian dynamics using a two-dimensional cellular automaton, physica a 295, 507 (2001). [8] w song, x xu, b h wang, s ni, simulation of evacuation processes using a multi-grid model for pedestrian dynamics, physica a 363, 492 (2006). [9] u weidmann, transporttechnik der eussgänger, transporttechnische eigenschaften des fussgängerverkehrs, zweite, ergänzte auflage, zürich, 90 (1993). [10] j fruin, pedestrian planning and design, the metropolitan association of urban designers and environmental planners, new york (1971). [11] a seyfried, b steffen, w klingsch, m boltes, the fundamental diagram of pedestrian movement revisited, j. stat. mech. p10002 (2005). [12] p j di nenno (ed.), sfpe handbook of fire protection engineering, society of fire protection engineers and national fire protection association (2002). [13] d helbing, a johansson, h al-abideen, dynamics of crowd disasters: an empirical study, phys. rev. e 75, 046109 (2007). [14] http://www.asim.uni-wuppertal.de/datab ase-new/data-from-literature/fundament al-diagrams.html, accessed november 27, 2014. [15] d r parisi, b m gilman, h moldovan, a modification of the social force model can reproduce experimental data of pedestrian flows in normal conditions, physica a 388, 3600 (2009). 060013-8 papers in physics, vol. 6, art. 060007 (2014) received: 14 june 2014, accepted: 7 october 2014 edited by: l. a. pugnaloni reviewed by: r. arévalo, school of physical and mathematical sciences, nanyang technological university, singapore. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060007 www.papersinphysics.org issn 1852-4249 jamming transition in a two-dimensional open granular pile with rolling resistance c. f. m. magalhães,1∗ a. p. f. atman,2, 3† g. combe,5 j. g. moreira4‡ we present a molecular dynamics study of the jamming/unjamming transition in twodimensional granular piles with open boundaries. the grains are modeled by viscoelastic forces, coulomb friction and resistance to rolling. two models for the rolling resistance interaction were assessed: one considers a constant rolling friction coefficient, and the other one a strain dependent coefficient. the piles are grown on a finite size substrate and subsequently discharged through an orifice opened at the center of the substrate. varying the orifice width and taking the final height of the pile after the discharge as the order parameter, one can devise a transition from a jammed regime (when the grain flux is always clogged by an arch) to a catastrophic regime, in which the pile is completely destroyed by an avalanche as large as the system size. 
a finite size analysis shows that there is a finite orifice width associated with the threshold for the unjamming transition, no matter the model used for the microscopic interactions. as expected, the value of this threshold width increases when rolling resistance is considered, and it depends on the model used for the rolling friction. i. introduction granular materials are ubiquitous either in nature —desert dunes, beach sand, soil, etc.— or in indus∗e-mail: cfmm@unifei.edu.br †e-mail: atman@dppg.cefetmg.br ‡e-mail: jmoreira@fisica.ufmg.br 1 universidade federal de itajubá campus de itabira, rua irmã ivone drumond, 200, 35900-000 itabira, brazil. 2 departamento de f́ısica e matemática, centro federal de educação tecnológica de minas gerais, av. amazonas, 7675, 30510-000 belo horizonte, brazil. 3 instituto nacional de ciência e tecnologia sistemas complexos, 30510-000 belo horizonte, brasil. 4 universidade federal de minas gerais, caixa postal 702, 30161-970 belo horizonte, brasil. 5 ujf-grenoble 1, grenoble-inp, cnrs umr 5521, 3sr lab. grenoble f-38041, france. trial processes as mineral extraction and processing, or food, construction and pharmaceutical industries [1–3]. in fact, any particulate matter made of macroscopic solid elements can be classified as granular material. the vast phenomenology exhibited by these systems combined with an incomplete understanding about the microscopic physical mechanisms responsible for the macroscopic behavior of these materials have motivated the increasing interest of the physics community in the past years [4, 5]. although materials of this class are not sensitive to thermal fluctuations, they can be found at gas, liquid or solid phases [6]. the transition between solid and liquid phases in granular matter, which is commonly referred to as jamming/unjamming transition, has been extensively studied from both theoretical and experimental perspectives [4, 6–11]. currently, a great effort is being made to under060007-1 papers in physics, vol. 6, art. 060007 (2014) / c. f. m. magalhães et al. stand the nature of this transition, which is still a subject of debate [12]. the jamming/unjamming transition is not a specific property of granular matter, being observed in many kinds of materials, such as foams [13], emulsions [14], colloids [15], gels [16], and also in usual molecular liquids [17] —glass transition. liu and nagel [9] proposed a general phase diagram as an attempt to unify the several approaches to study jamming/unjamming in disordered materials. this work has motivated several theoretical, experimental and numerical investigations, but a comprehensive understanding of this transition is still lacking. o’hern et al. [18, 19] have performed numerical simulations of granular materials approaching to jamming in two and three dimensions. they have explored the packing fraction axis of the general phase diagram proposed by liu and nagel and have demonstrated, by means of finite-size analysis, the existence of a unique critical point in which the system jams in the thermodynamic limit. the authors have also shown some evidence that this point is an ordinary critical point, indicating that the jamming transition would be a second order phase transition. these results were corroborated later by experiments (c.f. majmudar et al. [10]) and by simulations (c.f. manna and khakhar [20]). in ref. 
[20], the authors have revealed evidence of self-organized criticality (soc) by measuring the internal avalanches resulting from the opening of an orifice at the bottom of granular piles. experimental investigation of the jamming transition in granular materials under gravitational field has been conducted in a variety of ways, addressing the role of many parameters, like the grain shape, the friction coefficient, and the system geometry. however, a common feature of all these approaches is the analysis of the granular flow through bottlenecks. the jamming of three-dimensional piles seems to be settled after the work of zuriguel et al. [21]. they have demonstrated experimentally, for piles composed of different kinds of grains, the divergence on the mean internal avalanche size. it means that, as the outlet size approaches a critical value, the internal avalanche increases without limit and a permanent flow is established. this critical outlet is insensitive to the density, stiffness and roughness of the grains, but shows a significant dependence on the grain shape. for spherical grains, a critical outlet width wc ∼ 4.94d was obtained, for cylindrical grains wc ∼ 5.03d and for rice grains wc ∼ 6.15d. nevertheless, the jamming transition in twodimensional piles is still a question under debate. in order to address it, to et al. [22] have carried out experiments using two-dimensional hoppers in order to find a critical outlet size for jamming events. the jamming probability j has presented a rapid decay from j = 1 to j = 0 close to the aperture width w ∼ 3.8d, signaling a possible phase transition. the authors discussed, based on a restricted random walk model, the connection between jamming and the arch formation mechanism. nevertheless, the point needs further investigation, especially at the limit of high hopper angle. there exist several works [23–25] focused on the mechanisms of arch formation, but none had explored its relation to jamming probability. janda et al. [26] have made some progress by simulating discharges of two-dimensional silos. they have improved the definition of jamming so that the internal avalanche size is considered. this modification addresses the extremely long relaxation times associated with jamming. within the framework of a probabilistic theory concerning the arch formation, the authors have tested two hypothesis for the internal avalanche behavior: a functional form that predicts a divergence in the mean size, and a functional form where the mean size exists for all values of the orifice width w. since the latter one is more compatible with the arch formation model, the authors claimed that “no critical opening size exists beyond which there is not jamming” [26]. the results mentioned so far are related to fully confined systems. recently, a simulation study on discharges of granular piles with open boundaries [27] provided new insights on the problem. the piles are composed by homogeneous disks interacting via elastic and frictional forces. using finite size analysis, the authors have shown that a catastrophic regime, in which the pile is completely destroyed by the opening of the orifice, is well defined. at the limit of infinitely large systems, the catastrophic regime coincides with the unjammed phase, since it implies a divergent internal avalanche. hence, the results indicate the existence of the jamming transition. 
it is important to note, however, that the pile geometry could probably play a role in the causes for this distinct 060007-2 papers in physics, vol. 6, art. 060007 (2014) / c. f. m. magalhães et al. behavior, due to the absence of the janssen effect, but further investigations are necessary to confirm this point. in the present work, the investigation of jamming in 2d open systems is extended in order to consider a rolling resistance term in the grain interaction model, following the prescription adopted by chevalier et al. [28]. the main objective of this study is the verification of rolling resistance influence on jamming. many factors contribute to the appearance of rolling friction, including microsliding, plastic deformation, surface adhesion, grain shape, etc., but mainly, it is due to the contact deformation [29]. here, it will be taken into account only the effect due to the contact deformation by implementing the micromechanical model proposed by jiang et al. [30]. the rolling friction produces a resistance to roll which, among other effects, is responsible for granting more stability to granular piles [31], and for the occurrence of different types of failure modes in granular matter [32,33]. rolling friction was also used to model a system of polygonal grains by making a correspondence between the rolling stiffness and the number of sides of the polygon [34]. it was demonstrated that it is an essential ingredient to reproduce experimental compression tests in mixtures of two-dimensional circular and rectangular grains [35]. these facts suggest that jamming could be affected by rolling friction. nevertheless, most studies on granular materials based on computer simulations do not deal with it. the paper is organized as follows: after a review in the introduction, the next section is concerned with the methodology. then, we present the results and a brief discussion. finally, the last section gathers the conclusions and some perspectives. ii. methods the jamming transition is assessed by means of discharges of granular piles, simulated using the molecular dynamics method [36] with the velocityverlet algorithm [37]. in a few words, the molecular dynamics consists in integrating numerically the equations of motion that governs the system dynamics. the system is constituted by n free grains governed by newton’s second law and by a finite and horizontally aligned substrate made of fixed grains. the free grains, which will form the pile, are homogeneous bi-dimensional disks that are free to translate and rotate around their center, and whose radii are uniformly distributed around an average value d, a small polydispersity of 5% was imposed in order to avoid crystallization effects. all spatial quantities will be expressed in terms of d. since the grains have all the same mass density, their masses are proportional to their respective areas. normalized by the mass of the heaviest grain, the masses are given by mi = d 2 i /d 2 max, where dmax stands for the diameter of the largest grain. the finite substrate of length l is composed by fixed grains of the same kind but with a smaller and fixed diameter ds = 0.1d. these grains are aligned horizontally and are equally spaced in order to form a grid without gaps. the grains are subject to a uniform gravitational field orthogonal to the substrate line, and to short range binary interactions. 
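the grains are advanced with the velocity-verlet scheme mentioned above. the generic, textbook form of one update step is sketched below; it is not the authors' code, and the treatment of velocity-dependent (viscous) terms is simplified by evaluating the forces with the not-yet-updated velocities.

```python
import numpy as np

def velocity_verlet_step(pos, vel, acc, masses, force_fn, dt):
    """one generic velocity-verlet update; force_fn(pos, vel) must return the
    total force on every grain (gravity plus the binary contact interactions)."""
    pos_new = pos + vel * dt + 0.5 * acc * dt**2          # positions at t + dt
    acc_new = force_fn(pos_new, vel) / masses[:, None]    # forces at the new positions
    vel_new = vel + 0.5 * (acc + acc_new) * dt            # velocities at t + dt
    return pos_new, vel_new, acc_new
```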
besides the viscoelastic and coulomb friction interactions used in past works [25, 27, 38–42], two grains in contact are also subject to a rolling resistance moment due to the finite contact length $l_c$. the rolling resistance is introduced through a micromechanical model of the contact line between two grains [30]. this model treats the contact as an object formed by a set of springs and dashpots connecting the borders of the two grains. as one grain rolls over the other, the springs on one side of the contact line contract while the springs on the opposite side stretch. this configuration generates an unbalanced force distribution and a consequent moment with respect to the grain center (see fig. 1). this moment grows linearly as the grain rolls, until the rolling displacement $\delta_r$ reaches some threshold value at which the springs located near one end of the contact line break up and new ones emerge at the other end. at that point, the moment saturates at some value that depends on the properties of the grain. the rolling displacement is defined by $\delta_r = \sum (\omega_i - \omega_j)\,\Delta t$, where $\omega_i$ refers to the angular velocity of grain $i$, $\Delta t$ is the molecular dynamics time step, and the sum runs over time during the whole existence of the contact. based on these assumptions, the authors have derived an analytic expression for the rolling resistance moment as a function of the rolling displacement. they have also proposed a simplified version, the one used in this study, in order to improve numerical computations:

$\tau_{rr} = \begin{cases} -k_r\,\delta_r, & k_r|\delta_r| \le \mu_r f_{el} \\ -\dfrac{\delta_r}{|\delta_r|}\,\mu_r f_{el}, & k_r|\delta_r| > \mu_r f_{el} \end{cases}$   (1)

where $k_r$ is the rolling stiffness, $\mu_r$ is the coefficient of rolling resistance, and $f_{el}$ is the compressive elastic force, normal to the contact line. as can be noted, the expression for the rolling resistance moment bears a striking resemblance to the coulomb static friction force and, as a matter of fact, the rolling resistance interaction is implemented in the molecular dynamics algorithm in the same way as the static friction. the model predicts that the rolling stiffness and the coefficient of rolling resistance are related to the contact length $l_c$ by the equations $k_r = k_n l_c^2$ and $\mu_r = \mu l_c$, where $k_n$ and $\mu$ are, respectively, the normal elastic constant and the friction coefficient. the values used for these parameters were the same as in ref. [27], $\mu = 0.5$ and $k_n = 1000$ in normalized units (see [40] for further details). two cases were investigated: systems in which the rolling resistance parameters $k_r$ and $\mu_r$ vary according to the above-mentioned equations, and systems with fixed rolling resistance parameters, assuming that all contacts have the same deformation value.

the simulation procedure consists of two steps: (1) the formation of the granular pile with open boundaries, by depositing grains from rest, under gravity, over the substrate until a stationary state is reached; (2) the discharge itself, which consists in opening an orifice of a given width at the center of the substrate. in the first step, the initial positions of the free grains are randomly drawn along a horizontal line located at a height l above the substrate (the releasing height is equal to the substrate length). to avoid initial overlapping of grains, a 50% filling ratio was imposed on each line of grains released, and the time interval between successive rows is the inverse of the frequency f.
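as a compact restatement of eq. (1) together with the relations $k_r = k_n l_c^2$ and $\mu_r = \mu l_c$, a per-contact evaluation of the rolling resistance moment could be written as below. the function name is ours, and the fixed-parameter variant of the text corresponds to passing $l_c = 0.05\,d$ for every contact.

```python
def rolling_resistance_torque(delta_r, f_el, lc, kn=1000.0, mu=0.5):
    """rolling resistance moment of eq. (1); kn and mu are the normalized values
    quoted in the text. delta_r is accumulated over the life of the contact as
    delta_r += (omega_i - omega_j) * dt."""
    kr = kn * lc**2                   # rolling stiffness, kr = kn * lc^2
    mur = mu * lc                     # coefficient of rolling resistance, mur = mu * lc
    if kr * abs(delta_r) <= mur * f_el:
        return -kr * delta_r          # linear branch, analogous to static friction
    return -(delta_r / abs(delta_r)) * mur * f_el   # saturated branch
```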
each row was released after the predecessor had fallen a distance equivalent to the maximum grain diameter. this deposition protocol mimics a dense rain of grains. during deposition, the grains may leave the system through the lateral boundaries, so that the total number of grains in the pile fluctuates as the process evolves. the release of grains ceases when the number of grains in the pile reaches a stationary value. the deposition phase ends only when a mechanical equilibrium state is attained, and the configuration is recorded for later analysis. figure 1: in panel (a), a perfectly rigid grain rolling over another one is exhibited. the contact force is concentrated on one point and is aligned through the grain center, thus not generating a moment with respect to the center. panel (b) shows the same situation but with deformable grains. now, the contact force is non-uniformly distributed over a segment of length lc (the contact length). the resultant force ~fr,c is displaced from the center line and an opposing moment with respect to the center appears. the equilibrium state must satisfy the following criteria: mechanical stability, absence of slipping contacts, vertical and horizontal force balance, and vanishingly small kinetic energy [43]. in the second step, the recorded configuration is loaded and an outlet of width w is opened at the center of the substrate, allowing the grains to flow through it. as the grains pass through the outlet, they are removed from the system, in order to improve computational efficiency. the simulation runs until a new equilibrium state is reached, which happens either due to the formation of an arch above the outlet or after the pile has been completely discharged. the remaining pile configuration is then recorded. the average height h of the resulting pile after discharge, measured from the center of the substrate, was taken as an order parameter to distinguish between the two regimes. if h does not change significantly with respect to the original height, it means that an arch readily clogged the flow after the orifice was opened, but if h = 0, it means that the pile has collapsed and a jammed state was not attained for the corresponding orifice diameter and substrate length. figure 2: diagram of a possible equilibrium configuration if rolling resistance is considered. the grain on top has only one contact and, even so, it can sustain a stable position. the maximum value of the angle θ depends on the friction coefficient and on the rolling resistance parameters. as the orifice width w approaches a certain threshold value wt, the system undergoes a transition from the jammed to the unjammed state. this threshold is defined as the orifice width for which the height fluctuations are maximal. the connection to the jamming transition occurs when the substrate length diverges, since the collapse of an infinite-substrate pile implies a continuous flowing state. then, the jamming transition can be characterized by a critical aperture width, as defined by the following expression: wc = lim_{l→∞} wt(l). (2) iii. results and discussion two different models of rolling resistance interaction were tested in the simulations: the original model proposed by jiang et al.
[30] mentioned earlier, with a rolling constant and a coefficient of rolling resistance that depend on the contact length (kr = kn lc² and µr = µ lc), and a simpler derived version, in which these parameters are constant over time, assuming that lc = 0.05 d for all contacts. this fixed-lc model assumes a mean contact length equivalent to 5% of the average grain diameter, a scenario which could be associated with a system composed of polygonal grains, for example, and corresponds to an extreme rolling resistance regime. for either model, the numerical simulations described in the last section were carried out for various system sizes and orifice widths. figure 3: images of the pile equilibrium states for the three grain interaction models before the discharge step. the piles were grown over an l = 100d substrate and are representative samples from each type of pile. as shown in the figure, the top image represents a pile composed of grains with a fixed-kr rolling resistance interaction, the image in the middle is a pile of grains with a kr ∼ lc² rolling resistance interaction, and the bottom image is a pile of grains without rolling resistance. figure 3 shows equilibrium configurations of typical piles with l = 100d for the two tested rolling resistance models and for the case without rolling resistance. it can be seen that the inclusion of the rolling resistance term significantly modifies the macroscopic features of the pile. it provides more stability to the structure, which is reflected by a steeper free surface, a fact also observed elsewhere [31]. indeed, it should be noted that rolling resistance makes possible some otherwise very unstable two-grain configurations, as exemplified in fig. 2. the behavior of the order parameter h as a function of w is presented in fig. 4 for the three cases. figure 4: order parameter h as a function of the orifice width w for all types of pile. the graphs were generated from the simulation data of l = 100d piles. while the symbols represent the parameter h itself, the lines indicate the corresponding fluctuations. the thick line is related to the fixed-kr curve, the medium-thickness line to the kr ∼ lc² curve, and the dashed line to the kr = 0 curve. note that the transition region moves to the right as the rolling resistance becomes more important, which is an expected result since the increase in stability allows the formation of larger arches. figure 5 exhibits the dependence of wt (the fluctuation maximum) on the system size for the three models considered and, again, the curves are displaced along the wt axis. nevertheless, they all share the same functional aspect, which is evidence that the rolling resistance does not change the nature of the jamming transition, only the critical value of the threshold width. despite the tendency to a heavy-tailed distribution observed for large l and w, in all scenarios the data suggest the existence of the jamming transition. it means that the features described in ref. [27] to characterize the transition were also observed in both scenarios tested for the rolling resistance. the fitted values were wc = (5.3 ± 0.1)d for the contact-length-dependent kr model and wc = (7.6 ± 0.2)d for the fixed-kr one, while for the model without rolling resistance, wc = (5.0 ± 0.1)d [27].
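the extrapolation of eq. (2) amounts to fitting wt against 1/l and reading off the value at 1/l → 0; the sketch below uses invented data and the simplest possible fitting curve (a straight line in 1/l), which is only an assumption and not necessarily the functional form used for the fits in fig. 5.

```python
import numpy as np

# hypothetical data: threshold widths wt (units of d) for several substrate
# lengths l (units of d); the values are made up for illustration only.
l  = np.array([25.0, 50.0, 100.0, 200.0])
wt = np.array([3.9, 4.4, 4.7, 4.85])

# fit wt as a linear function of 1/l and read off the intercept, which
# estimates wc = lim_{l -> infinity} wt(l) of eq. (2).
slope, intercept = np.polyfit(1.0 / l, wt, 1)
print(f"estimated critical aperture width wc ~ {intercept:.2f} d")
```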
these results indicate that the rolling resistance, expressed by the contact length model, only slightly changes the critical width when included in a system of disks. but, in other approaches, as in the fixed-length model, it can alter the critical aperture width significantly. figure 5: graphs of the threshold orifice width wt as a function of the reciprocal system size 1/l for models with and without rolling resistance. as the legend indicates, the symbols represent the data obtained from numerical simulations and the lines are fitting curves, which provide the respective values of wc. the probability density functions (pdf) of h for the absent rolling friction and for the fixed rolling friction models are exhibited in fig. 6. for each model, the pdfs are obtained for three characteristic values of w: w = 1.0d, w ∼ wt, and w > wt. it can be noted that the pdfs have an approximately gaussian peak around the initial height for all values of w, meaning that there is always a certain amount of samples that are simply not disturbed by the outlet. apart from these samples, the great majority is completely discharged for w > wt. this result is in consonance with that obtained earlier in a different context [25]. in that study, it was shown that for large w, all blocked events occurred after only a few grains had passed through the outlet. these facts indicate that the initial conditions may play an important role in jamming experiments; however, this issue needs further investigation. iv. conclusions evidence of the jamming transition was observed in molecular dynamics simulations of open granular piles with rolling resistance, in consonance with the results found in granular piles without this interaction. figure 6: probability density functions of the order parameter h for different values of w in the case of grains without rolling friction (top) and grains with fixed rolling friction (bottom). this result strengthens the expectation that the transition exists in real granular piles and is probably affected by the system geometry. we observe that when rolling resistance was present, the piles built were more stable, as denoted by the larger mean height of the samples, and also more robust against perturbations, since the critical aperture width increased. in future works, we plan to present a detailed study of the arching statistics for the two approaches considered here. acknowledgements we are grateful to the cnpq and fapemig brazilian funding agencies. apfa and gc thank cefet-mg for the international interchange which made this interaction possible. [1] s j antony, w hoyle, y ding, granular materials: fundamentals and applications, the royal society of chemistry, cambridge (2004). [2] j duran, sands, powders and grains, springer, berlin (1997). [3] t halsey, a mehta, challenges in granular physics, world scientific publishing, new jersey (2002). [4] a yu, k dong, r yang, s luding, powders and grains 2013: proceedings of the 7th international conference on micromechanics of granular media, aip series, vol. 1542, sydney, australia (2013). [5] gdr midi, eur.
phys. j. e 14, 341 (2004). [6] h m jaeger, s r nagel, granular solids, liquids, and gases, rev. mod. phys. 68, 1259 (1996). [7] p cixous, e kolb, n gaudouen, j-c charme, jamming and unjamming by penetration of a cylindrical intruder inside a 2 dimensional dense and disordered granular medium, in: powders and grains 2009, proceedings of the 6th international conference on micromechanics of granular media 1145, 539 (2009). [8] a p f atman, p claudin, g combe, r mari, mechanical response of an inclined frictional granular layer approaching unjamming, europhys. lett. 101, 44006 (2013). [9] a j liu, s r nagel, jamming is not just cool any more, nature 396, 21 (1998). [10] t s majmudar, m sperl, s luding, r p behringer, jamming transition in granular systems, phys. rev. lett. 98, 058001 (2007). [11] c mankoc, a janda, r arévalo, j m pastor, i zuriguel, a garcimart́ın and d maza. the flow rate of granular materials through an orifice, granul. matter 9, 407 (2007). [12] a p f atman, p claudin, g combe, g h b martins, mechanical properties of inclined frictional granular layers, granul. matter 16, 1 (2014). [13] g katgert, m van hecke, jamming and geometry of two-dimensional foams, europhys. lett. 92, 34002 (2010). [14] n d denkov, s tcholakova, k golemanov, a lips, jamming in sheared foams and emulsions, explained by critical instability of the films between neighboring bubbles and drops, phys. rev. lett. 103, 118302 (2009). 060007-7 papers in physics, vol. 6, art. 060007 (2014) / c. f. m. magalhães et al. [15] a kumar, j wu, structural and dynamic properties of colloids near jamming transition, colloid. surf. a 247, 145151 (2004). [16] a fluerasu, a moussaid, a madsen, a schofield, slow dynamics and aging in colloidal gels studied by x-ray photon correlation spectroscopy, phys. rev. e 76, 010401 (2007). [17] b duplantier, t c halsey, v rivasseau, glasses and grains: poincaré seminar 2009, springer, basel (2011). [18] c s o’hern, s a langer, a j liu, s r nagel, random packings of frictionless particles, phys. rev. lett. 88, 075507 (2002). [19] c o’hern, l e silbert, a j liu, s r nagel, jamming at zero temperature and zero applied stress: the epitome of disorder, phys. rev. e 68, 011306 (2003). [20] s s manna, d v khakhar, internal avalanches in a granular medium, phys. rev. e 58, r6935 (1998). [21] i zuriguel, a garcimart́ın, d maza, l a pugnaloni, j m pastor, jamming during the discharge of granular matter from a silo, phys. rev. e 71, 051303 (2005). [22] k to, p-y lai, h k pak, jamming of granular flow in a two-dimensional hopper, phys. rev. lett. 86, 71 (2001). [23] a garcimart́ın, i zuriguel, l a pugnaloni, a janda, shape of jamming arches in twodimensional deposits of granular materials, phys. rev. e 82, 031306 (2010). [24] a drescher, a j waters, c a rhoades, arching in hoppers .2. arching theories and critical outlet size, powder technol. 84, 177 (1995). [25] c f m magalhães, a p f atman, j g moreira, segregation in arch formation, eur. j. phys. e 35, 38 (2012). [26] a janda, i zuriguel, a garcimart́ın, l a pugnaloni, d maza, jamming and critical outlet size in the discharge of a two-dimensional silo, europhys. lett. 84, 44002 (2008). [27] c f m magalhães, j g moreira, a p f atman, catastrophic regime in the discharge of a granular pile, phys. rev. e 82, 051303 (2010). [28] b chevalier, g combe, p villard, experimental and discrete element modeling studies of the trapdoor problem: influence of the macro-mechanical frictional parameters, acta geotech. 7, 15 (2012). 
[29] j ai, j-f chen, j m rotter, j y ooi, assessment of rolling resistance models in discrete element simulations, powder technol. 206, 269 (2011). [30] m j jiang, h-s yu, d harris, a novel discrete model for granular material incorporating rolling resistance, comput. geotech. 32, 340357 (2005). [31] y c zhou, b d wright, r y yang, b h xu, a b yu, rolling friction in the dynamic simulation of sandpile formation, physica a 269, 536 (1999). [32] k iwashita, m oda, rolling resistance at contacts in simulation of shear band development by dem, j. eng. mech. 124, 285292 (1998). [33] x li, x chu, y t feng, a discrete particle model and numerical modeling of the failure modes of granular materials, eng. computation. 22, 894 (2005). [34] n estrada, e azéma, f radjai, a taboada, identification of rolling resistance as a shape parameter in sheared granular media, phys. rev. e 84, 011306 (2011). [35] e-m charalampidou, g combe, g viggiani, j lanier, mechanical behavior of mixtures of circular and rectangular 2d particles, in: powders and grains 2009: proceedings of the 6th international conference on micromechanics of granular media, aip conf. proc., vol. 1145, pag. 821, (2009). [36] d c rapaport, the art of molecular dynamics simulation, cambridge university press, cambridge (2004). [37] w c swope, h c andersen, p h berens, k r wilson, computer simulation method for the 060007-8 papers in physics, vol. 6, art. 060007 (2014) / c. f. m. magalhães et al. calculation of equilibrium constants for the formation of physical clusters of molecules: application to small water clusters, j. chem. phys. 76, 637 (1982). [38] c goldenberg, a p f atman, p claudin, g combe, i goldhirsch, scale separation in granular packings: stress plateaus and fluctuations, phys. rev. lett. 96, 168001 (2006). [39] s f pinto, a p f atman, m s couto, s g alves, a t bernardes, h f v resende, e c souza, granular fingers on jammed systems: new fluidlike patterns arising in graingrain invasion experiments, phys. rev. lett. 99, 068001 (2007). [40] a p f atman, p claudin, g combe, departure from elasticity in granular layers: investigation of a crossover overload force, comput. phys. commun. 180, 612 (2009). [41] p a cundall, o d l strack, a discrete numerical model for granular assemblies, geotechnique 29, 47 (1979). [42] r d mindlin, compliance of elastic bodies in contact, j. appl. mech. 71, 259 (1949). [43] a p f atman, p brunet, j geng, g reydellet, g combe, p claudin, r p behringer, e clement, sensitivity of the stress response function to packing preparation, j. phys.: cond. matter 17, s2391 (2005). [44] h hertz, on the contact of elastic solids, j. reine angew. math. 92, 156 (1881). [45] m p allen, d j tildesley, computer simulation of liquids, clarendon press, oxford (1987). [46] o o’sullivan, computing quaternions, in: the art of numerical manipulation, eds. a q rista, m nadola, pag. 132, north holland, amsterdam (2003). 060007-9 papers in physics, vol. 8, art. 080002 (2016) received: 11 november 2015, accepted: 7 january 2016 edited by: o. mart́ınez licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.080002 www.papersinphysics.org issn 1852-4249 autonomous open-source hardware apparatus for quantum key distribution ignacio h. lópez grande,1 christian t. schmiegelow,2 miguel a. 
larotonda1∗ we describe an autonomous, fully functional implementation of the bb84 quantum key distribution protocol using open source hardware microcontrollers for the synchronization, communication, key sifting and real-time key generation diagnostics. the quantum bits are prepared in the polarization of weak optical pulses generated with light emitting diodes, and detected using a sole single-photon counter and a temporally multiplexed scheme. the system generates a shared cryptographic key at a rate of 365 bps, with a raw quantum bit error rate of 2.7%. a detailed description of the peripheral electronics for control, driving and communication between stages is released as supplementary material. the device can be built using simple and reliable hardware and it is presented as an alternative for a practical realization of sophisticated, yet accessible quantum key distribution systems. i. introduction the main goal of cryptography is to obtain a secure method to share information. this is usually achieved by the encryption of the data, using a shared cryptographic key. the security of the protocol then relies on the secrecy of this key. the distribution of a secret key is therefore a crucial task for any symmetric-key cryptographic algorithm. classically, this can be achieved using the diffie-hellman method, or some variation based on it [1]. quantum key distribution (qkd) protocols exploit the quantum no-cloning theorem [2] and the ∗e-mail: mlarotonda@citedef.gob.ar 1 deilap-unidef (citedef-conicet), j. b. de la salle 4397, b1603alo villa martelli, buenos aires, argentina. 2 laboratorio de iones y átomos fŕıos, departamento de f́ısica, facultad de ciencias exactas y naturales, universidad de buenos aires & ifiba-conicet, pabellón 1, ciudad universitaria, 1428 c.a.b.a., argentina. indistinguishability upon measurement of quantum states belonging to non-orthogonal, conjugate bases to accomplish secure distribution of cryptographic keys [3]. these features, combined with the fact that a measurement performed on a quantum system disturbs its original state in some manner, are the fundamental principles in which every qkd protocol is based on, since they allow for the detection of an eventual eavesdropper by monitoring errors on the exchanged key: the attacker cannot completely determine the measured quantum state, nor can she/he copy it; therefore she/he must resend some imperfect copy to the receiver, which may introduce errors in the key. however, a practical real-world qkd implementation is still a technical challenge that combines concepts and technologies from different areas, such as classical and quantum information theory, quantum optics, electronics and optoelectronics [4]. in this work, we describe a functional autonomous apparatus that implements the bb84 quantum key distribution protocol [5] where we implement several solutions that contribute to the affordability of a naturally costly 080002-1 papers in physics, vol. 8, art. 080002 (2016) / i. h. lópez grande et al. piece of equipment. a critical parameter for the security of any quantum cryptography protocol is the quantum bit error rate (qber), which is obtained after an error estimation from the sifted keys sa and sb —which in theory should be identical— and in the absence of an eavesdropper they are similar up to experimental errors: a small part of the key is randomly selected and used to obtain the qber, which gives an estimation of the error rate in the whole length of the key. 
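as an illustration of this error estimation step, a sketch in python could look as follows; the function name, the sampled fraction and the discarding of the compared bits are illustrative choices, not the routine actually run on the microcontrollers.

```python
import random

def estimate_qber(sifted_a, sifted_b, sample_fraction=0.1, seed=None):
    """estimate the qber by publicly comparing a random subset of the
    sifted keys; the compared bits are then removed from the key.
    sketch only: the sampled fraction is an illustrative choice."""
    rng = random.Random(seed)
    n = len(sifted_a)
    idx = set(rng.sample(range(n), max(1, int(sample_fraction * n))))
    errors = sum(1 for i in idx if sifted_a[i] != sifted_b[i])
    qber = errors / len(idx)
    key_a = [b for i, b in enumerate(sifted_a) if i not in idx]
    key_b = [b for i, b in enumerate(sifted_b) if i not in idx]
    return qber, key_a, key_b
```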
once the protocol is running, the qber is routinely monitored by resigning part of the key. it is assumed that any increase of the qber may be generated by the presence of an eavesdropper; in such case the whole key is discarded. theoretical upper limits have been found for the qber rate that if preserved, unconditional security of the key can be granted [6] by applying classical error correction and privacy amplification protocols to the sifted key [7]. the first implementation of a quantum cryptographic protocol dates from 1992 [8]. since then, the field has rapidly advanced towards sophisticated systems that provide high speed key generation [9], long distance key distribution [10, 11], transmitting photons either over optical fiber or open air, using polarization or time bin [12], or both [13], for qubit-encoding. such protocols can be based on single photon pulses [14, 15] or on entangled photon states [19]. the use of advanced optoelectronics and high performance detectors is intensive on any qkd implementation. in this work we show that the technologies used in such quantum information algorithms are mature enough to attempt a low cost, yet functional and robust implementation of a quantum key distribution protocol. we give a detailed explanation of the communication scheme and we release the firmware code and the circuit schematics to build the control units as supplementary material. the following section is devoted to the description of the optical arrangements used on alice and bob stages. section iii. discusses the initial setup, synchronization, transmission and processing routines needed in order to generate a sifted key. the overall performance of the apparatus and its response to different perturbations are discussed thereafter. ii. device layout the developed system comprises an emission stage and a reception stage for the quantum channel, and an ad-hoc classical communication system. quantum bits are encoded in the polarization of weak coherent pulses. these pulses are used as an approximation of a single photon pulsed source. we identify the canonical polarization states {|h〉 , |v 〉} with the computational basis bc = {|0〉 , |1〉} and the diagonal polarization states {|d〉 , |a〉} with the diagonal basis bd = {|+〉 , |−〉}. the complete scheme of the apparatus is shown in fig. 1. bobalice pc led driver v d a h arduino mega arduino mega d a h v demux spcm pbs bs bs pbsbpf bpf hwp hwp pc figure 1: setup of the qkd system: polarization selection and spatial overlap between states is obtained with a combination of polarizing (pbs) and non-polarizing (bs) beam splitters. bob uses a bs to randomly choose the measurement basis. polarization projections are obtained with a pbs and a half waveplate (hwp). projected light is coupled into optical fibers and temporally multiplexed with selected delays. a single photon counting module (spcm) is used for detection and bandpass filters (bpf) are used to reject unwanted light. ∆t: 250 ns delay. polarized weak light pulses are generated by fast pulsing four infrared leds and combining them with polarizing (pbs) and non-polarizing (bs) beamsplitters: each of the leds is used to encode one of the four possible polarization states. the leds outputs are coupled and later decoupled to multimode optical fibers to define a propagation direction and divergence, and also to equalize the intensities of the four outputs. 
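at the logical level, the preparation, measurement and sifting just described can be summarized in a few lines; the sketch below assumes ideal components, and the ordering of the time bins is an assumption consistent with the sequence adopted later in the performance section (the 250 ns spacing is taken from fig. 1).

```python
import random

# illustrative mapping between (basis, bit) and the four polarization
# states / hardware channels; the time-bin ordering is an assumption.
STATE = {('C', 0): 'H', ('C', 1): 'V', ('D', 0): 'D', ('D', 1): 'A'}
TIME_BIN_NS = {'H': 0, 'D': 250, 'V': 500, 'A': 750}

def alice_prepare(n, rng=random):
    return [(rng.choice('CD'), rng.randint(0, 1)) for _ in range(n)]

def bob_measure(state, rng=random):
    basis = rng.choice('CD')                  # 50/50 beam splitter
    if (state in 'HV') == (basis == 'C'):     # bases match: deterministic
        bit = 0 if state in 'HD' else 1
    else:                                     # bases differ: random outcome
        bit = rng.randint(0, 1)
    return basis, bit

def sift(alice, bob):
    return [(a_bit, b_bit) for (a_basis, a_bit), (b_basis, b_bit)
            in zip(alice, bob) if a_basis == b_basis]

alice = alice_prepare(1000)
bob = [bob_measure(STATE[a]) for a in alice]
pairs = sift(alice, bob)   # matching-basis (alice_bit, bob_bit) pairs
```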
this setup is based on off-the-shelf economic infrared leds and avoids the use of expensive pockel’s cells and high performance hv drivers for polarization state prepa080002-2 papers in physics, vol. 8, art. 080002 (2016) / i. h. lópez grande et al. ration. the mean photon number per pulse was set to approximately 0.1, measured between the emission and detection stages. assuming poissonian photon statistics, this means that in average nearly 90% of the clock pulses carry no photons at all, while less than 0.5% of the pulses are multiphoton pulses. both empty and multiple detection runs are considered null. it is worth to note that this particular choice of photon number per pulse does not guarantee the generation of a secure key by itself; rather, the conditions for distillation of a secure key from a raw key and the optimum photon rate depend on specific conditions of the setup, such as the length of the quantum channel –that implies distance-dependent losses–, the loss on bob’s receiver stage, and the efficiency and dark count rate of the detectors. security conditions under different kind of attacks on non-ideal qkd systems have been reported for example in [16, 17] and reviewed in [18]. the light paths from the sources entering a polarization beam splitter (pbs) at different inputs were combined by pairs: the reflected beams exit the pbs vertically polarized, while the transmitted outputs are left horizontally polarized. a halfwaveplate retarder placed in one of the outputs rotates the polarization of these two paths 45 degrees. a beam splitter cube further combines the paired sources into one common path. basis selection at the receiver stage is obtained using a 50% beam splitter cube to randomly obtain either a transmitted photon or a reflected photon. projection onto the states of the canonical basis is achieved by means of a pbs, while the diagonal basis projections are obtained adding a half-wave plate retarder between the beam splitter and the pbs in one of the paths. a straightforward implementation of the detection stage demands four single photon counting modules (spcms), which are expensive devices. with the purpose of obtaining a practical, cost-effective setup we implemented a time multiplexed detection, adding 250 ns delays between the projection paths. the four possible measurement outcomes are encoded into temporal bins: photons are detected using only one commercial single photon counting module and labeled by the time of arrival with respect to a clock reference. temporal demultiplexing and state determination are obtained measuring coincidences between the single photon detector output and temporal gates with selected delays. the use of a sole detector also avoids the unbalance of detection efficiencies that is present in multiple detector setups. as a drawback, this scheme presents 4 db insertion loss per coupler, which attenuates the input signal and lowers the extractable secure key rate, due to the reduced optimal photon rate. this issue can be circumvented by implementing a decoy-state strategy together with the bb84 protocol [20–22]. such application is currently under development at our laboratory. the following section deals with the synchronization and control tasks performed by the open source hardware microcontrollers that allow the system to operate in an autonomous manner. iii. control, driving and synchronization i. 
control and temporal synchronization open-source hardware was chosen for the processing of the cryptographic key and controlling units of the system, in order to obtain a practical, smallscale photonic implementation of the quantum protocol: all the synchronization, communication and processing operations, as well as system diagnosis were programmed on arduino mega 2560 microcontrollers. a diagram of the key generation protocol is sketched in fig. 2. the communication scheme is divided in stages where classical information is exchanged (c com) and a quantum communication stage (q com). an initial calibration of the system can be performed, where both parties measure the photon rate per pulse, the total temporal delay of the link and the delay between temporal bins. the communication begins with an exchange of the protocol parameters such as data structure and target key length. then, after a synchronization sequence, they exchange the quantum bits and the sifting procedure follows: both parties exchange information on basis emission and detection and coincidences between them, keeping only the bits that come from coincident bases. the routine is repeated until the target key length is reached. the shared key is locally transferred to personal computers on each stage via usb ports. 080002-3 papers in physics, vol. 8, art. 080002 (2016) / i. h. lópez grande et al. demux photon detection led driver + clock clock c com handshake bases coincidences clock synchronization q com polarization qubits 8-pulse bursts delay pattern pc arduino serial enable (d.o.28) irq (d.i.3) d.o. 30/32/34/36 usb arduino serial irq (d.o.28) d.i. 48/50/52/53 usb pc interrupt enable alice bob figure 2: communication and control setup of the bb84 qkd apparatus. the protocol is controlled by two arduino mega microprocessors. the synchronization start byte is generated at bob’s side and sent through an interrupt channel. after the quantum bits are sent and detected, bases are exchanged and the key is sifted. specific input and output pins of the arduino controllers are detailed in the figure. ii. electronic driving and peripherals the communication routines described above are implemented directly by the microcontrollers. specific tasks such as driving the pulsed leds, synchronizing the temporal mask and demultiplexing the temporal signals at the receiver side are performed with dedicated electronic peripherals. based on a random 2-bit sequence, the arduino microcontroller sets a logic high on one of the four possible outputs. a monostable multivibrator uses this logic transition to generate a 20 ns pulse that is used as the input for a high speed led driver. the shunt driving circuit that pulses the current on each led is constructed using the high-current, low impedance pull-up and pull-down mosfet transistors at the output of nand gates and a passive network to provide a prebias current and current overshoot to increase the performance of pulsed led drivers [23]. the optical pulse duration of 25 ns is limited by the led response. at bob’s side, single photon pulses are routed through different delay paths according to their polarization, and the delayed photon clicks are identified as polarization state projections by temporal demultiplexing the digital detections. pulses from the single photon detector are addressed to the corresponding state channel by comparison with a pulse pattern that repeats the temporal delays added by the optical fibers. iv. 
system performance and self-diagnostics the main causes of bit errors are the non-ideal polarization splitting contrast of the pbss and low-quality waveplates that produce incomplete rotations and distort the ideal linear polarization states at the input and output. also, off-plane misalignment of the light paths within the preparation and measuring stages can induce undesired rotations of the polarization axes. these are well-known problems for an open-air optical setup, and workarounds to minimize them are common to any polarization-sensitive arrangement. detector dark counts and stray light that leaks through the optical setup are also a source of error. the gated detection helps to minimize these errors; the contribution of this effect to the overall error rate depends linearly on the gate pulse duration. the other main source of error is the temporal jitter of the signals, which can produce erroneous bit assignment of the temporally multiplexed pulses. the signal jitter is limited by the duration of the light pulse, which is approximately half the arduino clock period. larger pulse timing fluctuations can be produced at the generation and detection stages due to missed or added clock pulses at the microprocessors, specifically when handling interrupt signals. these temporal fluctuations can shift states from earlier to later temporal bins, inducing errors on the key. the temporal order of the multiplexed states can be arranged to minimize such errors. a natural choice is to order the detections in the sequence h (first), v, d, a (last). such a choice, however, has an increased probability that temporal jitter produces an error: assuming delayed detections that deterministically shift the states, the probability of producing a bit error in this configuration is 0.3125. if the delays are arranged to output the temporal sequence h (first), d, v, a (last), consecutive states in the detection pattern do not belong to the same basis. the probability of producing an error, provided the states are identified in an adjacent temporal bin, is 0.1875 in this arrangement, and it is therefore chosen to minimize the error rate. an estimation of the bit error rate produced by this artifact in the actual protocol execution can be obtained as the product of this probability and the state-shift rate due to the overall timing jitter (0.6%), and gives approximately 1.1%. the system was tested using a mean photon number per pulse of µ = 0.09. a typical light distribution at the outputs for each polarization state generated by alice is shown in fig. 3a). the apparatus autonomously generates a cryptographic key until the target key length is reached. during the tests, light pulses were emitted in bursts of 19200 pulses per second, while a constant background light of 3000 counts/s at the detector was present in the actual experimental conditions. we obtained a raw key generation rate of 363 bits/s, with a quantum bit error rate (qber) of 2.7%. approximately one third of this rate (0.9%) corresponds to errors produced by stray light and detector dark counts, while the rest of the errors are due to the electronic jitter discussed above, and to an imperfect preparation and selection of the polarization states in the optical setup.
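the two jitter-induced error probabilities quoted above (0.3125 and 0.1875) can be reproduced by direct enumeration; the sketch below assumes a deterministic one-bin delay, equiprobable prepared states and measurement bases, a uniformly random outcome when the bases differ, and that a detection shifted past the last bin is lost.

```python
from fractions import Fraction
from itertools import product

BASIS = {'H': 'C', 'V': 'C', 'D': 'D', 'A': 'D'}
BIT   = {'H': 0, 'V': 1, 'D': 0, 'A': 1}

def error_probability(order):
    """probability that a deterministic one-bin delay of the detection
    produces a bit error in the sifted key, for a given temporal ordering
    of the four states."""
    err = Fraction(0)
    for sent, bob_basis in product('HVDA', 'CD'):
        p_branch = Fraction(1, 4) * Fraction(1, 2)     # sent state x basis
        # possible detections before the jitter shift
        if BASIS[sent] == bob_basis:
            outcomes = [(sent, Fraction(1))]
        else:
            outcomes = [(s, Fraction(1, 2)) for s in 'HVDA'
                        if BASIS[s] == bob_basis]
        for detected, p_out in outcomes:
            k = order.index(detected) + 1               # one-bin delay
            if k >= len(order):
                continue                                # shifted out: lost
            registered = order[k]
            # sifting keeps the event only if the basis of the registered
            # bin matches alice's preparation basis
            if BASIS[registered] != BASIS[sent]:
                continue
            if BIT[registered] != BIT[sent]:
                err += p_branch * p_out
    return err

print(error_probability('HVDA'))   # 5/16 = 0.3125
print(error_probability('HDVA'))   # 3/16 = 0.1875
```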
the measured key generation rate is limited by (and can also be estimated from) the photon-per-pulse rate, the 50% of the data that is discarded on average due to non-coincident bases, and the dead times on the communication stage that allow for data processing, which represent roughly two thirds of the total execution time. during a key generation session, some parameters can be monitored for eavesdropping, inconsistencies or anomalous behavior. the sifted key can be periodically sampled and analyzed for error rate, key generation rate and bias rate (the relative abundance of “1”s to “0”s in the key, 0.98 in our setup), leading to charts like the one presented in fig. 3b). figure 3: a) light distribution at the detection channels, for each generated polarization state. percentages on each row of the graph are the relative amounts of light obtained by adding the counts at each detection channel, for all the emitted states. b) temporal evolution of different system parameters (qber, key generation rate and bit bias) during normal operation. under normal operation conditions, the three parameters are constant through a typical one-and-a-half-hour experiment, with a relative dispersion of their average values below 2 × 10−2 for key rate, 7 × 10−3 for bit bias and 2 × 10−3 for qber (statistics obtained over 20 kbit partitions from a total 1.9 mbit key). the response of the system under anomalous conditions was tested by disturbing the quantum channel in different ways, while the above parameters were being monitored. figure 4 shows a sequence of such perturbations: first, in a), the detector was blocked, which caused the key rate to vanish with a characteristic time given by the integration time of the monitoring process. if one of the detection channels (v) is blocked [fig. 4 b)], the effect is a diminished key rate and a key bias of 2/3. in c), both channels of a basis are blocked. if two channels that encode the same bit are blocked, the key rate remains at half the original rate, but now the series is completely biased, since only one logic bit is produced. more interestingly, during e), a pbs was inserted in the quantum channel, which has the following effect on the transmitted quantum states: |h〉 states are left unchanged (since they are transmitted through the pbs), |v〉 states are reflected out of the path at the pbs, while |d〉 and |a〉 are transmitted as |h〉 with a 50% chance. this last feature resembles the action of an eavesdropper (eve) using an intercept-resend strategy, where the bases in which eve resends bits to bob are randomly chosen. in this situation, states sent as |v〉, and (on average) half of the states originally sent in the diagonal basis, are lost at the pbs reflection, leading to a reduction of the key generation rate by a factor of two. more importantly, half of the states originally sent in the diagonal basis are transmitted through the pbs and transformed into the |h〉 state. if these states are measured in the diagonal basis, they can be detected as either |d〉 or |a〉, regardless of the original state.
the result of these successive projections is that a |d〉 (|a〉) state has a non-negligible probability of being detected as an |a〉 (|d〉) state. the quantum bit error rate now rises to 25% for this particular perturbation, signaling a possible eavesdropper. the bit bias of bob's key is 0.75: the action of the pbs, which prevents all the emitted |v〉 states from being detected, generates a ratio of “1”s to “0”s of 3:1. periodic sampling and analysis of the generated key thus provides a means for detecting intercept-resend attacks, at the cost of reducing the final key length. with the setup placed on an optical table, qber variations as low as 0.2% can be detected. figure 4: behavior of the system under different perturbations on the detection stage and the quantum channel, labeled a) to e), consisting in blocking one or more detection channels and inserting a polarizing beamsplitter in the quantum channel. see the text for a detailed explanation. v. concluding remarks we have implemented an autonomous qkd apparatus based on open source hardware. its stability and performance have been tested on megabit-length key distribution sessions, during which some key parameters were monitored. the device was designed with a cost-effectiveness approach which includes an led-based probabilistic single photon source, a time-multiplexed detection scheme that employs only one spcm, and arduino-based controlling and processing units for alice and bob. the actual bit error rate can be lowered if the polarization-dependent elements (pbs) on alice's and bob's sides are replaced with high-extinction-ratio polarizers (at present around 1%). another way in which the error rate can be improved is by minimizing the incidence of errors originating from the detector's dark counts. this can be accomplished with a reduction of the light pulse width, which leads to narrower temporal gates. also, an increase of the mean photon number per pulse can reduce the qber without compromising security, provided a decoy state protocol is implemented instead. the overall protocol speed can be raised by replacing the arduino microcontrollers with faster fpga-based boards, where the communication and processing blocks may be parallelized. also, as mentioned above, the temporal demultiplexing can be done directly on the board. faster clock boards allow for an additional reduction of the temporal delays between channels in the time-multiplexed detection scheme; these can be set to be as short as 50 ns, depending on pulse width and temporal jitter. the developed apparatus is able to autonomously generate a cryptographic key with limited yet simply improvable performance. the whole system can be used to establish a small-scale secure information channel between sites within line of sight, for academic purposes, or it can serve as a testbed for different quantum information-related resources, such as original protocols, detectors, light sources, or the development of alternative physical quantum channels.
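as a consistency check of the figures quoted for the inserted-pbs perturbation (a qber of 25%, a 3:1 bit ratio and a halved sifted rate), the ideal-case probabilities can be enumerated as in the sketch below; the assignment of logical bits to states is an arbitrary assumption, so the 3:1 ratio appears as a 0.75 bias towards whichever bit |h〉 encodes.

```python
from fractions import Fraction
from itertools import product

BASIS = {'H': 'C', 'V': 'C', 'D': 'D', 'A': 'D'}
BIT   = {'H': 0, 'V': 1, 'D': 0, 'A': 1}     # arbitrary bit assignment

def pbs_channel(state):
    """[(arriving_state, prob)] for a pbs inserted in the channel:
    |H> passes, |V> is reflected out, |D>/|A> pass as |H> half the time."""
    if state == 'H':
        return [('H', Fraction(1))]
    if state == 'V':
        return []
    return [('H', Fraction(1, 2))]

sifted = errors = ones = Fraction(0)
for sent, bob_basis in product('HVDA', 'CD'):
    p = Fraction(1, 4) * Fraction(1, 2)
    for arriving, p_arr in pbs_channel(sent):
        if BASIS[arriving] == bob_basis:
            outcomes = [(arriving, Fraction(1))]
        else:
            outcomes = [(s, Fraction(1, 2)) for s in 'HVDA'
                        if BASIS[s] == bob_basis]
        for detected, p_det in outcomes:
            if bob_basis != BASIS[sent]:
                continue                      # removed at sifting
            w = p * p_arr * p_det
            sifted += w
            errors += w if BIT[detected] != BIT[sent] else 0
            ones   += w if BIT[detected] == 1 else 0

print("sifted fraction:", sifted)            # 1/4 (half the unperturbed 1/2)
print("qber:", errors / sifted)              # 1/4, i.e. 25%
print("bit bias:", max(ones, sifted - ones) / sifted)   # 3/4
```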
we understand that a cryptographic system based on wellknown, simple and available technology that can be fully mastered and controlled by the end user may turn out more useful and secure than a sophisticated, “black box” type system that has many parts that are beyond the user’s control, and which may depend on third party services to be operated or maintained. acknowledgements this work was supported by the anpcyt pict 2010-2483 and mindef piddef 012/11 grants. m.a.l. is a conicet fellow, c.t.s. and i.h.l.g. were funded by conicet scholarships. [1] w diffie, m hellman, new directions in cryptography, ieee t inform. theory 22, 644 (1976). [2] w k wootters, w h zurek, a single quantum cannot be cloned, nature 299, 802 (1982). [3] m planat, h c rosu, s perrine, a survey of finite algebraic geometrical structures underlying mutually unbiased quantum measurements, found. phys. 36, 1662 (2006). [4] n gisin, g ribordy, w tittel, h zbinden, quantum cryptography, rev. mod. phys. 74, 145 (2002). [5] c h bennett, g brassard, quantum cryptography: public key distribution and coin tossing, theor. comput. sci. 560, 7 (2014). [6] n j cerf, m bourennane, a karlsson, n gisin, security of quantum key distribution using dlevel systems, phys. rev. lett. 88, 127902 (2002). [7] c h bennett, g brassard, c crépeau, u m maurer, generalized privacy amplification, ieee t inform. theory 41, 1915 (1995). [8] c h bennett et al., experimental quantum cryptography, j. cryptol. 5, 3 (1992). [9] a r dixon et al., gigahertz decoy quantum key distribution with 1 mbit/s secure key rate, opt. express 16, 18790 (2008). [10] p a hiskett et al., long-distance quantum key distribution in optical fibre, new j. phys. 8, 193 (2006). [11] r ursin et al., entanglement-based quantum communication over 144 km, nat. phys. 3 481 (2007). [12] i marcikic et al., distribution of time-bin entangled qubits over 50 km of optical fiber, phys. rev. lett. 93, 180502 (2004). [13] w t buttler et al., practical four-dimensional quantum key distribution without entanglement, quantum inf. comput. 12, 1 (2012). [14] c h bennett, quantum cryptography using any two nonorthogonal states, phys. rev. lett. 68, 3121 (1992). [15] h bechmann-pasquinucci, w tittel, quantum cryptography using larger alphabets, phys. rev. a 61, 062308 (2000). [16] n lütkenhaus, security against individual attacks for realistic quantum key distribution, phys. rev. a 61, 052304 (2000). [17] a acin et al., device-independent security of quantum cryptography against collective attacks, phys. rev. lett. 98, 230501 (2007). [18] v scarani et al., the security of practical quantum key distribution, rev. mod. phys. 81, 1301 (2009). 080002-7 papers in physics, vol. 8, art. 080002 (2016) / i. h. lópez grande et al. [19] a k ekert, quantum cryptography based on bell’s theorem, phys. rev. lett. 67, 661 (1991). [20] w y hwang, quantum key distribution with high loss: toward global secure communication, phys. rev. lett. 91, 057901 (2003). [21] y zhao et al., experimental quantum key distribution with decoy states, phys. rev. lett. 96, 070502 (2006). [22] z l yuan et al., unconditionally secure one-way quantum key distribution using decoy pulses, appl. phys. lett. 90, 011118 (2007). [23] agilent application bulletin 78, low cost fiber-optic links for digital applications up to 155 mbd, agilent technologies inc. (1999). 080002-8 papers in physics, vol. 10, art. 100006 (2018) received: 5 january 2018, accepted: 20 april 2018 edited by: a. mart́ı, m. monteiro reviewed by: p. 
jeanjacquot, école normale supérieure de lyon – institut français de l'éducation, france licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.100006 www.papersinphysics.org issn 1852-4249 smartphone audio port data collection cookbook kyle forinash,1∗ raymond wisman1† (∗e-mail: kforinas@ius.edu, †e-mail: rwisman@ius.edu; 1 indiana university southeast, 4201 grantline road, new albany, indiana 47150, usa) the audio port of a smartphone is designed to send and receive audio but can be harnessed for portable, economical, and accurate data collection from a variety of sources. while smartphones have internal sensors to measure a number of physical phenomena such as acceleration, magnetism and illumination levels, measurement of other phenomena such as voltage, external temperature, or accurate timing of moving objects is excluded. the audio port can not only be employed to sense such external phenomena; it also has the advantage of timing precision: because audio is recorded or played at a controlled rate separated from other smartphone activities, timings based on audio can be highly accurate. the following outlines unpublished details of the audio port technical elements for data collection, a general data collection recipe and an example timing application for android devices. i. audio port technical elements for data collection the audio port physical interface to the smartphone is a 4-pole jack connecting to some external device. table 1 presents the american headset jack (ahj) standard used by many smartphones for connections to the 4-pole jack. table 1: audio connections to the 4-pole jack. tip: left speaker; 1: right speaker; 2: ground; 3: microphone. figures 1 and 2 present two similar measurement circuits based upon the same resistor-capacitor network: the fig. 1 circuit for temperature measurements [1] and the fig. 2 circuit with a pair of photoresistors acting as gates for timing measurements of a moving object [2] (figure 1: temperature measurement circuit. figure 2: time measurement circuit.). although timing and temperature measures are obviously different, the circuits and data collection methods are quite similar. in each circuit, a variable resistor connects the speaker audio output to the microphone input; the speaker output is a wave whose peak amplitude at the microphone input varies with the variable resistor, greater resistance resulting in lower peak amplitude. figure 3 illustrates the effect of a photoresistor on the microphone signal amplitude while a moving object momentarily blocks and reduces the illumination reaching the photoresistor: as the illumination decreases, resistance increases and the peak amplitude decreases. the microphone input amplitude is digitally sampled (44100 samples per second is common) over time, as displayed in fig. 4. each digital value represents the speaker output amplitude as changed by the circuit, then received and digitized as the microphone input. a recipe for data collection with fig. 1 and 2 type circuits generally is: 1. choose a variable resistor to measure some phenomenon such as force, humidity, etc. 2. generate a wave of fixed frequency and peak amplitude through the speaker output and sample it as the microphone input, as illustrated in fig. 4. 3. detect the significant elements of the microphone input; for timing applications, the significant elements are the wave peaks, which can be detected by the simple method described in fig. 4. 4.
convert the significant elements to corresponding data. for example, by recording a range of peak amplitudes produced by the circuit of fig. 1 and the corresponding temperatures, an equation fitted to that amplitude vs. temperature data can convert the circuit amplitude output to temperature. figure 3: microphone input signal amplitude versus time graph from the circuit in fig. 2 when measuring an object falling from 1 meter. the amplitude of the wave input at the microphone drops as the object enters the first photoresistor (pr1) at time 6.2041 s and again at time 6.6607 s when it enters the second photoresistor (pr2). the measured time of 0.456 s compares favorably with the predicted time of 0.451 s. figure 4: speaker output of a 4410 hz sine wave sampled at 44100 samples per second at the microphone, producing the highlighted 10 samples per wave. simple peak detection is possible by comparing three consecutive samples (x0, x1, x2) where 0 < x0 < x1 > x2 > 0; the wave peak occurs at sample x1. ii. timing data collection recipe for the timing of a moving object, some form of gate is needed to detect entry and exit. ideally, gates are binary: they only open or close, with no intermediate states; unfortunately, reality does not match the ideal case. for timing applications, the circuit of fig. 2 includes one or more photoresistors (gates) connected serially, so that when all are illuminated (closed) the resistance is at its minimum and the peak signal amplitude input to the microphone is at its maximum; hence, when the illumination of any one of the photoresistors is reduced, the circuit resistance increases and the signal amplitude decreases. because the photoresistor resistance does not change instantaneously when the illumination changes, as illustrated in fig. 3 by the gradual peak amplitude drop, and because one photoresistor may be illuminated differently from another, producing a different amplitude result for each, a peak amplitude threshold must be determined. above the threshold a gate is considered closed (exited) and below the threshold a gate is considered open (entered). to determine the threshold, the peak amplitude is initially calibrated by illuminating (closing) all gates and incrementally increasing the speaker volume until the microphone peak amplitude approaches its maximum, m. after fixing the volume for m, the microphone peak amplitude for the circuit when any single gate is open (entered), l, is determined by blocking (entering) each gate in turn and recording its lowest peak amplitude. l is set to the greatest peak amplitude of any blocked gate, to ensure the threshold is crossed when any gate is blocked. the maximum m and minimum l then set the upper and lower output limits of the circuit peak amplitude. the threshold amplitude ta, the peak amplitude level that determines when an object enters or exits a gate, is ta = (m + l)/2, the amplitude midpoint between m and l. an amplitude crossing the threshold from above is defined as gate entered, while crossing it from below is defined as gate exited. with the threshold defined, deriving timing data from the microphone audio is both straightforward and accurate. from the general recipe, the practicalities of timing a moving object reduce to: 1. choose a photoresistor to measure illumination. 2. generate a sine wave of 4410 hz with the volume set at m through the speaker and sample at the microphone at 44100 samples per second, which corresponds to 10 samples per wave.
3. detect the wave peaks by the method described in fig. 4. 4. convert the wave peaks to corresponding timing data by noting the sample number at threshold crossings. microphone input sampling occurs at 44100 samples per second, or one sample every 0.000023 s (1 sample / 44100 samples per second). accurate timing between events is simply a matter of counting samples between threshold crossings. gates are entered when a wave peak crosses the threshold from above, and the sample number of the peak is recorded at this time. similarly, gates are exited when a wave peak crosses the threshold from below. the time between entering two gates is then t = (g2 − g1)/(44100 samples per second), (1) where t is the time between gates 1 and 2, g1 is the gate 1 entry sample number, and g2 is the gate 2 entry sample number. figure 3 illustrates the microphone input while timing the one meter free fall of an object entering two photoresistor gates. the amplitude of the wave input at the microphone drops as the object enters the first photoresistor (pr1) at time 6.2041 s and again at time 6.6607 s when it enters the second photoresistor (pr2). the measured time of 0.456 s = 20110 samples / 44100 samples per second compares favorably with the predicted time of 0.451 s. the prescribed recipe above was followed in the design of the gatetiming [3] timing application, which measures times of entry and exit at multiple gates. figure 5 presents the times recorded during multiple runs through two gates. figure 5: sample timing results from the gatetiming application. the date/utc.ms column is the clock time on entering the first gate; the time column is the time between entering the first and second gate; a more detailed display of each gate entry and exit time is a viewing option. table 2 lists the times recorded by the application for a block falling one meter. table 2: times of a block falling one meter. time [s]: 0.453, 0.457, 0.471, 0.451, 0.448; mean: 0.456. the variation of the measurements is primarily due to slight differences in the height from which the block was dropped above the first gate. for example, dropping from 1/2 inch (1.27 cm) above the first gate produced a timing in the range of 0.42 s; the time is lower than the predicted one because the block is traveling a little faster through the gates (about 0.5 mph or 0.80 km/h at the first gate). other timing variations are likely due to slight changes in the block profile relative to the gates, since it does not always maintain a stable orientation. iii. conclusions although only timing was closely examined in this paper, a range of audio port data collection applications are possible, including our own android applications: audiotime+ [4], a general purpose tool for signal display and analysis that can examine the behavior of a variable resistor in the circuits of fig. 1 and 2; gatetiming [3], to measure the time an object passes between or through a series of gates; and dcvoltmeter [5], for the measurement of 0–10 vdc via voltage-to-frequency conversion of the microphone input, then converting the frequency to a corresponding voltage value; a simple, active voltage-controlled oscillator circuit is required for the conversion. other applications for data collection from a variety of sources using the basic recipe are also possible. by using this method, data can be collected from almost any sensor based on variable resistance. [1] k forinash, r wisman, smartphones experiments with an external thermistor circuit, the physics teacher 50, 566 (2012).
[2] k forinash, r wisman, photogate timing with a smartphone, the physics teacher 53, 234 (2015). [3] r wisman, k forinash, mobile science gatetiming, december 2017. google play: https://play.google.com/store/apps/details? id=edu.ius.rwisman.gatetiming [4] r wisman, k forinash, mobile science audiotime+, november 2013. google play: https://play.google.com/store/apps/details? id=edu.ius.audiotimeplus [5] r wisman, k forinash, mobile science dcvoltmeter, january 2016. google play: https://play.google.com/store/apps/details? id=edu.ius.dcvoltmeter 100006-4 papers in physics, vol. 10, art. 100004 (2018) received: 7 september 2017, accepted: 1 march 2018 edited by: a. mart́ı, m. monteiro reviewed by: e. arribas, universidad de castilla la mancha, albacete, spain licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.100004 www.papersinphysics.org issn 1852-4249 open-source sensors system for doing simple physics experiments césar llamas,1 jesús vegas,1 miguel á. gonzález,2 manuel á. gonzález3∗ an open-source platform to be used in high school or university laboratories has been developed. the platform permits the performance of dynamics experiments in a simple and affordable way, combining measurements of different sensors in the platform. the sensors are controlled by an arduino microcontroller, which can be wirelessly accessed with smartphones or tablets. the platform constitutes an economical sensing alternative to commercial configurations and can easily be extended by including new sensors that broaden the range of covered experiments. i. introduction we have designed and constructed a low cost platform that can be used to do simple physics experiments combining different sensors’ data easily. the purpose of this work consist in implementing a cheap system that can be used in high schools and universities regardless of their budget in diverse experiments, allowing a higher customization than equivalent commercial sets. ii. system description the initial requirements of the system are affordability, simplicity, flexibility and extensibility. under the affordability requirement, we have decided to employ systems that can be easily found on the internet at low prices. simplicity means that users ∗e-mail: manuelgd@termo.uva.es 1 departamento de informática, universidad de valladolid, 47011 valladolid, spain. 2 departamento de f́ısica de la materia condensada, universidad de valladolid, 47011 valladolid, spain. 3 departamento de f́ısica aplicada, universidad de valladolid, 47011 valladolid, spain. with little to no training can easily use the system in physics experiments without difficulty, but also, that its construction does not require special skills. flexibility implies that the system should be portable enough to work on different types of experiments. finally, its extensibility will allow users to include additional sensors in the platform to enhance its capabilities and to use it in new experiments. in order to obtain an innovative platform as well as to add sharing features to it, almost exclusively open-source hardware and software components are used. within this approach, the physics teacher community can work in open projects that can be used by other teachers, thus reducing the work of designing experiments with the platform. following this condition, the characteristics of the system are published under a public gnu license in the github repository [1]. 
all the technical details of the platform hardware and software are detailed in that repository: software code, assembling diagrams and technical characteristics of the components. we initially decided to include sensors in the platform that measure kinematic magnitudes: accelerometer, gyroscope, distance sensor and lightgate sensors. other sensors, such as a magnetome100004-1 papers in physics, vol. 10, art. 100004 (2018) / c. llamas et al. figure 1: a conceptual diagram showing the main components of the system. it comprises a mobile unit that includes several sensors (su, sv, ...) controlled by an arduino board and a portable unit with a raspberry pi that communicates with the arduino and establishes a wi-fi network. it can also host additional sensors (sx, sy, ...). the measurements of all the sensors can be transferred into the users’ devices as series of vectors m via a web front-end implemented in the raspberry pi. ter, a barometer, a thermometer, etc., can be easily added later. figure 1 shows a diagram of the use of the platform in a physics experiment: the platform includes a mobile unit, which can be attached to a mobile body, containing sensors (su, sv,... in the figure) and a portable unit responsible for the wi-fi connection that allows the user to access the measured data, but which can also include additional sensors (sx, sy,... in the figure). measured data are collected by the portable unit and organized in vectors (m = (ti, xi, ...)) in the figure) also containing the time stamp of the measurements. whenever it is desirable and possible, the mobile unit can be attached to a body in order to study its movement. then, the connection between the mobile and the portable units must be based on wireless communications, allowing the studied body and the attached unit to move freely. in our case, communications between the mobile and the portable units are performed via bluetooth. the design of the mobile and the portable units was based on previous works of embedded systems [2] and they comprise the following elements: • mobile unit: an arduino nano v3 controls a set of sensors. the version that has been tested in laboratory experiments included an infra-red line tracker, a motion processing unit consisting of 3-axis accelerometer, gyroscope and an ultrasound distance sensor. the atmel serial interface of the microcontroller is connected to a hc-sr06 bluetooth device. a lipo battery supplies power to all the system. in its current design, the motion processing unit can be replaced by a similar one that also includes a 3-axis magnetometer together with the accelerometer and gyroscope which would permit to have a inertial measurement unit with 9 degrees of freedom. this system is programmed in arduino 1.6 [3] and it implements a simple protocol through a serial wireless connection, which is established by using bluetooth connectivity that allows the initialization of the microcontroller as well as starting and stopping data acquisition. the main speed limitation in data measurements of the system is due to the bluetooth connection. figure 2 (top) shows an implemented mobile unit used in some physics experiments described below. • portable unit: it consists of a raspberry pi2 with raspbian jessie [4] as an operating system. the core process of the unit is programmed in go [5] and it governs the overall system as well as offering a web front-end. this unit supports the wi-fi and bluetooth connections. 
the data recorded in the measurements are stored on an sd card, which constitutes the local persistent storage unit. these data can be accessed and retrieved by the users' laptops or smartphones via the wi-fi connection. additionally, the portable unit can also host other wired sensors using the gpio pins of the raspberry pi. in the current configuration of the system, four infrared emitter-receivers are connected to it. these can be used, for example, to measure the instantaneous speed at different points along the trajectory of a body. the unit can be powered by an ordinary usb battery, making it possible to do physics experiments outside the laboratory. figure 2 (bottom) shows the portable unit.

figure 2: mobile (top) and portable (bottom) units of the system. the portable unit shows the connections of four additional infrared sensors attached to it (connections a, b, c, and d in the figure).

iii. use in the laboratory

the system described above can be used in different ways in the physics laboratory, depending on the users' skill and knowledge of physics. we have tested the platform in two basic experiments: an air track, in order to study a uniformly accelerated movement, and a pendulum, to analyze periodic motion and to obtain the acceleration of gravity. figure 3 shows the mobile and portable units being used in the air track experiment. in this experiment, the mobile unit included a 3-axis accelerometer, a 3-axis gyroscope and an ultrasonic distance sensor. in addition, four static infrared beacons were placed at fixed points of the trajectory and connected to the portable unit. the sampling frequency was about 25 samples/s (∆t ≈ 0.0394 s). this frequency was the result of a compromise between the measuring capabilities of the arduino and the data communication limitations between the arduino board and the raspberry pi over the bluetooth connection.

figure 3: photograph of an air track experiment using the platform. while the mobile unit is on the moving cart, the portable unit stays on the table with four static infrared beacons connected to it.

with the arrangement of sensors used in the experiment, users can measure the cart acceleration (accelerometer), average velocities between different points (infrared beacons), and the change in position with time (ultrasound sensor and infrared beacons). from the data recorded by the different sensors, the user can analyze the uniformly accelerated movement and the relationships between kinematic magnitudes. on the other hand, in a pendulum experiment, users can combine the accelerometer and gyroscope measurements to study the acceleration and speed at different points and the periodic dependence of both magnitudes. infrared beacons can also be used to measure the times of passage at different points of the pendulum trajectory and enrich the physical measurements. see [6] for more details on an air track experiment. recorded data are stored on the sd card as a plain csv file with the data of all the sensors used in the experiment. the csv files are transferred to the users' computer or mobile devices using the wi-fi network established by the portable unit. the rows of the csv file are tagged with the acquisition time, in order to ease the analysis of the data by simply importing the file into any spreadsheet program.
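as an alternative to a spreadsheet, the exported csv can also be inspected programmatically. the following is only a minimal sketch (in python) of how such a file could be read; the file name and the column labels (t, distance, az) are assumptions made for illustration, since the actual header depends on which sensors are active in a given run.

```python
import csv

# hypothetical file exported by the portable unit; column names are assumed
with open("airtrack_run.csv", newline="") as f:
    rows = list(csv.DictReader(f))

t  = [float(r["t"]) for r in rows]          # acquisition time stamp of each row
d  = [float(r["distance"]) for r in rows]   # ultrasound distance reading
az = [float(r["az"]) for r in rows]         # accelerometer component along the track

# average sampling interval (should be close to the ~0.04 s quoted in the text)
dt = (t[-1] - t[0]) / (len(t) - 1)
print(f"samples: {len(t)}, mean dt = {dt:.4f} s")

# crude numerical velocity from the distance sensor, for comparison with the
# average velocities obtained from the infrared beacons
v = [(d[i + 1] - d[i]) / (t[i + 1] - t[i]) for i in range(len(t) - 1)]
print(f"max |v| = {max(abs(x) for x in v):.3f} m/s")
```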
figure 4 shows some results of the measurements with the air track. the measurements of the ultrasound distance sensor and the component of the accelerometer along the direction of the movement are represented as a function of time. from the data shown there, it can be observed that the experimental noise of the ultrasound sensor increases noticeably with the distance to the reflecting screen. we have checked this noise and, according to our experiments, it is due to the air exiting from the air track holes. when the air track pumping is reduced or eliminated, that noise drastically decreases, as it also does when the distance between the sensor and the reflecting screen decreases. this can be seen as a possible limitation of the sensor used, which could be solved using, for example, an optical distance sensor.

figure 4: example of combined data recorded with the platform in an air track experiment. the dependence of the distance and of the acceleration along the direction of movement, recorded in the same experiment, are shown jointly.

the superposition of the data from the accelerometer and distance sensors allows a clear identification of the collision events between the cart and a rubber band at the end of the air track. from the difference in the distances reached after consecutive rebounds (marked as an example in the figure), users can obtain the restitution coefficient of the collision. complementarily, from the accelerometer data, users can obtain the change in speed during each collision and compare that result to the one obtained by analyzing the maximum distance before and after the bounce. as an illustrative result, the parabolic curve y = 0.8 − (1/2)(0.2)(t − 56.75)² is shown in the figure. the distance 0.8 was chosen to displace the theoretical curve slightly upwards, over the experimental points, for clarity. the acceleration 0.2 m/s² corresponds to the average acceleration measured by the accelerometer, as shown in the figure. the agreement between the data of different sensors can help users to investigate relationships between different magnitudes in an experiment.

iv. discussion

a cheap open-source system to be used in simple experiments has been developed. it comprises the hardware and software of a sensorized platform. the technical characteristics and the control software of the system can be downloaded for free from the github repository [1]. the system permits the measurement of different magnitudes in the same experiment and the analysis of various phenomena. the electronic components used in the system can be acquired on the internet for less than €100, and additional cheap sensors can also be included. unlike commercial sensor systems, this one constitutes an extensible and customizable platform that can be modified by adding other sensors and including other software characteristics. the use of the github repository also allows collaboration between users, who can add new characteristics to the platform.

acknowledgements

this work has been supported by the university of valladolid teaching innovation program under grant pid2016 82.

[1] c llamas, j vegas, m á gonzález, m á gonzález, https://github.com/percomp/oshiwasp, last accessed september 4, 2017.
[2] c llamas, m á gonzález, c hernández, j vegas, open source hardware based sensor platform suitable for human gait identification, pervasive mob. comput. 38, 154 (2017). [3] https://www.arduino.cc/en/main/software, last accessed september 5, 2017. [4] https://www.raspbian.org/, last accessed september 5, 2017. [5] https://golang.org/, last accessed september 5, 2017. [6] m á gonzález, a gómez, m á gonzález, smartphones in the air track, examples and difficulties, (unpublished). 100004-4 papers in physics, vol. 9, art. 090003 (2017) received: 6 february 2017, accepted: 8 march 2017 edited by: a. mart́ı reviewed by: c. masoller, universitat politécnica de catalunya, barcelona, spain. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.090003 www.papersinphysics.org issn 1852-4249 self-sustained oscillations with delayed velocity feedback d. h. zanette1∗ we study a model for a nonlinear mechanical oscillator, relevant to the dynamics of microand nanomechanical time-keeping devices, where periodic motion is sustained by a feedback force proportional to the oscillation velocity. specifically, we focus our attention on the effect of a time delay in the feedback loop, assumed to originate in the electric circuit that creates and injects the self-sustaining force. stationary oscillating solutions to the equation of motion, whose stability is insured by the crucial role of nonlinearity, are analytically obtained through suitable approximations. we show that a delay within the order of the oscillation period can suppress self-sustained oscillations. numerical solutions are used to validate the analytical approximations. i. introduction inside any modern time-keeping device, the principal component is an oscillator which autonomously generates a stationary periodic signal with a welldefined frequency. the only external input to the system is the power needed to sustain the oscillations. in devices based on mechanical oscillators –which comprise essentially all present-day clocks, with the exception of the atomic kind–, sustained periodic motion is achieved by forcing the oscillator with a conditioned version of the signal generated by the oscillator itself [1]. if this reinjected force is in-phase with the oscillation velocity, the resonant response is maximal, thus optimizing power consumption during the process of signal conditioning. purely mechanical devices employed this feedback scheme already in the middle ages. in fact, the escapement mechanism was routinely used in ∗e-mail: zanette@cab.cnea.gov.ar 1 centro atómico bariloche and instituto balseiro (comisión nacional de enerǵıa atómica, universidad nacional de cuyo), consejo nacional de investigaciones cient́ıficas y técnicas, avda. e. bustillo 9500, 8400 san carlos de bariloche, ŕıo negro, argentina. clock building since the 13th century. in modern clocks, oscillators are built from synthetic quartz crystals and feedback is implemented electronically. at the microand nanoscale, quartz crystals are expected to be replaced by simpler mechanical oscillators such as tiny vibrating silica beams [2, 3], which are easily built during circuit printing and can be actuated by very small electric fields [4, 5]. in a series of recent experiments on micromechanical oscillators, it has been shown that selfsustained oscillations can be achieved with a feedback force proportional to the oscillation velocity [6]. 
since a purely linear mechanical system cannot display stable periodic motion, this kind of feedback force must necessarily be compensated by some nonlinear contribution from the oscillator dynamics itself. in the experiments, this balance turned out to come from the damping force, which was proportional to the velocity but showed a nonlinear dependence on the oscillation amplitude. in the present paper, we analyze a model for self-sustained periodic motion in a mechanical oscillator subjected to a feedback force proportional to the velocity, with nonlinear amplitude-dependent terms both in the damping and in the restoring force. emphasis is put on the effects of time delays in the feedback circuit, which modify the phase shift between the feedback force and the velocity. it is shown that these delays affect the response of the oscillator, to the point that stable periodic motion can even be suppressed. the model is studied analytically within suitable approximations, and the results are compared with numerical solutions to the equation of motion.

ii. mechanical model for self-sustained oscillations

our model is based on an equation of motion for a one-dimensional variable x(t), which represents the departure from equilibrium of a mechanical oscillator:

\[ \ddot{x} + \mu (1 + \alpha x^2)\, \dot{x} + (1 + \beta x^2)\, x = g\, \dot{x}(t - \tau), \qquad (1) \]

where µ is the damping coefficient per unit mass. the coefficients α and β weight the amplitude-dependent nonlinear corrections to the damping force and to the elastic force, respectively. the former is a van der pol-like nonlinearity, while the cubic contribution to the restoring force defines a duffing oscillator [7]. both kinds of nonlinearity, with α, β > 0, have been experimentally verified to occur in micromechanical oscillators formed by silica beams clamped at their two ends (clamped-clamped, or c-c beams [4, 6]). in the main oscillation mode, c-c beams vibrate much like a plucked string, so that x(t) can be associated with the displacement of the middle point of the beam with respect to its rest position.

the right-hand side of eq. (1) represents the feedback force per unit mass, which is proportional to the velocity at the delayed time t − τ. as stated in the introduction, this delay is expected to originate in the electric circuit that reads, conditions, and reinjects the oscillator signal, due to the time elapsed during signal processing. although τ should be a very short time, it is not necessarily negligible compared with other time scales in the system, in particular with the oscillation period. in fact, micro- and nano-oscillators vibrate with frequencies from the order of 100 khz [4] to 1 ghz [8]. time units in eq. (1) have been chosen in such a way that the natural frequency of the undamped (µ = 0), linear (β = 0), unforced (g = 0) oscillator equals unity. meanwhile, the units of x can be fixed in such a way that the coefficient α adopts any prescribed value. hence, without loss of generality and for future convenience, we fix α = 4.

i. approximate stationary solutions

nonlinearity in eq. (1) prevents obtaining an exact solution. approximate stationary solutions can be found by the standard procedure of neglecting higher-harmonic contributions to the oscillations [7]. within this approximation, and proposing x(t) = (1/2) a exp(iωt) + c.c., we get a complex algebraic equation whose real and imaginary parts read

\[ 1 - \omega^2 + b a^2 = g \omega \sin \omega\tau, \qquad (2) \]

with b = 3β/4, and

\[ \mu (1 + a^2) = g \cos \omega\tau. \qquad (3) \]
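as a side check of eqs. (2) and (3) –not part of the original derivation– one can project the residual of eq. (1) onto the first harmonics with a computer algebra system. the sketch below uses sympy and the equivalent real ansatz x = a cos ωt; setting the two projections to zero should reproduce eqs. (2) and (3) once α = 4 is substituted, up to overall factors.

```python
import sympy as sp

t, w, tau = sp.symbols("t omega tau", positive=True)
a, mu, alpha, beta, g = sp.symbols("a mu alpha beta g", positive=True)

x = a * sp.cos(w * t)            # real form of the ansatz x = (1/2) a exp(i omega t) + c.c.
res = (sp.diff(x, t, 2) + mu * (1 + alpha * x**2) * sp.diff(x, t)
       + (1 + beta * x**2) * x - g * sp.diff(x, t).subs(t, t - tau))

T = 2 * sp.pi / w
# fourier projections of the residual on cos(omega t) and sin(omega t);
# the third-harmonic terms generated by the cubic nonlinearities are discarded
p_cos = sp.simplify((2 / T) * sp.integrate(res * sp.cos(w * t), (t, 0, T)))
p_sin = sp.simplify((2 / T) * sp.integrate(res * sp.sin(w * t), (t, 0, T)))

# p_cos = 0 should reproduce eq. (2) (with b = 3*beta/4), and p_sin = 0 should
# reproduce eq. (3) after substituting alpha = 4, up to overall factors of a and omega
print(sp.factor(p_cos))
print(sp.factor(p_sin.subs(alpha, 4)))
```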
these are equations for the unknowns a and ω, which have been obtained assuming a, ω ≠ 0. note that, to have a non-negative solution for a², eq. (3) requires that g ≥ µ. physically, this amounts to requiring that the energy input from the feedback force is not less than the energy dissipated by damping. otherwise, stationary periodic motion cannot be sustained. in the following, thus, we work under the assumption that this condition holds. squaring and summing up eqs. (2) and (3), we obtain

\[ (1 - \omega^2 + b a^2)^2 + \mu^2 \omega^2 (1 + a^2)^2 = g^2 \omega^2, \qquad (4) \]

namely, a relation between the oscillation amplitude a and the frequency ω which does not explicitly involve the delay τ. black (full and dotted) lines in fig. 1 represent this relation for the parameters indicated in the figure.

figure 1: amplitude-frequency relation, eq. (4), for µ = 0.1, g = 0.12, and b = 4. full and dotted black lines show the branches where τ is positive and negative, respectively. the green dot stands at the point of maximal amplitude, corresponding to delay τ = 0. for positive delays, oscillations become suppressed for τ = τ0, at frequency ω0, where the amplitude vanishes. the magenta line is the short-delay approximation, and the cyan line is the backbone curve. the inset shows a close-up around the peak.

the graph of the relation between amplitude and frequency expressed by eq. (4) can be interpreted as a resonance curve, in the sense that it characterizes the response of the oscillator to the feedback force. in fact, its shape resembles the upper part of the duffing resonance curve [7], and its mathematical origin is similar. from the physical point of view, however, it is important to stress that ω is not a control parameter –as in the standard case of an externally forced system– but emerges as an autonomous dynamical property of the self-sustained oscillator. the parameter whose variation defines the resonance curve of fig. 1 is, on the other hand, the time delay τ. in the figure, we have highlighted the point corresponding to τ = 0 which, as can be seen from eq. (3), corresponds to the maximum of the amplitude,

\[ a_{\max} = \sqrt{\frac{g}{\mu} - 1}. \qquad (5) \]

note that, due to the smooth profile of the resonance curve at its peak, the frequency at τ = 0, given by \(\sqrt{1 + b a_{\max}^2}\), does not equal the maximum frequency attainable by the oscillator, but lies slightly below it. this is better appreciated in the figure inset. different line types (full and dotted) in fig. 1 represent the branches of positive and negative τ. naturally, in an experiment, only the branch of positive time delays can be observed. however, stationary oscillatory solutions exist –and, as we discuss later, are stable– irrespective of the sign of τ.

ii. oscillation suppression

the most relevant dynamical feature emerging from eqs. (2) and (3) is that, as τ varies from zero to both positive and negative values, the oscillation amplitude decreases and, eventually, vanishes for sufficiently long delays. this situation is reached at the points where the resonance curve in fig. 1 intersects the horizontal axis. focusing on the branch of positive delays, oscillations are suppressed when their frequency reaches

\[ \omega_0 = \sqrt{1 + \Delta - \sqrt{(1 + \Delta)^2 - 1}}, \qquad (6) \]

with ∆ = (g² − µ²)/2, corresponding to a critical delay

\[ \tau_0 = \omega_0^{-1} \left| \arccos \frac{\mu}{g} \right|. \qquad (7) \]
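as a quick illustration of eqs. (5)–(7), the short sketch below evaluates the maximal amplitude, the corresponding frequency and the suppression point for the parameters of fig. 1 (µ = 0.1, g = 0.12, b = 4); the numerical values quoted in the comments are approximate and given only for orientation.

```python
import math

mu, g, b = 0.1, 0.12, 4.0        # parameters of fig. 1

# eq. (5): maximum amplitude, reached at zero delay
a_max = math.sqrt(g / mu - 1.0)
# frequency at tau = 0 (see the discussion below eq. (5))
w_at_tau0 = math.sqrt(1.0 + b * a_max**2)

# eqs. (6)-(7): frequency and critical delay at which oscillations are suppressed
delta = (g**2 - mu**2) / 2.0
w0 = math.sqrt(1.0 + delta - math.sqrt((1.0 + delta)**2 - 1.0))
tau0 = abs(math.acos(mu / g)) / w0

print(f"a_max ~ {a_max:.3f}, omega(tau=0) ~ {w_at_tau0:.3f}")   # ~0.447, ~1.342
print(f"omega_0 ~ {w0:.3f}, tau_0 ~ {tau0:.3f}")                # ~0.967, ~0.605
```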
oscillation suppression can be understood in qualitative terms as a consequence of the increasing phase shift between the feedback force and the velocity when τ grows. as the phase shift becomes larger, the “timing” of the energy supply by feedback fails to counteract the energy dissipation by damping. the energy balance –implicit in eq. (3)– can no longer be maintained, and periodic motion dies out.

figure 2 shows the critical delay τ0 at which oscillations are suppressed, as a function of the feedback amplitude g, and for various values of the damping µ. for g just above µ, τ0 grows from zero and –if µ is small enough– attains a plateau around π/2. note that, since the oscillation frequency is always near unity, in this plateau the phase shift between feedback and velocity is also close to π/2. in other words, in this zone the feedback force is in phase with the displacement x(t). for larger values of g, τ0 leaves the plateau and grows further.

figure 2: critical time delay for oscillation suppression, τ0 as given by eq. (7), vs. the feedback force amplitude g, for various values of the damping coefficient µ.

we point out that the very small values of the damping µ considered in fig. 2 are realistic when working with micro- and nanomechanical oscillators. indeed, these devices have quality factors q (∼ µ⁻¹) typically above 10⁴ [4]. in fig. 1, on the other hand, we have chosen a relatively large value of µ for clarity in the graphical representation. for very small values of µ, in fact, the resonance curve becomes too narrow for its features to be clearly discerned in a plot.

iii. stability of self-sustained oscillations

the stability of the stationary oscillatory solutions presented above can be assessed by the method of multiple time scales [7], which provides equations of motion for the relatively slow dynamics of the oscillation amplitude and phase. the method works under the assumption that there is a clear separation between the oscillation period and the other time scales involved in the system. in the present case, this condition is fulfilled for small damping, µ ≪ 1, as it is energy dissipation which controls the relaxation of the amplitude and frequency to their asymptotic values. application of the method of multiple scales to eq. (1) straightforwardly shows that, when oscillatory solutions of non-vanishing amplitude exist, they are stable under arbitrary perturbations to their amplitude and phase (or frequency), irrespective of the sign of the delay.

instead of giving the details of this standard calculation, we resort here to a simple physical argument to explain stability, which sheds useful light on the role of nonlinearity. assume that the oscillator is in stationary periodic motion. this implies, in particular, that the energy input from feedback is exactly balanced by dissipation. if motion is now perturbed in such a way that the oscillation amplitude increases, the velocity increases accordingly, making feedback and damping grow in the same proportion (since both are proportional to the velocity). however, due to the amplitude-dependent nonlinear correction in the damping coefficient, there is an extra growth in damping which enhances dissipation and therefore counteracts the perturbation, as the amplitude will tend to decrease. a symmetric argument applies if, on the contrary, the amplitude is perturbed to lower values.
if, on the other hand, the perturbation makes the frequency increase, the velocity also increases, and so do feedback and damping. through the cubic nonlinearity in the restoring force, however, such a change in the frequency leads the amplitude to grow. this growth, in turn, implies an enhancement in dissipation, as explained in the preceding paragraph. consequently, the perturbation is counteracted by the oscillator response. the case where the frequency decreases is analogous. under perturbations of both amplitude and frequency, therefore, nonlinearity plays a key role in ensuring the stability of stationary oscillations.

iii. further approximations

i. backbone approximation

as mentioned in section ii.ii., a realistic limit when working with micro- and nanomechanical oscillators consists in considering very small values of the damping coefficient µ. in order to ensure the existence of physically meaningful solutions for the amplitude and the frequency, however, eq. (3) requires that the limit µ → 0 is taken along with the limit g → 0, maintaining a finite ratio r = g/µ. in physical terms, this joint limit is justified by the fact that the smaller the rate of energy dissipation, the smaller the feedback force necessary to maintain oscillatory motion. in the crudest approximation, the resonance curve collapses to a single-valued curve with equation

\[ a = \sqrt{\frac{\omega^2 - 1}{b}}, \qquad (8) \]

plotted as a cyan line in fig. 1. this is the so-called backbone approximation to the resonance curve [7, 9]. within this approximation, the time delay as a function of the frequency along the backbone curve is

\[ \tau = \omega^{-1} \arccos\!\left( \frac{\omega^2 - 1 + b}{r b} \right). \qquad (9) \]

although, for the relatively large value of µ considered in fig. 1, the backbone curve gives a poor approximation to the resonance curve, as µ becomes smaller the two branches of the latter collapse onto the backbone. a measure of this collapse is provided by computing the frequency of oscillation suppression, ω0, in the limit of small damping. equation (6) yields

\[ \omega_0 \approx 1 - \frac{\mu}{2} \sqrt{r^2 - 1}, \qquad (10) \]

showing that the width of the resonance curve at its base (a = 0) is proportional to µ. in this limit, the time delay for oscillation suppression is still given by eq. (7).

ii. short-delay approximation

another relevant approximation, which drastically simplifies the problem of solving eqs. (2) and (3), is obtained for small values of τ. in fact, approximating sin ωτ ≈ ωτ and cos ωτ ≈ 1 − ω²τ²/2, we obtain linear equations for a² and ω², whose solutions yield

\[ a = \sqrt{\frac{2 (g - \mu)(1 + g\tau) - g\tau^2}{2\mu (1 + g\tau) + b g \tau^2}} \qquad (11) \]

and

\[ \omega = \sqrt{\frac{2\left[ b (g - \mu) + \mu \right]}{2\mu (1 + g\tau) + b g \tau^2}}. \qquad (12) \]

the magenta line in fig. 1 represents these results; for the parameters of the figure, this is in fact an excellent approximation along the entire resonance curve.

iv. numerical results

as a validation of the analytical results obtained by means of the approximation considered in section ii., i.e., neglecting the higher-harmonic contributions to oscillatory motion, we have numerically solved eq. (1) for various parameter sets. we have used a fourth-order runge-kutta scheme, with the only non-standard feature that, due to the delay in the feedback force, it is necessary to specify the solution x(t) for all times −τ ≤ t ≤ 0, instead of just the initial condition at t = 0. in all cases, we have considered a constant value of x(t), hence with ẋ(t) = 0, in that interval.
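a minimal sketch of such an integration scheme is given below; it is not the authors' code, and it makes the simplifying assumptions that the delay is rounded to an integer number of time steps and that the delayed velocity is held fixed during each runge-kutta step.

```python
def integrate(mu=0.1, g=0.12, alpha=4.0, beta=16/3, tau=0.1,
              x0=0.1, h=0.005, t_end=500.0):
    """fixed-step rk4 for eq. (1). a sketch, not the authors' code: the delay is an
    integer number of steps and the delayed velocity is frozen within each step."""
    m = max(1, round(tau / h))            # delay measured in steps
    n = int(t_end / h)
    xs = [x0] * (m + 1)                   # constant history: x = x0, v = 0 for -tau <= t <= 0
    vs = [0.0] * (m + 1)

    def acc(x, v, v_delayed):
        # eq. (1): xdd = -mu (1 + alpha x^2) xd - (1 + beta x^2) x + g xd(t - tau)
        return -mu * (1 + alpha * x * x) * v - (1 + beta * x * x) * x + g * v_delayed

    for i in range(m, m + n):
        x, v = xs[i], vs[i]
        vd = vs[i - m]                    # delayed velocity, taken from the stored history
        k1x, k1v = v, acc(x, v, vd)
        k2x, k2v = v + 0.5 * h * k1v, acc(x + 0.5 * h * k1x, v + 0.5 * h * k1v, vd)
        k3x, k3v = v + 0.5 * h * k2v, acc(x + 0.5 * h * k2x, v + 0.5 * h * k2v, vd)
        k4x, k4v = v + h * k3v, acc(x + h * k3x, v + h * k3v, vd)
        xs.append(x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6)
        vs.append(v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6)
    return xs

xs = integrate()                 # beta = 16/3 gives b = 3*beta/4 = 4, as in fig. 1
tail = xs[-20000:]               # long-time (post-transient) part of the run
print("late-time amplitude ~", 0.5 * (max(tail) - min(tail)))   # should come out near 0.43 for tau = 0.1
```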
for all the parameter sets taken into account in the numerical calculations, we have found that, after a transient time which scaled as µ−1, the oscillator reached periodic motion. this observation is in qualitative agreement with our argument of section ii.iii., which predicts stability of periodic motion under very general conditions. a more quantitative comparison between theoretical and numerical results, illustrated for the parameter set of fig. 1, is given in fig. 3. the main panel shows, as green dots, numerical measurements of the frequency and amplitude of long-time periodic motion for various (positive) values of the delay in the interval 0 ≤ τ ≤ 0.4. the black full line stands for the theoretical result. the inset shows the same data in the amplitude-delay plane. we see that the agreement is generally very good. not unexpectedly, the theoretical approximation improves towards smaller amplitudes, where nonlinear effects are weaker and periodic motion is better described by a pure harmonic oscillation, as assumed in section ii. as a matter of fact, fig. 3 is limited to amplitudes above 0.25 since, in the plot, theoretical and numerical results are indistinguishable for smaller values. the agreement between our theoretical and numerical results rises the question to what extent higher-harmonic components, which are a direct product of nonlinearity but have been neglected in our analytical approximation, constitute a substan090003-5 papers in physics, vol. 9, art. 090003 (2017) / d. h. zanette 1.1 1.2 1.3 0.25 0.30 0.35 0.40 0.45 0.3 0.4 0.0 0.1 0.2 0.3 0.4 am pl itu de frequency theoretical numerical time delay am pl itu de figure 3: theoretical (black line) and numerical (green dots) results for the amplitude-frequency relation of periodic oscillatory motion, with µ = 0.1, g = 0.12, and b = 4, and for various values of the delay in the interval [0, 0.4]. the inset shows the same data in the amplitude-delay domain. dotted green lines have been plotted as a guide to the eye. tial contribution to oscillatory motion. to evaluate this, we calculate the amplitudes of different fourier components in the long-time numerical solution for x(t). with the parameters of fig. 3 and τ = 0.1, for which the oscillation amplitude and frequency are a = 0.428 and ω = 1.303, the first fourier amplitude (corresponding to frequency ω) is a1 = 0.42. due to the cubic nonlinearity in the restoring force, in turn, the next significant amplitude corresponds to the third-harmonic component (frequency 3ω), a3 = 0.0076. higher-harmonic amplitudes are even smaller. the first correction to harmonic motion, thus, is almost two orders of magnitude weaker than the main contribution, which reasonably justifies our analytical approximation. on the other hand, this modest contribution of higher-harmonic components to the overall motion markedly contrasts with the sizable phenomenology studied in section ii., which is also a direct consequence of nonlinearity. v. conclusions we have analyzed the dynamics of a mechanical oscillator whose periodic motion is sustained by a feedback force proportional to the oscillation velocity. the key ingredient that makes oscillations stable is a nonlinear dependence of damping with the oscillation amplitude, such that energy dissipation increases or decreases when the amplitude respectively grows or drops. 
this kind of nonlinearity, together with a cubic component in the restoring force, has been experimentally observed to occur in c-c beam micromechanical oscillators [6], which can therefore exhibit self-sustained motion under the action of linear velocity feedback. our emphasis was put on the effect of a time delay in the feedback force, assumed to originate in the electric circuit that reads, conditions, and reinjects the oscillation signal. the most significant consequence of this ingredient is that oscillation can be suppressed if the delay is large enough. it is well known that differential equations with time-delayed terms can exhibit either stationary oscillatory solutions or fixed rest points, depending on the delay [10]. also, oscillation suppression (or “death”) due to delays has been reported to occur in a variety of dynamical systems, such as synchronized limitcycle oscillators [11,12]. here, this same occurrence has been characterized for a nonlinear mechanical system that can become relevant for the design of micromechanical time-keeping devices. in connection with possible applications of this phenomenology, it is worth mentioning that classical-mechanical models such as eq. (1) provide a suitable starting point for the description of microand nanomechanical machines. as length scales become smaller, however, the effects of thermal fluctuations and electrical noise cannot be further ignored [13, 14]. from the viewpoint of our model, therefore, an important step forward would be to include noise in the theoretical analysis. acknowledgements this project has been supported by agencia nacional de promoción cient́ıfica y tecnológica, argentina, through grant pict2014-1611. fruitful collaboration with d. antonio, s. arroyo, changyao chen, d. czaplewski, j. guest, d. lópez, and f. mangussi, as well as 090003-6 papers in physics, vol. 9, art. 090003 (2017) / d. h. zanette discussions with s. risau gusmn, are gratefully acknowledged. [1] b yurke, d s greywall, a n pargellis, p a busch, theory of amplifier-noise evasion in an oscillator employing a nonlinear resonator, phys. rev. a 51, 4211 (1995). [2] d bishop, p gammel, r giles, the little machines that are making it big, phys. today 54, 38 (2001). [3] k l ekinci, m l roukes, nanoelectromechanical systems, rev. sci. instrum. 76, 061101 (2005). [4] d antonio, d h zanette, d lópez, frequency stabilization in nonlinear micromechanical oscillators, nat. commun. 3, 806 (2012). [5] d antonio, d a czaplewski, j r guest, d lópez, s i arroyo, d h zanette, nonlinearityinduced synchronization enhancement in micromechanical oscillators, phys. rev. lett. 114, 034103 (2015). [6] changyao chen, d h zanette, j r guest, d a czaplewski, d lópez, self-sustained micromechanical oscillator with linear feedback, phys. rev. lett. 117, 017203 (2016). [7] a h nayfeh, d t mook, nonlinear oscillations, wiley, new york (2008). [8] h b peng, c w chang, s aloni, t d yuzvinsky, a zettl, ultrahigh frequency nanotube resonators, phys. rev. lett. 97, 087203 (2006). [9] s i arroyo, d h zanette, duffing revisited: phase-shift control and internal resonance in self-sustained oscillators, eur. phys. j. b 89, 12 (2016). [10] t erneux, applied delay differential equations, springer, new york (2009). [11] d v ramana reddy, a sen, g l johnston, time delay induced death in coupled limit cycle oscillators, phys. rev. lett. 80, 5109 (1998). [12] s h strogatz, death by delay, nature 394, 316 (1998). [13] a n cleland, m l roukes, noise processes in nanomechanical resonators, j. 
appl. phys. 92, 2758 (2002). [14] p ward, a duwel, oscillator phase noise: systematic construction of an analytical mode encompassing nonlinearity, ieee trans. ultrason. ferroelectr. freq. control 58, 195 (2011). 090003-7 papers in physics, vol. 8, art. 080001 (2016) received: 2 november 2015, accepted: 22 december 2015 edited by: c. s. o’hern reviewed by: a. baule, queen mary university of london, uk. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.080001 www.papersinphysics.org issn 1852-4249 ergodic–nonergodic transition in tapped granular systems: the role of persistent contacts paula a. gago,1–3 diego maza,4 luis a. pugnaloni1, 2∗ static granular packs have been studied in the last three decades in the frame of a modified equilibrium statistical mechanics that assumes ergodicity as a basic postulate. the canonical example on which this framework is tested consists in the series of static configurations visited by a granular column subjected to taps. by analyzing the response of a realistic model of grains, we demonstrate that volume and stress variables visit different regions of the phase space at low tap intensities in different realizations of the experiment. we show that the tap intensity beyond which sampling by tapping becomes ergodic coincides with the forcing necessary to break all particle–particle contacts during each tap. these results imply that the well-known “reversible” branch of tapped granular columns is only valid at relatively high tap intensities. i. introduction granular matter is ubiquitous in nature. however, due to the complexity of the real particle– particle interactions, the standard approaches of continuum mechanics and thermodynamics are still limited in providing meaningful descriptions of the states in which these systems can be. edwards and oakeshot introduced a tentative approach inspired by the ideas of equilibrium statistical mechanics to formally describe the global properties of a static ∗e-mail: luis.pugnaloni@frlp.utn.edu.ar 1 dpto. ingenieŕıa mecánica, facultad regional la plata, universidad tecnológica nacional, av. 60 esq. 124, 1900 la plata, argentina. 2 consejo nacional de investigaciones cient́ıficas y técnicas, argentina. 3 current address: department of earth science and engineering, imperial college london, south kensington campus, london sw7 2az, uk. 4 departamento de f́ısica y matemática aplicada, facultad de ciencias, universidad de navarra, navarra, spain. granular pack. since the introduction of this theory —where the entropy of the systems is governed by the spatial disorder of the grains [1]—, a number of studies have used it to frame the interpretation of the results of specific experiments. the most relevant case is the so-called “chicago experiment”, where a column of grains was repeatedly tapped following an annealing-type protocol [2, 3]. the main outcome of this experiment is that a stationary state can be reached, where the mean volume fraction, φ, is a well defined function of the tap amplitude, γ. others have also obtained seemingly reproducible states without the need of annealing [4]. however, it has been shown recently, by simulation of frictionless grains, that these stationary states are not necessarily ergodic [5]. at low γ, different members of an ensemble of steady-states prepared with a well defined protocol may sample a different region of the phase space, as the fluctuations of φ indicate. 
in this paper, we demonstrate that not only the volume but also the force moment tensor, σ, are sampled in a non-ergodic fashion and that ergod080001-1 papers in physics, vol. 8, art. 080001 (2016) / p. a. gago et al. icity seems to be recovered if all particle–particle contacts are lost during each tap. this sets a clear limit to the range of driving forces able to generate a sequence of configurations for which the edwards framework can be applied. ii. numerical protocol we simulated using the lammps package [6] a quasi-two-dimensional cell containing n = 1000 spherical particles of diameter d. the cell is 1.1 d thick and 27.8 d wide (the granular column is about 35 layers deep) to have a one to one representation of a previously introduced experimental device [7, 8]. we use a model for soft frictional spheres described in refs. [9, 10]. the normal component, fn, of the contact interaction is given by an elastic repulsive force proportional to the overlap of the interacting spheres and a dissipative term proportional to the normal component of the relative velocity. the tangential term, ft, implements an elastic shear force and a damping force. the shear force takes into account the cumulated tangential displacement between the particles while they remain in contact. whenever ft > µfn (µ is the friction coefficient), this lower dynamic friction force is used. in this work we use the same interaction parameters as in ref. [11, 12]. the wall–particle interaction is defined with the same parameters as the particle–particle force. tapping is simulated by imposing an external vertical motion to the cell. this pulse is a single sinusoidal cycle a sin(ωt). we fix ω = 2π×33 hz and use the tap amplitude a as control parameter. the tap intensity is characterized by γ = aω2/g. the mechanical equilibrium after each tap is deemed achieved if the kinetic energy of each particle has fallen (in average) below 10−6mgd. where m is the mass of one particle and g the acceleration of gravity. we study 20 independent realizations of a decreasing ramp of the tap amplitude. we initially fill the cell by placing the spheres at random positions before letting them deposit under the action of gravity. in each realization, we decrease γ in small steps, from γ = 20.0 down to γ = 0.8, and apply 200 taps for each γ. note that for γ < 1.0, the column of grains does not detach from the base during a tap. the 200 taps at each value of γ are enough to reach a steady-state. we do not observe any drift of the mean values of φ or σ after the initial 100 taps, which we will discard later in our analysis. finally, we also study a cyclic annealing protocol: starting from the final configuration at γ = 0.8 for each of the former realizations; the tap amplitude is cyclically increased and decreased every 200 taps in the range 0.8 < γ < 5.5 in order to compare the steady-states reached with an alternative method. iii. data analysis to measure the packing fraction we use the 2d voronoi tessellation (implemented in [13]) of the x–z plane projection of the particle positions, disregarding the third coordinate on the thin direction of the cell. then, we associate to each particle a “local volume” fraction by dividing the particle area by the corresponding voronoi area. in order to avoid boundary effects, we disregard particles closer than 2d to the lateral walls. following the recommendations in ref. 
[14], we analyze horizontal slices of the granular column 15 d thick, measured at approximately the same depth with respect to the free surface, in order to retrieve results for the force moment tensor that are not biased by the uneven free surface. averaging over the n particles contained in the slice of interest, we obtain the volume fraction of each static configuration. to obtain the steady-state φ corresponding to a given γ, we average this quantity over the last 100 configurations obtained for each tap intensity. we also obtain the force moment tensor

\[ \sigma_i^{\alpha\beta} = \sum_c r_c^{\alpha} f_c^{\beta} \]

of each particle in the slice of interest. here, the sum runs over all the contacts c of the particle, \(\vec{r}_c\) is the vector from the center of the grain to the contact c, and \(\vec{f}_c\) is the corresponding contact force. we apply the same averaging protocol used for φ to obtain the force moment tensor of each configuration and σ for the steady-state of a given realization and γ.
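the per-particle force moment tensor defined above is straightforward to accumulate from a contact list. the sketch below is only an illustration: the contact-list format is an assumption (it is not the lammps output format), and the sign of tr(σ) for compressive states depends on the orientation convention chosen for the contact forces.

```python
import numpy as np

def force_moment_tensors(contacts, n_particles, dim=3):
    """sigma_i^{ab} = sum_c r_c^a f_c^b, accumulated per particle.
    `contacts` is assumed to be a list of (i, j, branch_i, branch_j, force) tuples,
    where branch_i is the vector from the centre of particle i to the contact point
    and `force` is the force exerted on particle i by particle j."""
    sigma = np.zeros((n_particles, dim, dim))
    for i, j, r_i, r_j, f in contacts:
        sigma[i] += np.outer(r_i, f)      # contribution to particle i
        sigma[j] += np.outer(r_j, -f)     # reaction on particle j
    return sigma

# toy example: two spheres of diameter d pressed together along x;
# m*g is set to 1, so the trace is directly in the m*g*d units of fig. 1(b)
d, fn = 1.0, 2.5
contacts = [(0, 1, np.array([ d/2, 0.0, 0.0]),       # branch vector of particle 0
                   np.array([-d/2, 0.0, 0.0]),       # branch vector of particle 1
                   np.array([-fn,  0.0, 0.0]))]      # repulsive force on particle 0
sigma = force_moment_tensors(contacts, n_particles=2)

# slice average of the trace (here, over both particles); compressive contacts give
# a negative value with this sign convention (-1.25 for this toy contact)
print(np.trace(sigma, axis1=1, axis2=2).mean())
```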
iv. results

in fig. 1(a), the ensemble average of the steady-state φ (i.e., averaged over the 20 independent realizations) is displayed as a function of γ. the error bars correspond to the standard deviation over the 20 realizations. as observed by a number of authors, the curve seems to be very well defined, with independent realizations falling within a very narrow range of φ values for any given γ. in the past, this led to the conclusion that this was a truly reversible process, where lowering or raising γ would lead to the same steady-state φ.

figure 1: ensemble average of the steady-state packing fraction φ (a) and trace of the force moment tensor tr(σ) (b) as a function of γ. the error bars correspond to the standard deviation over the 20 averaged realizations. the insets show the same results and two of the 20 independent realizations (dashed lines) in the low-tap-intensity region. the error bars on the single-realization data correspond to the estimated standard error of the mean.

in the inset of fig. 1(a), two of the independent realizations are shown for the low-tap-intensity region. from this picture, it is clear that steady-states corresponding to a given γ can differ from one realization to another. notice that in the inset the error bars for the two isolated realizations correspond to the standard error of the mean (sem), which gives an estimate of the uncertainty of the reported mean value rather than of the size of the φ fluctuations. for these two realizations, although the mean φ seems to agree within the estimated error for intensities γ > 1.5, it is clear that they are different at low γ. this is consistent with the findings of paillusson and frenkel [5] for frictionless spheres under event-driven simulations. however, in our simulations we are able to extract the stress state of the system as well as the history of the contacts. these reveal valuable information, as we discuss below.

figure 2: φ (a) and tr(σ) (b) as a function of the tap number for two of the 20 independent realizations at γ = 0.8. the notched boxes and “violin” diagrams shown suggest that both realizations can hardly be considered as representing the same steady-state.

in fig. 1(b), we show the trace tr(σ) of σ averaged over all 20 realizations as a function of γ. as before, the error bars indicate the standard deviation over the 20 realizations. as we can see, the variability of the mean stress is significantly large at low γ. this is not due to large fluctuations during a given series of taps, but to the variations observed from one realization of the protocol to another. indeed, the inset in fig. 1(b) shows that, for low γ, the mean values of tr(σ) for two of the 20 realizations have a relatively small sem (i.e., the fluctuations in each realization are small). however, the realizations differ from each other. the difference between results corresponding to different realizations becomes much more evident here than in the case of the φ–γ plot. we suggest that the stress tensor may be more sensitive, and thus more suitable, for sensing whether ergodicity is fulfilled in experimental data. overall, as γ is decreased, different realizations explore non-overlapping ranges of volume/stress. therefore, temporal averages (over a single time series) do not match ensemble averages (over the realizations).

figure 3: steady-state φ (a) and tr(σ) (b) as a function of γ corresponding to a full annealing protocol, starting from the increasing ramp shown as filled (red) circles and followed by filled (blue) squares, open (green) circles and open (magenta) squares. the error bars correspond to the estimated sem.

since the number of taps we have explored for each γ may be too small to assume that the steady-state has been properly sampled, we carried out 2000 additional taps at γ = 0.8 for each realization. since some of the signals are not normally distributed, we confirmed the stationarity of these states by using a non-parametric test at a significance level of 5% [15, 16]. two of the normally distributed realizations are displayed in fig. 2 as a function of the tap number. comparing the corresponding notched boxes and “violin” diagrams of both signals, it is clear that the states do not match. hence, the reversible branch found in ref. [3] is not truly reversible at low γ in our case, since genuinely stationary states with distinguishable φ and tr(σ) may be obtained in different realizations of the annealing protocol. of course, this is harder to detect in φ, since the dispersion between realizations is much smaller than the range of φ values obtained at different γ. the former results also confirm the hypothesis that non-ergodicity is present in a typical tapping protocol beyond the special case reported in ref. [5], where the steady-states obtained did not follow any annealing-type protocol. hence, we observe this non-ergodic behavior even after annealing the system from high tapping strengths. to stress this point, and to assess whether the speed of the annealing may prevent the system from reaching a unique steady-state in each realization, we apply a slower cyclic annealing protocol (similar to the one introduced in [2]) to each of the 20 final states at low γ in order to reproduce the “reversible branch”. in fig. 3, we display a sequence of two successive up and down ramps applied to one of the 20 initial realizations, using γ-steps about one half of those used in fig. 1.
although on the scale used for φ the steady-state packing fraction seems reversible, a closer inspection shows that the states have distinguishable φ at low γ [see fig. 3(a) and the corresponding inset]. this is much more evident when the stress is analyzed [see fig. 3(b)].

in order to set a criterion for deciding whether the steady-states are non-ergodic for a given γ, we show in fig. 4 the p-values of the kruskal–wallis [17] one-way analysis of variance performed on the 20 realizations at each γ. this simple non-parametric test allows for the rejection of the null hypothesis that all 20 data series are drawn from a unique distribution (which does not need to be normal), and hence that they correspond to a unique steady-state. if the p-value is above the significance level (in our case p > 0.01), then we cannot rule out the possibility that the 20 series come from the same steady-state. as we can see from fig. 4(a), the test run on the data for φ indicates that for γ < 5.0 the null hypothesis must be rejected, and therefore at least two out of the 20 steady-states are not the same. however, for higher γ the p-value is above the significance level, and the 20 realizations may then correspond to the same steady-state. interestingly, when the test is run on σ [see fig. 4(b)], the steady-state seems to be unique for all 20 realizations if γ > 3.75. although differences between realizations are simpler to detect on visual inspection of tr(σ), it is actually φ that sets the higher threshold for the γ values needed to ensure an ergodic steady-state (i.e., γ > 5.0).

figure 4: p-value (up-triangles, right axis) as a function of γ for the kruskal–wallis [17] one-way analysis of variance for φ (a) and tr(σ) (b). the horizontal dotted line corresponds to the significance level used (1%). the black circles correspond to the φ and tr(σ) data from fig. 1.

figure 5: (a) percentage c/c0 × 100 of persistent contacts (black circles) as a function of γ, averaged over 5 taps on 6 independent realizations. the error bars correspond to the standard deviation over the 6 realizations. up-triangles (green) correspond to the p-values in fig. 4 for φ and down-triangles (magenta) to those for tr(σ). (b) same as (a) for frictionless grains.

the previous results indicate that the steady-states sampled at low tap intensities do not only depend on γ but also on the particular history of each realization. notice that this goes beyond the history-dependent out-of-equilibrium trajectories already reported in tapped systems [18], since here we are focusing on the steady-states. one may hypothesize that the constraints imposed by the contacts are one of the reasons for the non-ergodic behavior at low tap intensities. if a contact persists from one tap to the next, the contact force after coming back to rest will depend on the history of all the contacts of that particular grain. in order to test this idea, we analyze the evolution of all contacts during each tap and identify those that persist (i.e., contacts that did not break at any time during the pulse of energy).
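operationally, counting persistent contacts only requires intersecting the contact list of the static packing before the tap with the contact lists sampled while the pulse is applied. the sketch below assumes contacts are stored as sets of particle-index pairs sampled at some frequency during the tap; this bookkeeping is an illustration, not the actual implementation used here.

```python
def persistent_fraction(contacts_before, contacts_during_tap):
    """fraction c/c0 of contacts present before the tap that survive the whole pulse.

    `contacts_before` is a set of (i, j) pairs (i < j) describing the static packing
    before the tap; `contacts_during_tap` is a sequence of such sets sampled while
    the pulse is applied. a contact is persistent only if it appears in every sample."""
    if not contacts_before:
        return 0.0
    persistent = set(contacts_before)
    for snapshot in contacts_during_tap:
        persistent &= snapshot            # drop contacts that were broken at any time
    return len(persistent) / len(contacts_before)

# toy example: contact (0, 1) survives, (1, 2) opens at some point during the pulse
before = {(0, 1), (1, 2)}
during = [{(0, 1), (1, 2)}, {(0, 1)}, {(0, 1), (1, 2)}]
print(persistent_fraction(before, during))   # -> 0.5, i.e., c/c0 = 50%
```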
figure 5 shows the average ratio c/c0 of persistent contacts, c, to the total number of contacts, c0, as a function of γ. for this calculation, each contact was tracked during the final 5 taps for each γ on 6 of the independent realizations and only grains that fall within the layer of interest, as discussed above, where included in the analysis. the percentage of persistent contacts is very small but non-zero up to γ ≈ 5.0. as it is expected, when γ is increased sufficiently all the contacts are broken and new ones are made during each tap (resulting in c/c0 = 0). this transition coincides with the value of γ where the realizations seem to sample the same steady-state (see the pvalues included in fig. 5). therefore, when small taps are applied, the aging of some of the contacts seem to lead the system to sample different regions of the phase space during independent realizations. however, if all contacts are made anew at each tap, the sampling becomes compatible with the idea of ergodicity introduced in fig. 4. in order to generalize this result, we also simulated frictionless grains. interestingly, the same conclusion drawn for frictional grains is true for frictionless ones: different realizations seem to sample the same steady-state only if all contacts are made anew upon each tap [see fig. 5 (b)]. v. conclusions our analysis of the steady-states of tapped granular systems indicate that these states are historydependent for tap intensities below a certain threshold. this is in contradiction with the general assumption that macroscopic time averages —such as the volume fraction— can be recovered when the amplitude of the perturbation applied to the 080001-5 papers in physics, vol. 8, art. 080001 (2016) / p. a. gago et al. system is tuned back and forth. the differences between independent realizations become particularly noticeable in the stress distribution. these findings show that the postulates of the equilibrium statistical thermodynamics may not be always fulfilled to describe the steady state of static granular systems (see also ref. [19] for a discussion on the boltzmann distribution failure for an analytically solvable model). focusing on tap intensities that warrant that all contacts are made anew after each tap may allow exploring the available phase space in agreement with the ergodic hypothesis. however, gentle perturbations deserve an approach that includes memory effects to suitably describe the states. in that sense, non-equilibrium thermodynamic approaches may be a suitable alternative [20]. further research on such alternative formalisms, the effect of other types of forcing mechanisms (e.g., shear), and possible extensions to other complex systems (e.g., active matter) become necessary. acknowledgements this work has been partially supported by projects pict-2012-2155 anpcyt (argentina), fis2011-26675 mineco (spain) and piuna (universidad de navarra). [1] s f edwards, r b oakeshot, theory of powders, physica 157d, 1091 (1989). [2] e r nowak, j b knight, e ben-naim, h m jaeger, s r nagel, density fluctuations in vibrated granular materials, phys. rev. e 57, 1971 (1998). [3] e r nowak, j b knight, m l povinelli, h m jeager, s r nagel, reversibility and irreversibility in the packing of vibrated granular material, powder technol. 94, 79 (1997). [4] ph ribière, p richard, p philippe, d bideau, r delannay, on the existence of stationary states during granular compaction, eur. phys. j. e 22, 249 (2007). 
[5] f paillusson, d frenkel, probing ergodicity in granular matter, phys. rev. lett. 109, 208001 (2012). [6] s plimpton, p crozier, a thompson, lammps-large-scale atomic/molecular massively parallel simulator, sandia national laboratories (2007). [7] l a pugnaloni, i snchez, p a gago, j damas, i zuriguel, d maza, towards a relevant set of state variables to describe static granular packings, phys. rev. e 82, 050301 (2010). [8] s ardanza-trevijano, i zuriguel, r arévalo, d maza, topological analysis of tapped granular media using persistent homology, phys. rev. e 89, 052212 (2014). [9] n v brilliantov, f spahn, j-m hertzsch, t pöschel, model for collisions in granular gases, phys. rev. e. 53, 5382 (1996). [10] l e silbert, d ertaş, g s grest, t c halsey, d levine, s plimpton, granular flow down an inclined plane: bagnold scaling and rheology, phys. rev. e 64, 051302 (2001). [11] l a pugnaloni, j damas, i zuriguel, d maza, master curves for the stress tensor invariants in stationary states of static granular beds. implications for the thermodynamic phase space, papers in physics 3, 030004 (2011). [12] l a pugnaloni, m mizrahi, c m carlevaro, f vericat, nonmonotonic reversible branch in four model granular beds subjected to vertical vibration, phys. rev. e. 78, 051305 (2008). [13] c rycroft, voro++: a three-dimensional voronoi cell library in c++, chaos 19, 041111 (2009). [14] p a gago, l a pugnaloni, d maza, relevance of system size to the steady-state properties of tapped granular systems, phys. rev. e 91, 032207 (2015). [15] w constantine, d percival, fractal: fractal time series modeling and analysis. r package version 2.0-0, http://cran.rproject.org/package=fractal (2014). [16] r core team, r: a language and environment for statistical computing, r foundation for statistical computing, vienna, austria (2013). 080001-6 papers in physics, vol. 8, art. 080001 (2016) / p. a. gago et al. [17] w h kruskal, w a wallis, use of ranks in one-criterion variance analysis, j. am. stat. assoc. 47, 583 (1952). [18] ch josserand, a v tkachenko, d m mueth, h m jaeger, memory effects in granular material, phys. rev. lett. 85, 3632 (2000). [19] r m irastorza, c m carlevaro, l a pugnaloni, exact predictions from the edwards ensemble versus realistic simulations of tapped narrow two-dimensional granular columns, j. stat. mech. p12012 (2013). [20] g lebon, d jou, jc casas-vazquez, understanding non-equilibrium thermodinamics, springer-verlag, berlin heidelberg (2008). 080001-7 papers in physics, vol. 7, art. 070001 (2015) received: 19 january 2015, accepted: 25 february 2015 edited by: c. s. o’hern reviewed by: m. pica ciamarra, nanyang technological university, singapore. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070001 www.papersinphysics.org issn 1852-4249 wang–landau algorithm for entropic sampling of arch-based microstates in the volume ensemble of static granular packings d. slobinsky,1, 2∗ luis a. pugnaloni1, 2† we implement the wang–landau algorithm to sample with equal probabilities the static configurations of a model granular system. the “non-interacting rigid arch model” used is based on the description of static configurations by means of splitting the assembly of grains into sets of stable arches. this technique allows us to build the entropy as a function of the volume of the packing for large systems. 
we make a special note of the details that have to be considered when defining the microstates and proposing the moves for the correct sampling in these unusual models. we compare our results with previous exact calculations of the model made at moderate system sizes. the technique opens a new opportunity to calculate the entropy of more complex granular models. i. introduction in the study of static packings there exists still a lack of predictive capabilities of the available theories. assemblies of objects that pack (such as grains) can generally sample such packed configurations only by the external excitation of the system. these packings can be built by repeating a given packing protocol (e.g., homogeneous compression or deposition under an external field against a confining boundary) on an initial random configuration. also, a markovian or non-markovian series can be constructed by exciting the system from the previous packing configuration. to what extent the series of packings obtained (using either type of protocol) can be modeled without information on ∗e-mail: dslobinsky@frlp.utn.edu.ar †e-mail: luis.pugnaloni@frlp.utn.edu.ar 1 departamento de ingenieŕıa mecánica, facultad regional la plata, universidad tecnológica nacional, av. 60 esq. 124, 1900 la plata, argentina. 2 consejo nacional de investigaciones cient́ıficas y técnicas (conicet), argentina. the dynamics that drives the system to the packed configuration is still uncertain. the main reason for this is that the few statistical approaches that attempt to do this are strongly hinder by the poor current ability to generate such packed structures without using a dynamics to build the packings. one might expect that the packing fraction and its fluctuations, among other properties, could be obtained from basic statistics without resourcing to a full molecular dynamic type of simulations (also known as “discrete element method”, dem). although these types of simulations are powerful enough to predict the behaviour of most systems that pack, it is desirable to find a description that could neglect the detailed dynamics between consecutive packed configurations. the use of the tools provided by the ensemble theory of statistical mechanics in problems of granular matter is still limited. although many studies perform statistical analysis of configurations of a granular sample obtained during careful preparations in the laboratory or in molecular dynamictype simulations, very rarely sampling in a particular statistical ensemble is carried out either via an 070001-1 papers in physics, vol. 7, art. 070001 (2015) / d. slobinsky et al. analytic or a computational calculation of a model system. this has prevented a direct assessment as to whether ensemble theory is appropriate to describe the behaviour of these peculiar systems. one pioneering contribution to the topic is the idea that granular systems at mechanical equilibrium could be treated as ensemble members, putting forward the conjecture that the mean values of measurable quantities could be calculated using statistical mechanics for these ensembles [1]. in this scheme, each of the thermodynamic variables finds a counterpart, the volume taking the place of the energy, and the compactivity that of the temperature, amongst other transformations. however beautiful this description may seem, the computational challenges to generate ensemble samples in this context are extraordinaries [2–5]. 
one complication that prevents to a large extent the use of the machinery of statistical mechanics in this case is the fact that configurations, unlike in traditional liquid theories, have to be checked for the constraint of mechanical equilibrium. in a previous work [6], we have made a proposal on how to deal with this, at least in a first approximation, by describing the excitations of static granular systems under gravity in terms of its arches. since arches are sets of grains that stabilize each other, these are the basic units of mechanically stable structures in the packing. any static configuration can be described in terms of the arches formed by its grains, their arch shape, position, orientations, etc. we have considered, as an example, a model of a two-dimensional (2d) granular system composed of disks where arches are assumed to take a single possible structure and the arch–arch interactions (due to the interlocking of arches) is neglected. we calculated the exact entropy of this model (the noninteracting rigid arch model, nira) by constructing all possible configurations for moderate system sizes. of course, generating each state is a cumbersome task if the system size is increased or if the number of degrees of freedoms (dof) is increased by using more realistic models. therefore, an alternative approach based on sampling the phase space with the desired probability is necessary. in the present work, we will calculate the entropy of the nira model in the microcanonical ensemble [7] using entropic sampling through the wang– landau (wl) algorithm [8–11]. this approach allows us to obtain the entire entropy function for all possible volumes of the system in one single simulation for larger systems and potentially for more complex models. all derived properties, such as compactivity and volume fluctuations, can then be calculated through numerical differentiation. we pay particular attention to the different descriptions that can be realized for the nira model. some of these representations do not provide a direct way of sampling the configuration space uniformly. this work is organized as follows: in section ii., we will review the wl algorithm. in section iii., we will review the nira model and discuss different ways of representing it, along with the issues related to uniform sampling of the configurations. we then present a representation that allows very fast calculations of the entropy and we compare the results with the exact counting of all configurations for systems of moderate size. finally, we discuss future directions to refine the arch-based ensemble volume function towards capturing detailed features of more realistic systems. ii. wang-landau algorithm the wl method has revolutionized computational statistical mechanics [8–11]. wl is a pure statistical method that can retrieve the density of states (dos) (hence the entropy) over a bounded region of the energy spectrum from the sole knowledge of the energy function. the spectacular computational performance achieved by this method stems from the fact that it presents no limitations for the system to tunnel between potential barriers, in stark contrast with classical monte carlo methods that underperform when they encounter deep valleys in the energy landscape. wl finds the entropy of the system by means of a markov chain in the energy landscape which is conveniently biased towards the less probable energies in a strongly history dependent manner. 
this is achieved by using the multicanonical approach [12] in which each possible configuration is sampled with a probability given by the inverse of the density of states for its given energy. specifically, wl aims to obtain a flat histogram of visited energies e by forcing the system to go through all configurations with a probability which is inverse to the previous occurrence of that energy in the markov chain. the method is ergodic and asymptotically fulfils the detailed balance condition [13]. there exists extensive literature on the wl algorithm but here we only summarize its most relevant steps. in the following sections, we will refer to a configuration of the system as a fixed set of values of all its dof. in wl, one defines two histograms that are continuously updated as the markov chain proceeds. these histograms are the entropy s(e), which is the output of the algorithm, and a control histogram h(e). after initializing h(e) = 0, s(e) = 1 and a starting configuration with energy e0, the rules to update these histograms are:
i. propose a new configuration and calculate its energy e1. the new configuration is generally derived from the previous configuration by a change in the value of one of its dof.
ii. accept the new configuration according to a probability given by min[1, exp(s(e0) − s(e1))].
iii. update the two histograms in the correct energy bin e (e1 if the new configuration is accepted, e0 otherwise): h(e) = h(e) + 1 for the control histogram, and s(e) = s(e) + f for the entropy. here, f is a correction that controls the precision of the algorithm, which will be decreased (see next step), usually starting at f = 1.
iv. if the control histogram h(e) is flat enough according to some arbitrary criterion, decrease f (for instance by making f = f/2) and reset all entries of h(e) to zero.
v. if f > ε (with ε a prescribed tolerance), return to step i; otherwise stop.
after each reduction of the correction term f, the entropy histogram is built with a finer grained precision. however, to speed up the initial estimates of s(e), f is set to a high initial value. different approaches are followed to accelerate the final stages of refinement by decreasing f with alternative criteria [14]. in monte carlo approaches, the detailed balance condition ensures that the markov chain has a limiting distribution [15]. detailed balance can be stated as follows:

p_µ p(µ → ν) = p_ν p(ν → µ),   (1)

where p_µ is the probability distribution of configuration µ and p(µ → ν) the transition probability from configuration µ to configuration ν, which can be written as

p(µ → ν) = sp(µ → ν) ap(µ → ν),   (2)

with sp(µ → ν) the selection probability, which is the probability that the algorithm generates a trial configuration ν starting from configuration µ, and ap(µ → ν) the acceptance probability, i.e., the probability that the algorithm will accept the trial configuration ν. hence, since the target distribution in the entropic sampling is the inverse of the dos, i.e., p_µ ∝ g(e_µ)^(−1) = exp[−s(e_µ)], eqs. (1) and (2) imply

ap(µ → ν) / ap(ν → µ) = exp[s(e_µ) − s(e_ν)] sp(ν → µ) / sp(µ → ν).   (3)

in wl, detailed balance is not fulfilled in general. during the construction of the entropy, the acceptance probability given in step ii above [i.e., ap(µ → ν) = min[1, exp(s(e_µ) − s(e_ν))]] evolves, and it is only when the entropy gets sufficiently refined that detailed balance is met.
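as a concrete illustration of steps i–v, a minimal python sketch of the generic wl loop is given below (in the volume ensemble used later, the role of the energy is played by the volume). the functions energy and propose_move, the bin limits e_min and e_max and the periodic flatness check are placeholders and practical choices introduced here for illustration only; the default number of bins (300), the tolerance ε = 2⁻¹⁵ and the flatness criterion 0.2 are the values quoted further below in the text.

import numpy as np

def wang_landau(energy, propose_move, x0, e_min, e_max, n_bins=300,
                f0=1.0, eps=2.0**-15, flatness=0.2, check_every=10000, rng=None):
    """minimal wang-landau loop following steps i-v of the text.
    energy(x) and propose_move(x, rng) are user-supplied placeholders."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.ones(n_bins)                  # entropy histogram s(e), initialised to 1
    h = np.zeros(n_bins)                 # control histogram h(e), initialised to 0
    seen = np.zeros(n_bins, dtype=bool)  # bins visited at least once during the run

    def to_bin(e):                       # map the energy (here: the volume) onto a bin
        return min(int((e - e_min) / (e_max - e_min) * n_bins), n_bins - 1)

    x, b = x0, to_bin(energy(x0))
    f, steps = f0, 0
    while f > eps:                                            # step v
        x_new = propose_move(x, rng)                          # step i
        b_new = to_bin(energy(x_new))
        if rng.random() < np.exp(min(0.0, s[b] - s[b_new])):  # step ii: min[1, exp(s(e0)-s(e1))]
            x, b = x_new, b_new
        h[b] += 1.0                                           # step iii
        s[b] += f
        seen[b] = True
        steps += 1
        if steps % check_every == 0:                          # step iv: flatness check
            hs = h[seen]
            if hs.min() > 0 and (hs.max() - hs.min()) / hs.max() < flatness:
                f /= 2.0
                h[:] = 0.0
    return s

for the binary arch representation discussed in section iii. below, propose_move would simply flip one randomly chosen coordinate (rejecting trial moves that violate the cutoff c) and energy would return the volume of eq. (4); this is only an indicative pairing, not the authors' actual code.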
notice that the form chosen for ap requires that the selection probability be symmetric (i.e., sp (µ → ν) = sp (ν → µ)). although this condition is simple to comply with in models like ising, for arch-based descriptions of static granular packs this is non-trivial. therefore, one must be careful to represent a system in such a way that trial moves between different configurations have the same backward and forward selection probability. after reviewing the main characteristics of the arch-based ensemble in the next section, we will show that some natural representations of the microstates lead to non-symmetric selection probability schemes. we present, however, a way of representing the configurations that does allow for the direct use of wl. in all the wl simulations, we have used 300 bins for the histograms. the tolerance for the f correction was set to � = 2−15. the histogram h(e) is considered flat (see step iv) whenever (hmax −hmin)/hmax < 0.2, with hmax and hmin the maximum and minimum height of the histogram. 070001-3 papers in physics, vol. 7, art. 070001 (2015) / d. slobinsky et al. iii. arch-based microstates in ref. [6], we have introduced a way of describing the microstate of a static granular system under gravity by considering the arches that the grains form instead of the more traditional approach of using the particle positions. arches are defined as sets of mutually stable grains. all other particles being fixed, the removal of any of the grains in the arch would induce the destabilization of the rest of the set. any assembly of grains, static under gravity, can be split into a number of arches which are mutually exclusive [16, 17]. the major difficulty in sampling static granular configurations is the fact that these are sparse (with zero measure) in the overwhelming number of possible particle positions. moreover, there are not recipes to generate a static configuration from another static configuration by simply moving a grain from its position. since each arch is stable on its own right, the arch-based description warrants that any configuration proposed fulfils a basic stability constraint; i.e., that each set of grains identified as an arch has internal contacts that keep it stable. the problem is now moved to generating all possible combination of arches, including how many of them are of a given size, shape, orientation and position and also generating all arrangements of these that can be stable resting on each other. of course, all of these dof can be represented with different levels of approximation. in ref. [6], we have described the five general steps necessary to carry out an edwards entropy calculation (i.e., the number of states associated to each given volume of the packing) within an archbased scheme. these are: 1. define the microstate of the system in terms of arches. 2. define the external constraints imposed to arches. 3. define a volume function that yields the total volume of the microstate in terms of the arches. 4. define an algorithm to generate all microstates defined in step 1 that comply with the external constraints of step 2. or sample microstates with equal probabilities. 5. calculate the volume of each microstate generated in step 4 using the function in step 3 and build the dos. step 4 may constitute a significant limitation to the real possibility of calculating the density of states for systems of reasonable size. generating all configurations is certainly impractical for most models (especially if they have continuous dof). 
this paper demonstrates how to sample configurations by using wl (rather than generating all of them) to accomplish step 4. we will focus on a model we have already solved exactly, by counting every single possible configuration, so as to have a reference system to compare with the wl algorithm results. this is the “non-interacting rigid arch model” (nira). in this model, only the number of arches ni of each size i (in number of grains) that form part of the packing is used to describe the microstate. all arches consisting of the same number of grains are considered to occupy the same volume (hence one single possible shape is assumed for each arch size; this is implied in the word “rigid” used to name the model). the total volume of the system is assumed to be the sum of the volumes of the individual arches, and the arch–arch interlocking dof are not taken into account (hence the word “non-interacting”). to represent a two-dimensional pack of equal-sized disks, we have taken the volume vi of an arch of i grains to be the area under the regular polygon that inscribes all disks in a “regular” arch. the special cases of “arches” of one particle and two-particle arches are considered separately [6]. the total volume v[{ni}] of the system is

v[{ni}] = Σ_{i=1}^{n} vi ni,   (4)

where v1 = √3 d²/2, v2 = 2.1 v1, and vi = (i d²/4) tan(π/2i) [1 + (tan(π/2i))^(−1)]² for i > 2, with d the diameter of the disks and n the total number of disks in the system. it is important to mention that, typically, the maximum size that an arch can take is physically bounded (e.g., due to the size of the container that holds the granular sample). hence, we will put a cutoff c on the largest arch allowed in the system.

figure 1: (a) entropy as a function of volume for the nira model in 2d calculated by counting all possible configurations for 500 disks [6] using different arch size cutoffs. (b) the corresponding compactivity calculated by numerical differentiation.

the cutoff c is an external constraint (that is imposed in step 2, above). importantly, this cutoff imposes a limit to the correlations in the system, which leads to an extensive entropy (see ref. [6]). in counting microstates, one has to bear in mind that arches of the same size are indistinguishable in the nira model, whereas arches of a different size can be distinguished. hence, if there are ni arches of i grains in a configuration (with i = 1, ..., c, c being the size cutoff), then the number of permutations of arches that yield distinguishable microstates with these arches is

na!/(n1! ... nc!),   (5)

where na = Σi ni is the total number of arches of the configuration (including those “arches” of size 1). despite all the simplifications, the model applied to a 2d system of equal-sized disks yields qualitative agreement with dem simulations of tapped disks [6]. the nira model is in many respects similar to an ideal gas of excitations or quasi-particles (the arches) with a single dof (their size). figure 1(a) shows the entropy s calculated by counting all possible microstates for 500 disks using different cutoffs c for the largest arch allowed [6]. as expected, if c increases, looser configurations are possible and hence states with higher volumes become more significant.
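the volume function of eq. (4) is simple enough to state in a few lines of python; the sketch below follows the reconstruction given above (in particular, it reads the arch size i as the argument of the polygon formula), and the dictionary format used for a configuration is only an illustration.

import math

def arch_volume(i, d=1.0):
    """volume v_i of a 'regular' rigid arch of i disks of diameter d, eq. (4)."""
    v1 = math.sqrt(3) * d**2 / 2
    if i == 1:
        return v1
    if i == 2:
        return 2.1 * v1
    t = math.tan(math.pi / (2 * i))
    return i * d**2 / 4 * t * (1 + 1 / t)**2   # area of the circumscribing regular polygon

def total_volume(arch_counts, d=1.0):
    """total volume v[{n_i}] for a configuration given as {arch size: number of arches}."""
    return sum(n_i * arch_volume(i, d) for i, n_i in arch_counts.items())

# example: the configuration (3, 1, 0, 0, 1, 0) discussed below, i.e., three single
# grains, one two-grain arch and one five-grain arch (10 grains in total)
print(total_volume({1: 3, 2: 1, 5: 1}))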
however, the entropy for low volumes rapidly converges to a well defined curve. part (b) of fig. 1 shows the compactivity χ defined as χ−1 = ∂s/∂v . this is the analogue to the temperature in thermal systems [1]. the entropy presents a maximum as observed by others [2,3]. states for volumes beyond this maximum correspond to negative compactivities [see fig. 1(b)]. this is caused by the inversion population of these volume bounded systems. some authors suggest these negative χ macrostates may be inaccessible, however, this does not need to be the case; some preparation protocols may indeed lead to very low packing fractions [3, 18]. an interesting prediction of the nira model is that systems constrained by different c achieve the same χ at different specific volume v/n (i.e., at different packing fractions). as a consequence, two samples of grains “equilibrated” to the same compactivity will show distinct packing fractions if the maximum arch size possible in each sample is different. different values of c in practice may be achieved by using narrow containers or by changing the static friction coefficient of the grains. there have been some progress in the study of the equilibration of vibrated granular samples in “contact” [19]. however, there are still no attempts to couple static granular packs under gravity. further developments in this direction may help validating this prediction of the nira model. the configurations of the nira model are compatible with different representations. in the following subsections we will discuss some of these representations and their suitability for the implementation of the wl algorithm. i. the arch size distribution representation in our previous paper [6], we have used a vector {ni} that represents the number of arches consisting of i grains in the configuration, i.e., {ni} = (n1,n2, ...,nc), with c the largest arch allowed in the system and n1 the number of grains not forming arches. as an example, a possible configuration in a system of n = 10 grains and a cutoff arch length 070001-5 papers in physics, vol. 7, art. 070001 (2015) / d. slobinsky et al. of c = 6 represented in this way could be: (3, 1, 0, 0, 1, 0). (6) in this example, there are three grains not forming arches, two grains forming an arch of size two, and five grains forming another arch of size 5. in ref. [6], we have swept all possible configurations and multiplied each by its analytical degeneration due to the different permutations of arches with repetitions given by eq. (5). it is difficult to propose an algorithm to move between configurations represented in this way and yet comply with the symmetry of the selection probability required by the wl algorithm. for example, consider a move that consists in removing a grain from one arch of size k and adding it to another arch of size k′. such move would require subtracting 1 from the coordinate k of {ni} and adding 1 to the coordinate k − 1, since this arch will now be smaller by one grain. additionally, the coordinate k′ of {ni} needs to be reduced in 1 and the coordinate k′ + 1 increased in 1, since this arch will now be part of the set of arches larger by one grain. there are different ways of selecting a grain to be moved and to select its arch of destination. let us consider, for instance, that all grains and destination arches are chosen with same probability. now consider a move in the markov chain that takes configuration (6) [i.e., µ = (3, 1, 0, 0, 1, 0)] into configuration ν = (2, 1, 0, 0, 0, 1). 
this corresponds to taking one grain that was not forming an arch and inserting it into the five-particle arch to make it a six-particle arch. the probability of selecting a particle from an “arch of size one” in this case is 3/10. the probability of choosing the arch of size five as the destination is 1/5 (there are four other arches in configuration µ plus the possibility of leaving the grain on its own without forming an arch with others). hence, the selection probability is sp(µ → ν) = 3/50. a similar analysis shows that to return to the original configuration sp(ν → µ) = 6/10 × 1/4 = 3/20 (there are 6 grains out of ten that can be taken from the six-grain arch, and there are three possible other destination arches plus the case with the grain not forming an arch in the new configuration). clearly, these selection probabilities are not symmetric, whereas they should be in order to apply the algorithm of section ii. a possible workaround to the previous representation is to multiply each coordinate of the vector {ni} by the corresponding arch size. therefore, each coordinate now indicates how many grains are involved in all arches of the given size. in this new representation, the configuration of eq. (6) (10 particles with a cutoff c = 6) is written as

{n′i} = (3, 2, 0, 0, 5, 0).   (7)

in this case, a trial move may consist in randomly transferring a grain from one coordinate k to another coordinate k′. this is done simply by subtracting 1 from n′k and adding 1 to n′k′. in this representation, the selection probability of moving a particle from one arch of size k to another of size k′ is sp(µ → ν) = 1/n × 1/(c − 1), irrespective of k and k′. hence, the selection probability is symmetric and suitable to implement the entropic sampling using wl. unfortunately, the representation of eq. (7) does not tell apart two microstates that differ only in the permutation of two arches of different sizes. therefore, the corresponding degeneracy given by eq. (5) cannot be accounted for.¹ more importantly, the change of representations from eq. (6) to eq. (7) is not an exact mapping, because these newly proposed moves allow for “fractions of arches” to exist, since we do not request that the coordinates of {n′i} be a multiple of i after each move. for instance, the configuration (0, 0, 0, 0, 0, 10) is allowed in representation (7) but does not exist in representation (6). in the new representation, such a configuration implies that there is one arch of size six plus a fraction of an arch of size six. we can still treat these “fractions of arches” by assigning to them a fraction of the arch volume. however, there is a large number of new unrealistic configurations (there are many ways of choosing numbers that are not commensurate with the arch size) in this representation that will bias the result for the entropy. figure 2 shows the entropy for the nira model using representation (7) for different maximum arch sizes c compared with the exact result obtained by counting all configurations and all permutations [6]. as we can see, not including all distinguishable permutations and including the new “fractional arches” gives a wrong entropy function.

¹ one is tempted to add the degeneracy factor (5) to correct the entropy in step iii of the algorithm. however, the algorithm compensates this factor in order to obtain a flat histogram. therefore, this is not a viable solution.
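the asymmetry worked out above is easy to check numerically. the sketch below reproduces the two selection probabilities (3/50 and 3/20) for moves proposed on the {ni} representation, with the grain and the destination chosen uniformly; the function name and the tuple format for configurations are illustrative only.

from fractions import Fraction

def selection_prob(config, source_size):
    """probability of (i) picking a grain belonging to an arch of size source_size and
    (ii) choosing one particular destination, when grains and destinations (the other
    arches plus 'no arch') are chosen uniformly. config[i-1] is the number of arches
    of size i, as in eq. (6)."""
    n_grains = sum(i * n_i for i, n_i in enumerate(config, start=1))
    n_arches = sum(config)                      # includes the "arches" of size 1
    pick_grain = Fraction(source_size * config[source_size - 1], n_grains)
    pick_destination = Fraction(1, n_arches)    # (n_arches - 1) other arches + leave alone
    return pick_grain * pick_destination

mu = (3, 1, 0, 0, 1, 0)   # eq. (6)
nu = (2, 1, 0, 0, 0, 1)
print(selection_prob(mu, source_size=1))   # 3/50: free grain moved into the five-grain arch
print(selection_prob(nu, source_size=6))   # 3/20: the reverse move -> not symmetric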
figure 2: entropy as a function of volume for the nira model calculated using the wl algorithm (symbols) for representation (7) for 4 ≤ c ≤ 8. the full lines represent the exact results for 200 grains.

ii. the arch listing representation

as previously discussed, other ways of representing the system should be carefully chosen in order to ensure that there exist moves that sample the different configurations uniformly. an alternative natural representation consists of using a vector, {mi}, with n coordinates, where each coordinate mi can take any value from 0 to c, the cutoff for the arch size, provided that Σi mi = n. the content of each non-zero coordinate indicates that the configuration contains an arch of that size. the configuration of eq. (6) in this representation can be expressed, for example, as

(5, 0, 0, 1, 0, 1, 2, 0, 1, 0).   (8)

there are, of course, multiple permutations in eq. (8) that lead to the same distribution of arch sizes compatible with eq. (6). trial moves in the system represented in this way can be done by subtracting one from a non-zero mi and adding one to any other coordinate that has a value smaller than c. this is equivalent to reducing the size of one arch by one particle and either creating a new arch of size one (if the new coordinate had a zero value) or increasing the size of another arch. unfortunately, this algorithm has a non-symmetric selection probability, sp = 1/na × 1/(n − nc), where na is the number of arches (i.e., the number of non-zero coordinates in the configuration) and nc is the number of arches with the maximum allowed size c. therefore, sp will depend on the total number of arches and the number of arches of size c.

figure 3: entropy as a function of volume for the nira model with cutoff 4 ≤ c ≤ 8 (symbols as in fig. 2) using the binary arch representation to carry out the entropic sampling through the wl algorithm for 200 grains. the solid lines correspond to the exact counting of microstates from ref. [6].

besides the non-symmetric sp, the system represented in this way and sampled with these trial moves clearly overestimates the number of states of a given volume. this is because, apart from the na!/(n1! ... nc!) permutations of the distinguishable arches [see eq. (5)], there are n!/(n − na)! additional permutations due to the zeros in a given vector in eq. (8) (there are n − na zeros). one can, in principle, resolve these issues. the selection probability may be turned into a symmetric one by adapting the acceptance probability in eq. (3). the degeneracy due to the presence of zeros in the representation (8) can be handled by switching to a representation without including these, and allowing for a vector of variable length. however, although less intuitive, there is a much simpler, suitable representation that we discuss in the next section.

iii. the binary arch representation

finally, we present a representation that complies with the symmetric selection probability and simultaneously accounts for the permutation of distinguishable arches. in this case, the system is chosen to be represented by a vector of n coordinates with binary values (zeros and ones).
these coordinates are not associated with specific particles 1, 2, 3, etc. rather, as we move along the vector from left to right, we can think of each 1 as representing the first grain of an arch (whatever its identity) and the following zeroes as the remaining particles of that arch. for instance, the configuration of eq. (6) in this new representation can be given by

(1, 0, 0, 0, 0 | 1 | 1 | 1, 0 | 1),   (9)

where the vertical bars (not part of the vector) mark the sections corresponding to arches of sizes 5, 1, 1, 2 and 1, respectively. in this representation, all permutations of arches of different sizes are accounted for naturally. the sections of the vector representing an arch [marked in eq. (9)] can be permuted to yield all distinguishable configurations [see eq. (5)], which correspond to different vectors in this binary representation. indistinguishable configurations corresponding to permutations of arches of the same size are also indistinguishable for the binary vector. the number na of arches in a configuration is simply the sum of all the elements of the vector. note that the sizes of the sections marked in eq. (9) coincide in value and order with the non-zero entries of the vector in eq. (8). the trial moves consist in picking a coordinate and changing the state of that coordinate (if 1, change it to 0 and vice versa). this results in a symmetric selection probability of sp(µ → ν) = sp(ν → µ) = 1/n. in each move, the constraint imposed by the cutoff c must be checked, and the trial configuration must be rejected whenever the constraint is not complied with. in fig. 3, the result for this representation and sampling strategy is plotted along with the exact result, showing a remarkable agreement.

iv. conclusions

we have been able to compute the entropy of a system of non-interacting rigid arches using a wl algorithm in the volume ensemble in different representations. we have exposed the difficulties in dealing with different representations of the configurations of arches and the mechanisms used to propose trial moves for the wl algorithm. these difficulties appear during the choice of a simple sampling scheme that ensures a symmetric selection probability of configurations. additionally, the degeneracy due to distinguishable permutations of arches poses a further complication in the use of wl. the most suitable representation that we found for a non-interacting system of rigid arches resulted in a binary vector. we believe that entropic sampling of arches through the wl algorithm has a great potential for testing the granular statistical mechanics hypotheses (such as equiprobability and ergodicity). having a sampling algorithm like wl adapted for these types of models is crucial to continue the road map towards the refinement of an arch-based framework for static granular packs. in particular, the non-interacting condition is clearly a crude approximation and should be lifted, along with the introduction of a more accurate volume function. [1] s edwards, r oakeshott, theory of powders, physica a 157, 1080 (1989). [2] s mcnamara, p richard, s k de richter, g le caër, r delannay, measurement of granular entropy, phys. rev. e 80, 031301 (2009). [3] m p ciamarra, a coniglio, random very loose packings, phys. rev. lett. 101, 128001 (2008). [4] d asenjo, f paillusson, d frenkel, numerical calculation of granular entropy, phys. rev. lett. 112, 098002 (2014). [5] c s o'hern, l e silbert, a j liu, s r nagel, jamming at zero temperature and zero applied stress: the epitome of disorder, phys. rev. e 68, 011306 (2003).
[6] d slobinsky, l a pugnaloni, arch-based configurations in the volume ensemble of static granular systems, j. stat. mech. p02005 (2015). [7] j lee, new monte carlo algorithm: entropic sampling, phys. rev. lett. 71, 211 (1993). [8] f wang, d p landau, efficient, multiplerange random walk algorithm to calculate the density of states, phys. rev. lett. 86, 2050 (2001). 070001-8 papers in physics, vol. 7, art. 070001 (2015) / d. slobinsky et al. [9] f wang, d p landau, determining the density of states for classical statistical models: a random walk algorithm to produce a flat histogram, phys. rev. e 64, 056101 (2001). [10] c zhou, r n bhatt, understanding and improving the wang-landau algorithm, phys. rev. e 72, 025701 (2005). [11] s trebst, d a huse, m troyer, optimizing the ensemble for equilibration in broad-histogram monte carlo simulations, phys. rev. e 70, 046701 (2004). [12] b a berg, t neuhaus, multicanonical ensemble: a new approach to simulate firstorder phase transitions, phys. rev. lett. 68, 9 (1992). [13] d p landau, k binder, a guide to monte carlo simulations in statistical physics, 2nd ed., cambridge university press (2005). [14] r e belardinelli, v d pereyra, fast algorithm to calculate density of states, phys. rev. e 75, 046701 (2007). [15] m e newman, g t barkema, monte carlo methods in statistical physics, vol. 13, pp. 3642, clarendon press oxford (1999). [16] l a pugnaloni, g barker, a mehta, multiparticle structures in non-sequentially reorganized hard sphere deposits, adv. complex syst. 4, 289 (2001). [17] r arévalo, d maza, l a pugnaloni, identification of arches in two-dimensional granular packings, phys. rev. e 74, 021303 (2006). [18] l a pugnaloni, m mizrahi, c m carlevaro, f vericat, nonmonotonic reversible branch in four model granular beds subjected to vertical vibration, phys. rev. e 78, 051305 (2008). [19] j g puckett, k e daniels, equilibrating temperaturelike variables in jammed granular subsystems, phys. rev. lett. 110, 058001 (2013). 070001-9 papers in physics, vol. 2, art. 020010 (2010) received: 22 april 2010, accepted: 2 december 2010 edited by: v. lakshminarayanan reviewed by: s. roy, dayalbagh educational institute, agra, india. licence: creative commons attribution 3.0 doi: 10.4279/pip.020010 www.papersinphysics.org issn 1852-4249 experimental determination of distance and orientation of metallic nanodimers by polarization dependent plasmon coupling h. e. grecco,1, 2∗ o. e. mart́ınez1† live cell imaging using metallic nanoparticles as tags is an emerging technique to visualize long and highly dynamic processes due to the lack of photobleaching and high photon rate. however, the lack of excited states as compared to fluorescent dyes prevents the use of resonance energy transfer and recently developed super resolution methods to measure distances between objects closer than the diffraction limit. in this work, we experimentally demonstrate a technique to determine subdiffraction distances based on the near field coupling of metallic nanoparticles. due to the symmetry breaking in the scattering cross section, not only distances but also relative orientations can be measured. single gold nanoparticles were prepared on glass, statistically yielding a small fraction of dimers. the sample was sequentially illuminated with two wavelengths to separate background from nanoparticle scattering based on their spectral properties. 
a novel total internal reflection illumination scheme in which the polarization can be rotated was used to further minimize background contributions. in this way, radii, distance and orientation were measured for each individual dimer, and their statistical distributions were found to be in agreement with the expected ones. we envision that this technique will allow fast and long term tracking of relative distance and orientation in biological processes. i. introduction microscopy is an example of the ongoing symbiotic relationship between physics and biology: as early microscopes allowed fundamental discoveries like microorganisms or dna; the need to see smaller, faster and deeper has pushed the development of a plethora of optical concepts and microscopy techniques. today, fluorescence microscopy is an es∗e-mail: hgrecco@df.uba.ar †e-mail: oem@df.uba.ar 1 laboratorio de electrónica cuántica, universidad de buenos aires. buenos aires, argentina. 2 current address: department of systemic cell biology max planck institute of molecular physiology. dortmund, germany sential tool in biology as it can visualize the spatiotemporal dynamics of intracellular processes. however, many important mechanisms, like protein interaction, clustering or conformational changes, occur at length scales smaller than the resolution limit of conventional microscopy and therefore cannot be assessed by standard imaging. unraveling the dynamics of such interand intramolecular mechanisms that provides function richness to molecules and molecular complexes is essential to understand key biological processes such as cellular signal propagation. subdiffraction distances have been determined by exploiting quantum and near field properties of the interaction between light and matter in the nanometer scale. for example, fluorescence/förster resonance energy transfer (fret) 020010-1 papers in physics, vol. 2, art. 020010 (2010) / h. e. grecco et al. [5–7, 14] has proven to be a valuable technique as it provides an optical signal directly related to the proximity of the molecules. the desire to extend this technique to other biological systems with different time and length scales has been hindered by the inherent limitations of fluorescent dyes (i.e. lack of photostability, low brightness and short range of interaction). super resolution techniques such as sted [23] or palm [2] have recently gained momentum to directly observe fluorescent molecules spaced closer than the diffraction limit. although much work has been done to increase the total acquisition time and frame rate, these methods are still limited by the lack of photostability and the need to image a single resolvable structure per diffraction limited spot at a time. in the past, it has been shown that scattering microscopy using metallic nanoparticles can complement its fluorescence sibling as it uses an everlasting tag with no rate-limited amount of photons [10]. metallic nanoparticles are stable, biocompatible and easy to synthesize and conjugate to biological targets and thereby ideal as contrast agents. a landmark example of the biological application of such techniques was the direct observation of receptors hopping across previously unknown membrane domains. this provided valuable insight into the spatial regulation of signaling complexes and closed a 30-year controversy about the diffusion coefficients of membrane proteins [21]. 
while previous experiments using fluorescent tags yielded a diffusion coefficient in biological membranes much slower than the one observed in synthetic membranes, the fast acquisition speed (40 × 10³ frames per second) enabled by scattering microscopy showed that this is the result of a fast diffusion and a slow hopping rate between domains [9]. in addition, the presence of a plasmon, i.e., a collective oscillation of the free electrons within the nanoparticle, converts metallic nanoparticles into very effective scatterers when illuminated at their resonance optical frequency. the resulting strong electromagnetic enhancement in the vicinity of the particle provides a near field effect that can be used to sense information about their surroundings, such as the effective index of refraction or the presence of other scatterers [8, 16]. for example, it has been experimentally shown that the shift in the plasmon resonance can be used to determine the length of dna molecules attached to a metallic nanoparticle [13]. moreover, the coupling between two nanoparticles in close proximity produces an alteration of the plasmon spectra. this alteration has been used as a nanometric ruler to determine the distance between them [19, 20]. in a previous work [4], we theoretically showed that the coupling between two nanoparticles is highly sensitive to the polarization of the external field. the scattering cross section (csca) is maximum when the incident polarization is parallel to the dimer orientation, due to the reinforcement of the external field by the induced dipoles [fig. 1(a)]. as the coupling decreases monotonically with the distance between nanoparticles, so does the average csca over all polarizations (vm) [fig. 1(b)] and the anisotropy [fig. 1(c)], defined as

η = (c_sca^∥ − c_sca^⊥) / (c_sca^∥ + c_sca^⊥).   (1)

we have proposed, in our previous work, that by measuring the scattering cross section as a function of the incident polarization angle, the axis of the dimer and the distance between nanoparticles could be determined. in this work, we provide experimental evidence supporting this concept by measuring gold nanodimers on a glass surface, and we introduce a novel total internal reflection experimental setup that provides polarized illumination with a high na objective.

ii. materials and methods

i. sample preparation

coverslips were cleaned by sonication at 50◦ for 20 minutes in milli-q water, and then sequentially immersed for 5 seconds in hfl 5%, sodium bicarbonate and acetone (analytical grade). after the cleaning process, coverslips were dried and stored in a chamber overpressurized with nitrogen until further use. before sample preparation, a parafilm chamber was assembled on top of the coverslip. to create a hydrophilic surface, bovine serum albumin (bsa) in phosphate buffer solution (pbs) was incubated for 15 min and then rinsed with pbs. fluorescein-streptavidin in pbs (50 mg/ml) was then incubated for 30 min and rinsed with pbs, to obtain an adsorbed layer that was verified using confocal fluorescence microscopy.

figure 1: conceptual idea of the technique. (a) two metallic nanoparticles of radii a located at a distance d are illuminated with a linearly polarized electromagnetic field. (b) theoretical results as a function of the interparticle distance for the average csca over all polarizations (vm) and (c) the anisotropy. parameters of the calculation: a = 20 nm, wavelength of the light: 532 nm.
finally, a solution of biotinylated gold nanoparticles, nominal radius (20 ± 5) nm (gb-01-40. ey laboratories, usa), was incubated for 15 min and then rinsed by washing 5 times with pbs. the concentration and incubation time where empirically chosen to provide a concentration about 1 nanoparticle/10 µm2. as the particles are randomly distributed, it is expected to find many monomers, some dimers, and very few trimers and higher n-mers. a negative control sample was prepared in the same way but omitting the incubation of gold nanoparticles. ii. dual color scheme spurious reflections and scattering centers other than gold will produce unwanted bright spots in the images. even thresholding the image taken at the resonance peak (532 nm) will result in many false positive regions. the presence of a plasmon resonance in the scattering spectrum of gold nanoparticles was used as a signature to distinguish them. the ratio between the scattering cross section at 532 nm and 473 nm was found to be larger than 1.4 for gold monomers [fig. 2(a), solid thin line] using mie theory [3] and even larger for dimers (dashed line) as calculated using gmmie, a multiparticle extension of the mie theory [11, 12]. in contrast, non metallic scattering centers lack of a plasmon resonance and therefore yield a smaller ratio between 532 nm and 473 nm csca (solid thick line). therefore, by imaging at these two wavelengths and thresholding the ratio image above 1.4, the pixels containing gold nanoparticles were further segmented. iii. polarization control in total internal reflection we used total internal reflection (tir) microscopy [1] to restrict the illumination to the surface of the coverslip using an evanescent wave. in objective-based tir, the beam is focused off-axis in the back focal plane (bfp) of the objective to achieve critical illumination. the components of evanescent field are defined by the angle of incidence (θ) and the incident polarization with respect to the plane of incidence. indeed, rotating the excitation polarization before entering the microscope does not produce a constant intensity in the sample plane as the transmission efficiency for the parallel polarization [fig. 2(b), left] will be much smaller than for the perpendicular one [fig. 2(b), center]. a circularly polarized beam before the objective results in an “elliptically”1 polarized field which has the minor axis in the plane of incidence [fig. 2(b), right]. the ratio between the major and minor axis of this evanescent “elliptical” beam depends on θ and if the plane of incidence is changed, the ellipse will rotate with it. we therefore modified 1the electromagnetic field in the sample cannot be said to be strictly elliptically polarized as an evanescent field (not a propagating beam) is generated after the interface. nevertheless, an elliptical rotation of the electric field is achieved. 020010-3 papers in physics, vol. 2, art. 020010 (2010) / h. e. grecco et al. figure 2: experimental setup. (a) comparison of spectra averaging over all polarizations. while the spectrum of dielectric particles (thick solid line) decreases monotonically, the spectra of metallic monomers and dimers (thin lines) show a plasmon resonance. (b) the polarization of the refracted wave is dependent on the direction of the incident polarization with respect to the displacement in the back focal plane (bfp) of the objective which defines the plane of incidence. (c) the sample is illuminated in total internal reflection and imaged using a cooled ccd. 
laser light is tightly focused off-center in the back focal plane of the objective, and the angle β is varied using a pair of galvanometer scanners moving in orthogonal directions. a wide-field inverted microscope (ix71, olympus, japan) was used with a tirf objective (olympus tirfm 63x/1.45 planapo oil) to allow rotating the plane of incidence [fig. 2(c)] by changing the position in which the beam is focused in the bfp. two lasers were used: one near the gold particle plasmon resonance (532 nm, compass c315m, coherent inc., usa) and another shifted towards shorter wavelengths (473 nm, va-i-n-473, viasho technology, china). the power of the lasers after the objective was 13 µw. circularly polarized light was achieved at the bfp by inserting a quarter and a half wave plate in the beam path, adjusted to precompensate for the polarization dependent transmission of the beam splitter, filters and mirrors. the beam was expanded and filtered to achieve a diffraction limited spot in the bfp. in order to displace the beam in the bfp and therefore change the plane of incidence, a pair of computer controlled galvanometer scanners (sc2000 controller, minisax amplifier and m2 galvanometer, gsi group, usa) were used. the polarization distortion due to the change in the angle of incidence onto the mirrors of the scanners (while moving) was verified to be negligible. images of the sample were acquired using a cooled monochrome ccd camera (alta u32, apogee instruments, usa; 2148 × 1472 pixels, each 6.8 × 6.8 µm²) through a dichroic filter for the fluorescence sample (xf2009 550dclp, chroma technology, usa) or a 30/70 beam splitter for the gold nanoparticles (21009, chroma technology, usa).

iv. intrinsic anisotropy determination

to assure a constant ratio between the two polarizations of the beam and a uniform intensity, the beam needed to be moved on the bfp in a circle centered on the optical axis. failure to do this would have reduced the dynamic range of the system by introducing an intrinsic anisotropy. to minimize this value, the path of the beam was iteratively modified while measuring the anisotropy (see below) of a diluted solution of rhodamine 101. the emission of such a sample is independent of the excitation polarization, and thus the measured anisotropy can be assigned only to the system. after optimization, the obtained anisotropy for the 532 and 473 channels was 0.06 and 0.05 in a region of 50 × 50 µm² (440 × 440 pixel²). it is worth noting that these values are five times smaller than the expected anisotropy for a 20 nm homodimer.

v. image acquisition and processing

the acquisition process consisted in sequentially imaging at 473 nm and 532 nm while changing the angle β in 20 discrete steps over 2π to sample different polarizations. an image with both lasers off was also acquired to account for ambient light and dark counts of the camera. each image was background corrected and normalized by the excitation power and detection efficiency at the corresponding wavelength. mean images (m473 and m532) were obtained by averaging over all polarizations and, from these, the ratio image m = m532/m473 was calculated. the scattering image at the resonance peak (m532) was segmented by otsu's thresholding and masked with the ratio image thresholded above 1.4 to detect pixels containing gold. a connected region analysis was performed to keep only those regions with area between 3 and 15 pixels.
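a schematic python version of the processing chain just described is sketched below; the array layout, the dark-frame correction step and the use of scipy/scikit-image routines are illustrative assumptions, and only the numerical criteria (otsu threshold on m532, ratio above 1.4, regions of 3 to 15 pixels) come from the text.

import numpy as np
from scipy import ndimage
from skimage.filters import threshold_otsu

def segment_gold(stack_473, stack_532, dark, ratio_min=1.4, area_min=3, area_max=15):
    """stack_*: arrays of shape (n_polarizations, ny, nx), assumed already normalized
    by excitation power and detection efficiency; dark: lasers-off frame."""
    # background correction and average over all polarization angles
    m473 = (stack_473 - dark).mean(axis=0)
    m532 = (stack_532 - dark).mean(axis=0)
    ratio = m532 / m473                                  # ratio image m = m532/m473
    # otsu threshold on the resonance image, masked by the ratio criterion
    mask = (m532 > threshold_otsu(m532)) & (ratio > ratio_min)
    # connected-region analysis: keep only regions of 3 to 15 pixels
    labels, n_regions = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, index=np.arange(1, n_regions + 1))
    good = np.flatnonzero((sizes >= area_min) & (sizes <= area_max)) + 1
    return np.isin(labels, good), m473, m532, ratio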
the upper bound was chosen to be slightly bigger than the airy diffraction limited spot for the system, but still much smaller than the mean distance between gold nanoparticles. regions containing single gold nanoparticles should have a constant intensity over the stack of frames acquired for different polarization orientations, while dimers should provide an oscillating signal with period π. therefore, a fourier analysis [eq. (2)] was performed. for each pixel, the following coefficients were calculated:

c̃2 = [2 / Σβ i(β)] Σβ i(β) cos(2β),   (2a)
s̃2 = [2 / Σβ i(β)] Σβ i(β) sin(2β),   (2b)
η = √(c̃2² + s̃2²),   (2c)
tan(δ) = s̃2 / c̃2.   (2d)

this was done for each wavelength, obtaining values for the anisotropy (η473 and η532) and orientation (δ473 and δ532) in each pixel. the acquisition and analysis process were repeated for 30 and 10 fields of view of the sample and negative control sample, respectively. retrieval uncertainty was estimated by performing the same numerical analysis on simulated data, calculated by adding two terms to the theoretical response for different dimers. the first, an oscillating term with 2π periodicity, emulated a small misalignment that produced a non-constant illumination while rotating the beam in the bfp. the amplitude of this term was obtained from the rhodamine calibration. the second term simulated coherent background, and its values for each pixel were drawn from a gaussian distribution obtained from the control images.

figure 3: experimental results. (a) representative images of a field of view for single wavelength (left) and ratio (right) imaging. some points (circles) are bright in both images (gold), while others (squares) are faint in the ratio image (not gold). gold containing regions were segmented by finding bright pixels in both images. image size: 50 × 50 µm², 440 × 440 pixels. (b) anisotropy vs. mean value. a strong correlation is observed between anisotropy and mean value for both wavelengths, as expected. the 473 nm data shows a plateau due to the intrinsic anisotropy of the system.

iii. results and discussion

the need for a two color approach is evident when comparing single wavelength with ratio images: while regions with and without nanoparticles [fig. 3(a), circles and squares respectively] were bright due to the high background in the single channel image, only gold nanoparticles were above the threshold in the ratio image. importantly, in the negative control stack, no pixel was found above the a priori defined threshold. for the 35 regions identified as gold monomers/dimers, a strong correlation between anisotropy and mean value was found, as expected [fig. 3(b)]. the variance over each region for all values was below the corresponding retrieval uncertainty. a plateau was observed for the 473 nm channel due to the intrinsic anisotropy of the system. in this set of candidates for dimers, eight points presented an unexpectedly high individual anisotropy and hence were rejected. although the exact origin of these eight scattering centers could not be established, it is worthwhile noting that it is extremely relevant in a tracking experiment to avoid false positives that would severely distort the retrieved information, and this ability to reject scattering centers based on their response is an additional advantage of the technique. the recovered scattering parameters were compared with the expected results for a homodimer configuration (fig.
4), obtaining a good correspondence with the nominal size of the nanoparticles used (20 nm). indeed, the mesh shown in fig. 4 was calculated using only the photophysical and geometrical properties of the dimer (no fitted parameters). the ability of the technique to blindly recover the correct size of the particles was a crosscheck for its reliability.

figure 4: comparison between experimental data and the homodimer model. the mean value ratio is plotted against the anisotropy ratio for the regions segmented from the images (blue dots). theoretical calculations for different homodimers are also plotted. the vertical lines show the results keeping constant the surface to surface distance (dss) while changing the radii (a). the opposite is shown in the horizontal lines. remarkably, the experimental data distributed close to the curve for 20 nm homodimers (solid line), as expected, since this is in fact the mean radius of the particles used. notice that this is not a fit (no free parameters), but the predictions from the homodimer model superimposed on the experimental data.

the actual configuration of each dimer was obtained by fitting the theoretical model to the experimental values. the in-plane orientation was directly obtained as a weighted average of these values. to fit the radii of each particle and the distance between them, the values of η473, η532 and m were used. the values were first fitted using the analytical solution of a homodimer configuration in the dipole-dipole approach, in which the induced dipole moment p of each particle in an incident field e_inc can be expressed as

p∥ = [1 / (1 − α/(2πd³))] εm α e_inc,   (3a)
p⊥ = [1 / (2 + α/(2πd³))] εm α e_inc,   (3b)

εm being the dielectric constant of the medium and α the polarizability of the sphere, which is proportional to the cube of its radius [3, 4]. this homodimer configuration was used as an initial value in the time consuming iterative process of finding a heterodimer configuration compatible with the experimental data using the gmmie calculation. the traveling wave approximation of the evanescent field was used, as the particles are small compared to the decay length of the field [18, 22] and the collection efficiency is much smaller for the dipole induced in the optical axis orientation than for the one perpendicular to it. in this way, the two radii, orientation and distance for each dimer were obtained (fig. 5). the distribution of radii was found to be centered at 20 nm, compatible with the nominal size of the particles. for the interparticle distances, the distribution showed an increase as expected, but then a decrease for distances at which the anisotropy is close to the intrinsic anisotropy of the system. this mismatch at larger distances is due to the conservative criterion to separate dimers from monomers, which fails to identify correctly nanoparticles that couple weakly. the dimer orientation was uniformly distributed between −π and π, as expected. to further test this, we compared the experimental and simulated distributions using a kolmogorov–smirnov [15] statistical test. the level of significance was set at the usual value of 5%. the expected distributions (fig. 5, right column) were obtained from the nominal radius of the nanoparticles and a monte carlo simulation of the adsorption process. the experimental and simulated distributions for radii and orientation were found in close agreement.
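the statistical comparison just described can be written in a few lines of python; in the sketch below the two arrays are synthetic placeholders standing in for the fitted radii and for the monte carlo distribution of the adsorption process (they are not the paper's data), and only the two-sample kolmogorov–smirnov test at the 5% level follows the text.

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# placeholders: 'experimental' radii for the retained dimers and a large 'simulated'
# sample; the real inputs come from the fits and from the monte carlo model
radii_exp = rng.normal(20.0, 5.0, size=27)
radii_sim = rng.normal(20.0, 5.0, size=5000)

statistic, p_value = ks_2samp(radii_exp, radii_sim)
compatible = p_value > 0.05        # 5% significance level, as in the text
print(statistic, p_value, compatible)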
for the interparticle distance, the distributions were found compatible when compared up to 70 nm. iv. conclusions we have experimentally shown that distance between two nanoparticles, as well as their individual radii, can be obtained by measuring the intensity of spot as a function of the incident polarization. additionally, the in-plane orientation of the dimer was obtained with less than 10◦ uncertainty. the presented method strongly exploits the particular spectroscopic properties of metallic nanoparticles to sense their environment. it is worth noting that the distance in which the technique is sensitive scales with the radii of the used particles. by using nanoparticles with radius between 4 and 20 nm, the gap between fret and standard superresolution techniques (10 nm to 50 nm) could be bridged. this fact, together with the ability to recover orientation, makes this approach unique. a non-uniform anisotropic illumination is the major source of uncertainty as it will mask the anisotropy of the dimers, specially for those in figure 5: fit to a heterodimer configuration using gmmie. for each dimer (left column, x axis), the radii (top), the distance (middle) and the angle (bottom) were obtained. experimental and simulated histograms are shown for each magnitude (middle and right columns). which the distance is much larger than the radius of the particles. this should be properly controlled by measuring an isotropic sample as it was done in this work. additionally, it is important to mention that various factors such as a non-monodisperse or non-spherical population of particles will have an incidence in the recovery of dimer distance, size and orientation from model based fittings. however, having a multiparametric readout (i.e. m532, m473, η473 and η532) with a non-trivial dependence of the physical parameters (i.e. distance, size, shape) provides a way to control for this and exclude points that do no match the expected relations between photophysical properties. such conservative criterion would be recommended for tracking experiments where false negatives have minor impact as they only reduce the amount of information gathered per frame. if the yield of dimers can be raised and several dozens of dimers can be imaged in the same field of view, we expect that the presented technique will be useful to add information about the relative movement of the two particles to already existing tracking assays. numerical simulations showed that if coherent background can be diminished, an order of magnitude (i.e. by the use of broader band excitation source), distance and 020010-7 papers in physics, vol. 2, art. 020010 (2010) / h. e. grecco et al. orientation could be tracked at 100 hz. if just the rotation and the movement of the center of mass is desired, the retrieval can be performed much faster as only one wavelength (532 nm) would be necessary after an initial identification of the dimers is made by the two color method. as other scattering based techniques, the lack of photobleaching constitutes a major advantage of this approach. moreover, the absence of saturation in the light scattering of metallic nanoparticles, as compared to the absorption of fluorescent molecules, provides an acquisition rate only limited by the detector speed. the combination of these two aspects means that scattering based microscopy does not need to make compromises between experiment length and temporal resolution. 
we have also demonstrated that the use of two color imaging can provide an efficient way to detect scattering centers that have plasmon resonances. recent work by olk et al. [17] has shown that upon illumination with a wideband light source, a modulation of the spectra due to far-field effects can be observed as a function of the incident polarization. the combination of the two techniques could lead to a more robust detection of both, orientation and distance. finally, the novel illumination setup introduced in this work provides a robust way to change the polarization in tir, allowing the implementation of anisotropy based techniques in fluorescence and scattering microscopy. additionally, the same scheme permits a fast switching between tir and standard wide-field as well as sweeping of multiple evanescent field penetration depths. acknowledgements heg was funded by the universidad de buenos aires. [1] d axelrod, t p burghardt, n l thompson, total internal reflection fluorescence, annu. rev. biophys. bio. 13, 247 (1984). [2] e betzig, g h patterson, r sougrat, o w lindwasser, s olenych, j s bonifacino, m w davidson, j lippincott-schwartz, h f hess, imaging intracellular fluorescent proteins at nanometer resolution, science 313, 1642 (2006). [3] c f bohren, d r huffman, absorption and scattering of light by small particles, whiley (1983). [4] h e grecco, o e mart́ınez, distance and orientation measurement in the nanometric scale based on polarization anisotropy of metallic dimers, opt. express 14, 8716 (2006). [5] f g haj, p j verveer, a squire, b g neel, p i h bastiaens, imaging sites of receptor dephosphorylation by ptp1b on the surface of the endoplasmic reticulum, science 295, 1708 (2002). [6] e a jares-erijman, t m jovin, fret imaging, nat. biotechnol. 21, 1387 (2003). [7] z kam, t volberg, b geiger, mapping of adherens junction components using microscopic resonance energy transfer imaging, j. cell sci. 108, 1051 (1995). [8] k l kelly, e coronado, l l zhao, g c schatz, the optical properties of metal nanoparticles: the influence of size, shape, and dielectric environment, j. phys. chem. b 107, 668 (2003). [9] a kusumi, c nakada, k ritchie, k murase, k suzuki, h murakoshi, r s kasai, j kondo, t fujiwara, paradigm shift of the plasma membrane concept from the two-dimensional continuum fluid to the partitioned fluid: highspeed single-molecule tracking of membrane molecules, annu. rev. bioph. biom. 34, 351 (2005). [10] d lasne, g blab, s berciaud, m heine, l groc, d choquet, l cognet, b lounis, single nanoparticle photothermal tracking (snapt) of 5-nm gold beads in live cells, biophys. j. 91, 4598 (2006). [11] y lin xu, electromagnetic scattering by an aggregate of spheres, appl. optics 34, 4573 (1995). [12] y lin xu, electromagnetic scattering by an aggregate of spheres: far field, appl. optics 36, 9496 (1997). [13] g l liu, et al., a nanoplasmonic molecular ruler for measuring nuclease activity and dna footprinting, nat. nanotechnol. 1, 47 (2006). 020010-8 papers in physics, vol. 2, art. 020010 (2010) / h. e. grecco et al. [14] n p mahajan, k linder, g berry, g w gordon, r heim, b herman, bcl-2 and bax interactions in mitochondria probed with green fluorescent protein and fluorescence resonance energy transfer, nat. biotechnol. 16, 547 (1998). [15] f j massey, the kolmogorov–smirnov test for goodness of fit, j. am. stat. assoc. 46, 68 (1951). [16] a d mcfarland, r p v duyne, single silver nanoparticles as real-time optical sensors with zeptomole sensitivity, nano lett. 
3, 1057 (2003). [17] p olk, j renger, m t wenzel, l m eng, distance dependent spectral tuning of two coupled metal nanoparticles, nano lett. 8, 1174 (2008). [18] m quinten, a pack, r wannemacher, scattering and extinction of evanescent waves by small particles, appl. phys. b: lasers o. 68, 87 (1999). [19] b m reinhard, m siu, h agarwal, a p alivisatos, j liphardt, calibration of dynamic molecular rulers based on plasmon coupling between gold nanoparticles, nano lett. 5, 2246 (2005). [20] c sönnichsen, b m reinhard, j liphardt, a p alivisatos, a molecular ruler based on plasmon coupling of single gold and silver nanoparticles, nature biotech. 23, 741 (2005). [21] k suzuki, k ritchie, e kajikawa, t fujiwara, a kusumi, rapid hop diffusion of a g-proteincoupled receptor in the plasma membrane as revealed by single-molecule techniques, biophys. j. 88, 3659 (2005). [22] r wannemacher, a pack, m quinten, resonant absorption and scattering in evanescent fields, appl. phys. b: lasers o. 68, 225 (1999). [23] v westphal, s o rizzoli, m a lauterbach, d kamin, r jahn, s w hell, video-rate farfield optical nanoscopy dissects synaptic vesicle movement, science 320, 246 (2008). 020010-9 papers in physics, vol. 2, art. 020004 (2010) received: 29 september 2010, accepted: 4 october 2010 edited by: a. g. green licence: creative commons attribution 3.0 doi: 10.4279/pip.020004 www.papersinphysics.org issn 1852-4249 commentary on “expansions for eigenfunctions and eigenvalues of large-n toeplitz matrices” torsten ehrhardt1∗ the paper by l. p. kadanoff [1] is concerned with the problem of describing the asymptotics of the eigenvalues and eigenvectors of toeplitz matrices, tn(φ) = (φj−k) n−1 j,k=0, as the matrix size n goes to infinity. here, φk are the fourier coefficients of the generating function φ, i.e., φ(z) = ∞∑ k=−∞ φkz k, |z| = 1. the concrete focus of the paper are the specific symbols a(z) = (2 −z − 1/z)α(−z)β, |z| = 1, with 0 < α < |β| < 1 being real parameters. the function a(z) is smooth (in fact, analytic) except at the point z = 1, where it has a “mild” zero. its image describes a simple closed curve in the complex plane as z passes along the unit circle. the problem of the asymptotics of the eigenvalues of toeplitz matrices has a long history and is a multi-faceted and difficult topic. it is closedly connected with the asymptotics of the determinants of toeplitz matrices and thus with the szegö limit ∗e-mail: tehrhard@ucsc.edu 1 department of mathematics, university of california, santa cruz, ca-95064, usa. theorem and its generalizations. the reader is advised to consult, e.g. [3] for many more details and references. for various classes of symbols φ, descriptions have been given for the limiting set (in the hausdorff metric) of the spectrum of tn(φ) as n goes to infinity. for instance, a result of widom [5] states that for certain symbols, the eigenvalues of tn(φ) accumulate asymptotically along the curve described by φ(z), |z| = 1. the result applies to the symbols considered here, where it is of importance that a(z) is non-smooth at precisely one point. moreover, under certain conditions on φ, one variant of the szegö limit theorem states that lim n→∞ 1 n n∑ k=1 f(λ (n) k ) = 1 2π ∫ 2π 0 f(φ(eix)) dx, where f is a smooth test function and λ (n) k are the eigenvalues of tn(φ). one should point out that there are other classes of symbols which show a completely different asymptotics of the toeplitz eigenvalues. 
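purely as an illustration (not part of either paper), the setting described above is easy to probe numerically: one can build tn(a) from the fourier coefficients of a(z) = (2 − z − 1/z)^α(−z)^β and check that its eigenvalues cluster along the curve traced by the symbol, in line with widom's result. the parameter values, the branch chosen for (−z)^β and the quadrature by fft in the sketch below are assumptions made for the example.

import numpy as np
from scipy.linalg import toeplitz, eigvals

alpha, beta, n = 0.3, 0.6, 128        # arbitrary example values with 0 < alpha < beta < 1
N = 4096                              # quadrature points for the fourier coefficients

theta = 2 * np.pi * np.arange(N) / N
z = np.exp(1j * theta)
a = (2 - 2 * np.cos(theta))**alpha * np.exp(1j * beta * np.angle(-z))  # a(z) on |z| = 1, principal branch assumed

phi = np.fft.fft(a) / N                            # phi[k] approximates the k-th fourier coefficient (wrap-around)
col = phi[:n]                                      # phi_0, phi_1, ..., phi_{n-1}  (first column of T_n)
row = np.concatenate(([phi[0]], phi[-1:-n:-1]))    # phi_0, phi_{-1}, ..., phi_{-(n-1)}  (first row)
Tn = toeplitz(col, row)

lam = eigvals(Tn)
# distance of each eigenvalue to the curve a(e^{i theta}); it should be small for most eigenvalues
dist_to_curve = np.min(np.abs(lam[:, None] - a[None, :]), axis=1)
print(dist_to_curve.max())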
for instance, if φ is a rational function, then it is proved that the eigenvalues do (in general) accumulate along arcs which lie inside the curve described by φ. this case is best understood because there is an explicit formula for the characteristic polynomial of tn(φ). furthermore, if φ is a piecewise continuous function with at least two jump discontinuities, then it is conjectured and numerically substantiated that "most", but not all, eigenvalues accumulate along the image. if there is precisely one jump discontinuity, then one expects that all eigenvalues accumulate along the image. for continuous real-valued symbols, i.e., for hermitian toeplitz matrices, the asymptotics of the eigenvalues is again "canonical", i.e., the eigenvalues accumulate along the image and the above formula holds for continuous test functions f.

the two aforementioned results give some, but limited, information about the eigenvalues of toeplitz matrices. the paper under consideration (together with a preceding paper [4]) makes a significant first attempt to determine the asymptotics of the individual eigenvalues of tn(a). the asymptotics are obtained up to third order and can be described by

\lambda^{(n)}_{k} = a\!\left(e^{-i p^{(n)}_{k}}\right), \qquad p^{(n)}_{k} = 2\pi\frac{k}{n} - i(1+2\alpha)\frac{\ln n}{n} + \frac{1}{n}\, d\!\left(\frac{k}{n}\right) + o\!\left(\frac{1}{n}\right),

as n → ∞, with an explicit expression for d = d(x), which is continuous in 0 < x < 1. (we made some slight changes regarding notation and formulation in comparison with the main formula (25) in [1].) the asymptotics holds uniformly in k under the assumption 0 < ε ≤ k/n ≤ 1 − ε. the latter means that the description does not catch the eigenvalues accumulating near the point 0 = a(1), where the curve a(z) is not smooth. at this point, perhaps a different, more complicated asymptotics holds.

the derivation of the results in the paper is not completely rigorous, despite the arguments being quite convincing. the methods are appropriate for dealing with toeplitz systems. for instance, use is made of the fact that the finite matrices tn(φ) are naturally related to two semi-infinite toeplitz systems t(φ) = (φj−k) and t(φ̃) = (φk−j), j, k ≥ 0. in the paper, this is reflected by the use of the auxiliary functions φ− and φ+. the symbol φ is equal to k(z) = a(z) − λ, where λ is an eigenvalue (which is to be determined). the two semi-infinite systems are analyzed by wiener-hopf factorization, the factors of which serve as approximations for the auxiliary functions φ− and φ+. both auxiliary functions allow one to reconstruct the eigenfunction for tn(φ).

in view of the argumentation, it seems plausible that the results can be generalized without too much effort to slightly more general symbols,

a(z) = (2 - z - 1/z)^{\alpha}\,(-z)^{\beta}\, b(z), \qquad |z| = 1,

where b(z) is a smooth (or analytic) function for |z| = 1 such that a(z) describes a simple closed curve in the complex plane. on the other hand, notice that symbols with two or more singularities could produce a more complicated eigenvalue behavior [5].

after a preprint version of the paper appeared, bogoya, böttcher and grudsky [2] gave a rigorous proof of the eigenvalue asymptotics in the special case of symbols a(z) with β = α − 1. the general case is (as of now) still open.

acknowledgements

supported in part by nsf grant dms-0901434.

[1] l p kadanoff, expansions for eigenfunctions and eigenvalues of large-n toeplitz matrices, pap. phys. 2, 020003 (2010).
[2] j m bogoya, a böttcher, s m grudsky, asymptotics of individual eigenvalues of large hessenberg toeplitz matrices, preprint 2010-8, fakultät für mathematik, technische universität chemnitz, issn 1614-8835. [3] a böttcher, b silbermann, introduction to large truncated toeplitz matrices, universitext, springer, new york (1999). [4] h dai, z geary, l p kadanoff, asymptotics of eigenvalues and eigenvectors of toeplitz matrices, j. stat. mech. p05012 (2009). [5] h widom, eigenvalue distribution of nonselfadjoint toeplitz matrices and the asymptotics of toeplitz determinants in the case of nonvanishing index, oper. theory: adv. appl. 48, 387 (1990). 020004-2 papers in physics, vol. 8, art. 080006 (2016) received: 2 august 2016, accepted: 12 october 2016 edited by: k. hallberg licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.080006 www.papersinphysics.org issn 1852-4249 an efficient density matrix renormalization group algorithm for chains with periodic boundary condition dayasindhu dey,1 debasmita maiti,1 manoranjan kumar1∗ the density matrix renormalization group (dmrg) is a state-of-the-art numerical technique for a one dimensional quantum many-body system; but calculating accurate results for a system with periodic boundary condition (pbc) from the conventional dmrg has been a challenging job from the inception of dmrg. the recent development of the matrix product state (mps) algorithm gives a new approach to find accurate results for the one dimensional pbc system. the most efficient implementation of the mps algorithm can scale as o(p × m3), where p can vary from 4 to m2. in this paper, we propose a new dmrg algorithm, which is very similar to the conventional dmrg and gives comparable accuracy to that of mps. the computation effort of the new algorithm goes as o(m3) and the conventional dmrg code can be easily modified for the new algorithm. i. introduction the quantum many-body effect in the condensed matter gives rise to many exotic states such as superconductivity [1], multipolar phases [2,3], valence bond state [4], vector chiral phase [2,5] and topological superconductivity [6]. these effects are prominent in the one dimensional (1d) electronic systems due to the confinement of electrons. the confinement of electrons and the competition between the electron-electron repulsion and the kinetic energies of electrons produce many interesting phases like spin density wave (sdw), dimer or the bond order wave phase and charge density wave (cdw) phase in one dimensional systems [7–9]. although these quantum many-body effects in the system are crucial for exotic phases, dealing with these systems is a challenging job because of the large degrees of ∗e-mail: manoranjan.kumar@bose.res.in 1 s. n. bose national centre for basic sciences, block jd, sector iii, salt lake, kolkata 700106, india. freedom. the degrees of freedom increase as 2n or 4n for a spin-1/2 system or a fermionic system, respectively. in most cases, the exact solutions for these systems with large degrees of freedom are almost impossible to reach. therefore, during the last three decades many numerical techniques have been developed, e.g., quantum monte carlo (qmc) [10], density functional theory (dft) [11], renormalization group (rg) [12] and density matrix renormalization group (dmrg) method [13, 14]. the dmrg is a state-of-the-art numerical technique for 1d systems with open boundary condition (obc). 
however, the numerical effort to maintain the accuracy for pbc systems becomes exponential [15, 16]. it is well known that the periodic boundary condition (pbc) is essential to get rid of the boundary effect of a finite open chain and also to preserve the inversion symmetry in the systems [17]. the dmrg technique is based on the systematic truncation of irrelevant degrees of freedom and has been reviewed extensively in ref. [15, 16]. in a 080006-1 papers in physics, vol. 8, art. 080006 (2016) / d. deyet al. 1d system with obc, the number of relevant degrees of freedom is small [15, 16]. let us consider that for a given accuracy of the obc system, m obc number of eigenvectors of the density matrix is required, then the conventional dmrg for the pbc system requires o(m2obc) [18]. in the conventional dmrg, computational effort for the obc systems with sparse matrices goes as o(m3obc), whereas, it goes as o(m6obc) for the pbc system [19]. the accuracy of energies for the pbc systems calculated from the conventional dmrg decreases significantly, and there is a long bond in the system which connects both ends. the accuracy of operators decreases with the number of renormalization, especially the raising/creation s+/a+ and lowering/annihilation s−/a− operator of spin/fermionic systems. the conventional dmrg is solved in a sz basis, therefore the exact sz operator remains diagonal and, multiple times, renormalization deteriorates the accuracy slowly; but s+/a+ and s−/a− are off diagonal in this exact basis and therefore, the multiple time renormalization of these operators decreases the accuracy of the operators. a similar type of accuracy problems occurs for multiple time renormalized a+ and a− in the fermionic systems. in fact, it has been noted that accuracy of energy of a system with pbc significantly increases if the superblock is constructed with very few times renormalized operators [9]. to avoid multiple renormalization, new sites are added at both ends of the chain in such a way that only second time renormalized operators are used to construct the superblock. in this algorithm, there is a connectivity between the old-old sites and their operators are renormalized; and this connectivity spoils the sparsity of the superblock hamiltonian [9]. in this paper, a new dmrg algorithm is proposed, which can be implemented upon the existing conventional dmrg code in a few hours and gives accurate results which are comparable to those of mps algorithm. in fact, this algorithm can be implemented for two-legged ladders without much effort [20]. we have studied the spectrum of density matrix of the system block, ground state energy and correlation functions of a heisenberg antiferromagnetic hamiltonian for spin-1/2 and spin-1 on a 1d chain with pbc. this paper is divided into five sections. the model hamiltonian is discussed in section ii. the new algorithm and the comparative studies of algorithms are done in section iii. the accuracy of various quantities is studied in the section iv. in section v, results and algorithm are discussed. ii. model hamiltonian let us consider a strongly correlated electronic system where coulomb repulsion is dominant, therefore the charge degree of freedom gets localized, for example, hubbard model in large u limit in a half filled band. in this limit, the system becomes insulating, but the electrons can still exchange their spin. the heisenberg model is one of the most celebrated models in this limit, and only the spin degrees of freedom are active in the model. 
the heisenberg model hamiltonian can be written as

h = \sum_{\langle ij \rangle} j_{ij}\, \vec{s}_i \cdot \vec{s}_j, \qquad (1)

where j_{ij} is the antiferromagnetic exchange interaction between nearest neighbor spins. in the rest of the paper, j_{ij} is set to 1.

iii. comparison of algorithms

a ground state wavefunction calculated from the conventional dmrg can be represented in terms of a matrix product state (mps), as shown by ostlund and rommer [21]. the wavefunction can be written as

|\psi\rangle = \sum_{n_1, n_2, \ldots, n_l} \mathrm{tr}\!\left(a^{1}_{n_1} a^{2}_{n_2} \cdots a^{l}_{n_l}\right) |n_1, n_2, \ldots, n_l\rangle, \qquad (2)

where the a^{k}_{n_k} are a set of matrices of dimension m × m for site k with n_k local degrees of freedom. the wave function |ψ〉 can be found accurately if m is sufficiently large. the expectation value of an operator o_k in the gs [19, 22] can be written as

\langle o_k \rangle = \mathrm{tr}\!\left[\, \sum_{n_k, n'_k} \langle n_k | o_k | n'_k \rangle\, a^{k}_{n_k} \otimes a^{k}_{n'_k} \right], \qquad (3)

where n_k is the local degree of freedom of site k. the a matrices can be evaluated by using the following equations:

h_k |\psi_k\rangle = \lambda\, n_k |\psi_k\rangle, \qquad (4)

n_k = e^{k+1}_{1}\, e^{k+2}_{1} \cdots e^{l}_{1}\, e^{1}_{1} \cdots e^{k-1}_{1}, \qquad (5)

where

e^{k}_{1} = \sum_{n_k, n'_k} \langle n_k | 1 | n'_k \rangle\, a^{k}_{n_k} \otimes a^{k}_{n'_k}. \qquad (6)

here, h_k is the effective hamiltonian of the k-th site and λ is the expectation value of the energy. the a matrices are evaluated at this point, and the matrices are rearranged to keep the algorithm stable. h_k and n can be calculated recursively while evaluating a, one site at a time [19]. here, the n matrices are ill-conditioned and require storing approximately l^2 matrices, as well as the multiplication of l^2 matrices of size m × m [19], at each step. the evaluation of a and n is done for all the sites, and backward and forward sweeps over all the sites are executed similarly to the finite system dmrg. the operations on matrices of dimension m^2 × m^2 would make the cost of handling the hamiltonian scale as o(m^6), but the special form of these matrices reduces the cost by a factor of m; therefore, this algorithm scales as o(m^5) [19].

the above algorithm was extended by verstraete et al. for translationally invariant systems [23]. only two types of matrices, a^1 and a^2, are considered [23]. the product of the two matrices can be repeated to compute n. in this algorithm, only the two matrices a^1 and a^2 are updated and optimized to get the gs properties. this algorithm scales as o(m^3), although it does not work for a finite system, or for systems with impurities, etc.

pippan et al. introduced another efficient mps based algorithm for translationally invariant pbc systems [19]. in the old version of mps, most of the computational cost goes into constructing the product of the m^2 × m^2 matrices e defined in eq. (6). the new mps algorithm overcomes this problem by performing an svd of the product of a sufficiently large (l ≫ 1) number of m^2 × m^2 transfer matrices [19, 28]. the singular values, in general, decay very fast; therefore, only p (≪ m^2) among the m^2 singular values are significant [28]. thus, the computational cost is now reduced to o(p × m^3) [19]. however, one requires p ∼ m to reach adequate numerical accuracy of physical measures, as pointed out in ref. [28].

although the above technique is efficient and accurate, there are various reasons for developing the new algorithm. first, the modified mps works efficiently for a system where the singular values of the products of matrices decay exponentially, and this algorithm scales as o(p m^3), where p can vary linearly with m.
second, the implementation of the mps based numerical technique is quite different from the conventional dmrg, and the mps algorithm has to be written from scratch. third, many conventional numerical techniques, like the dynamical correction vector [24] or the continued fraction [25], and the implementation of symmetries like parity or inversion, are difficult to implement. in this paper, we will explain a new algorithm which is very similar to the conventional dmrg technique, and also show that the new algorithm can give an accuracy comparable to that of mps based techniques. this algorithm is applied to s = 1/2 and s = 1 chains with pbc. but first, let us try to understand the algorithm before discussing the results. in this algorithm, we will try to avoid the multiple renormalization of operators, whereas the other parts of the algorithm remain the same as in the conventional dmrg. before going to the new algorithm, let us recap the conventional dmrg.

1. start with a superblock of four sites, consisting of one site for both the left and the right block and two new sites.

2. get the desired eigenvalues and eigenvectors of the superblock and construct the density matrix ρ of the system, which consists of the left or the right block and a new site.

3. now, construct an effective ρ̃ with the m eigenvectors of ρ corresponding to the m largest eigenvalues. the effective system hamiltonian and all operators in the truncated basis are constructed using the following equations:

h = \tilde{\rho}^{\dagger} h \tilde{\rho}; \qquad o = \tilde{\rho}^{\dagger} o \tilde{\rho} \qquad (7)

4. the superblock is constructed using the effective hamiltonian and operators of the system block and two new sites.

5. repeat all the steps from 2 until the desired system size is reached.

the full process is called infinite dmrg.

figure 1: pictorial representation of the new dmrg algorithm with only one site in the new block. (a) one starts with a left and a right block, represented by filled circles, and two new-site blocks, represented as open circles. the dotted box represents the system block for the next step. (b) the superblock of the next dmrg step is shown. (c) the final step of infinite dmrg for a system of 4n sites is shown, with 2n − 1 sites in each of the left and the right blocks and two new sites.

as mentioned earlier, the conventional algorithm is excellent for a 1d open chain, as the superblock is constructed with operators renormalized only once. however, for a pbc system, one needs a long bond; therefore, at least two operators of the superblock are renormalized multiple times. in the new algorithm, the multiple renormalization of operators is avoided, and the algorithm goes as follows:

1. start with a superblock of four blocks, consisting of a left and a right block and two new-site blocks. the blocks are shown in fig. 1 as filled circles and may have more than one site; here, we have considered only one site in each block. the new blocks may also have more than one site and are shown as open circles. in this paper, the new blocks have one site for a chain, or two for a ladder-like structure with pbc [17].

2. get the eigenvalues and eigenvectors of the superblock and construct the density matrix ρ of the system, which consists of the left or the right block and two new blocks. the left system block is shown inside the box in fig. 1a.

3. now, construct an effective ρ̃ with the m eigenvectors of ρ corresponding to the m largest eigenvalues of the density matrix.
the effective system hamiltonian and operators in the truncated basis are calculated using eq. (7).

4. the superblock is constructed using the effective hamiltonians and operators of the system blocks and two new sites.

5. now, go back to step 2; steps 2–5 are repeated until the desired system size is reached.

we notice that the superblock hamiltonian is constructed using the effective hamiltonians of the blocks and operators which are renormalized only once. therefore, the massive truncation caused by the long bond is avoided in this algorithm. bonds in the superblock are only between new–new sites or new–old sites. for the construction of the hamiltonian matrix of an old–new site bond, the new site operator is highly sparse; however, the renormalized operators of the old sites are highly dense. the old–old site interaction in the conventional algorithm generates a large number of non-zero matrix elements in the superblock hamiltonian matrix, and the diagonalization of this dense matrix goes as m^4. but, in the new algorithm, the old–new site interaction in the superblock generates only a sparse hamiltonian matrix, and its diagonalization scales as m^3.

iv. results

we consider spin-1/2 and spin-1 chains with pbc of length up to n = 500 to check the accuracy of the results for the heisenberg hamiltonian. in this part, we study the truncation error t of the density matrix, the error in the relative ground state energy ∆e/|e0|, and the dependence of the correlation function c(r) on m. the correlation function c(r) is defined as

c(r) = \vec{s}_0 \cdot \vec{s}_r, \qquad (8)

where \vec{s}_0 corresponds to the reference spin and \vec{s}_r is the spin at a distance r from the reference spin.

figure 2: truncation error t of the density matrix for the spin-1/2 chain (main). the inset shows the truncation error for the s = 1 chain. for s = 1/2, the truncation error follows a power law decay, whereas it follows an exponential decay for the s = 1 system.

the relative ground state energy can be defined as ∆e/|e0|, where ∆e = e(m) − e0, with e0 the most accurate value for the s = 1 chain [19] and e0 = e(m = 1200) for the s = 1/2 chain. as discussed earlier, the dmrg is based on the systematic truncation of the irrelevant degrees of freedom, and the eigenvalues of the density matrix represent the importance of the respective states. in the dmrg, only the selected states corresponding to the highest eigenvalues are kept and the rest of the states are discarded. we define the truncation error t as

t = 1 - \sum_{i=1}^{m} \lambda_i, \qquad (9)

where the λi are the eigenvalues of the density matrix of the system block, arranged in descending order. in the main fig. 2, t is shown as a function of m for s = 1/2, and the inset shows the same for a s = 1 system, with n = 102 and 502. we notice that m ∼ 350 for s = 1/2 and m ∼ 300 for s = 1 are sufficient to achieve t = 10^{-9}. in the main fig. 2, t vs m in a log-log plot shows a linear behavior, i.e., t for both system sizes n = 102 and 502 for s = 1/2 follows a power law t ∝ m^{-α}, with α = 4.0 and 3.4, respectively. the m dependence of t for the s = 1 ring is shown in the inset of fig. 2.
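a minimal numpy sketch (ours, not the authors' code) of the truncation at the heart of the algorithm may help here: it builds the reduced density matrix of a system block from a random superblock state, keeps the m eigenvectors with the largest eigenvalues, rotates a block operator as in eq. (7), and evaluates the truncation error of eq. (9). all dimensions and operators are placeholders.

import numpy as np

m = 16                                       # number of density-matrix eigenvectors kept
dim_sys, dim_env = 64, 64                    # dimensions of the system block and of the environment
rng = np.random.default_rng(1)

psi = rng.normal(size=(dim_sys, dim_env))    # superblock ground state written as a matrix (placeholder)
psi /= np.linalg.norm(psi)

rho = psi @ psi.T                            # reduced density matrix of the system block
w, v = np.linalg.eigh(rho)                   # eigenvalues in ascending order
basis = v[:, -m:]                            # the m eigenvectors with the largest eigenvalues ("rho tilde")

truncation_error = 1.0 - w[-m:].sum()        # eq. (9)

h_sys = rng.normal(size=(dim_sys, dim_sys))
h_sys = 0.5 * (h_sys + h_sys.T)              # placeholder block hamiltonian
o_sys = rng.normal(size=(dim_sys, dim_sys))  # placeholder block operator

h_eff = basis.T @ h_sys @ basis              # eq. (7): h -> rho~^dagger h rho~
o_eff = basis.T @ o_sys @ basis              #          o -> rho~^dagger o rho~
print(truncation_error, h_eff.shape)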
figure 3: energy accuracy ∆e/|e0| for the spin-1/2 chain with pbc (main), which shows power law behavior with m. the inset shows the energy accuracy for the spin-1 chain with pbc, which shows exponential behavior with m.

the truncation error t in this case decays exponentially, i.e., t ∝ exp(−βm), with β = 0.03 for both n = 102 and 502.

the relative errors in the energies, ∆e/|e0|, for s = 1/2 and 1 with n = 102 spins are shown in fig. 3, main and inset respectively. the exact energy of the spin-1 system is e0/n ∼ 1.4014840386, and this value is obtained by using the conventional dmrg with m = 2000 and chains of up to n = 100 sites [19]. an extrapolation of the energies with m is done to obtain the above value [19]. we notice that ∆e/|e0| for the s = 1/2 system goes to 10^{-6} for m = 256, whereas it goes to 10^{-8} for m ≈ 500, as shown in the main fig. 3. although a similar accuracy of the energy can be achieved with m = 100 in the mps approach, the scaling is ∼ m^4, rather than the ∼ m^3 of our algorithm. for s = 1, an accuracy of 10^{-8} can be reached with m ∼ 450, as shown in the inset of fig. 3.

the dependence of the accuracy of the correlation function |c(r)| of s = 1/2 for n = 130 as a function of m is shown in the main fig. 4. we notice that m = 256 is sufficient to achieve an accuracy of ∼ 10^{-4}. we also notice that |c(r)| ∝ r^{-1} with the well known logarithmic correction, especially for smaller r [26]. we have fitted the correlations with the well known form |c(r)| = a r^{-1} \ln^{1/2}(r/r_0), with a = 0.22 and r0 = 0.08 [29]. the deviation of the function at large r is an artifact of the finite system size.

figure 4: the main figure shows the variation of |c(r)| (defined in eq. (8)) as a function of r for different m. the solid curve in the main figure is the logarithmic correction formula a r^{-1} \ln^{1/2}(r/r_0), with a = 0.22 and r0 = 0.08 [26, 29]. the inset shows the variation of c(n/2) with the inverse of n for different m. the solid curve is the form a (n/2)^{-1} \ln^{1/2}(n/2r_0), with a = 0.323 and r0 = 0.08.

in our algorithm, two sites are added symmetrically and the new sites are added n/2 sites apart; consequently, the distance between the two new sites is n/2. therefore, the new–new site correlation function is c(n/2), and it is plotted against n^{-1} in the inset of fig. 4. we observe that m = 256 is sufficient for n ∼ 200 to achieve adequate numerical accuracy. the curve behaves almost linearly, with the logarithmic correction |c(n/2)| = 2a n^{-1} \ln^{1/2}(n/2r_0), with a = 0.323 and r0 = 0.08.

v. summary

the dmrg is a state-of-the-art numerical technique for solving 1d quantum many-body systems with open boundary condition. however, the accuracy for the 1d pbc system is rather poor. the mps approach gives very accurate results, but the computational cost goes as o(m^5) [23]. later, this algorithm was modified, and the computational cost of the modified algorithm goes as o(p × m^3), where p in general varies linearly with m [28], but p can go as m^2 in the case of long range order in the system. the computational cost of the algorithm presented in this paper scales as o(m^3), because of the sparse superblock hamiltonian, and it is very similar to the conventional dmrg. to achieve this goal, we avoid the multiple renormalization of the operators which are used to construct the superblock. this algorithm can readily be used to solve general 1d and quasi-1d systems, e.g., the j1–j2 model or two-legged ladders.
the new algorithm can be implemented with ease using the conventional dmrg code. our calculation suggests that most of the quantities, e.g., ground state energies, energy gaps and correlation function, can accurately be calculated by keeping m ∼ 400. the superfluity stiffness [27] and dynamical structure factors using the correction vector technique [24] or continued fraction method [25] can be calculated with this algorithm. the symmetries, e.g., spin parity, electron-hole, inversion, can easily be implemented in this algorithm [24]. this algorithm is used by us in calculating accurate static structure factors and correlation function for j1 − j2 model for a spin-1/2 ring geometry [17]. acknowledgements we thank z. g. soos for his valuable comments. m. k. thanks dst for ramanujan fellowship and computation facility provided under the dst project snb/mk/1415/137. [1] m tinkham, introduction to superconductivity: second edition (dover books on physics) (vol i), dover publications inc, new york (2004). [2] a v chubukov, chiral, nematic, and dimer states in quantum spin chains, phys. rev. b 44, 4693 (1991). [3] l kecke, t momoi, a furusaki, multimagnon bound states in the frustrated ferromagnetic one-dimensional chain, phys. rev. b 76, 060407 (2007). [4] p w anderson, the resonating valence bond state in la2cuo4 and superconductivity, science 235, 1196 (1987). [5] a parvej, m kumar, degeneracies and exotic phases in an isotropic frustrated spin-1/2 chain, j. magn. magn. mater. 401, 96 (2016). 080006-6 papers in physics, vol. 8, art. 080006 (2016) / d. deyet al. [6] x l qi, s c zhang, topological insulators and superconductors, rev. mod. phys. 83, 1057 (2011). [7] m nakamura, tricritical behavior in the extended hubbard chains, phys. rev. b 61, 16377 (2000). [8] p sengupta, a w sandvik, d k campbell, bond-order-wave phase and quantum phase transitions in the one-dimensional extended hubbard model, phys. rev. b 65, 155113 (2002). [9] m kumar, s ramasesha, z g soos, tuning the bond-order wave phase in the half-filled extended hubbard model, phys. rev. b 79, 035102 (2009). [10] m suzuki (ed.), quantum monte carlo methods in condensed matter physics, world scientific, singapore (1993). [11] w kohn, l j sham, self-consistent equations including exchange and correlation effects, phys. rev. 140, a1133 (1965). [12] k g wilson, the renormalization group and critical phenomena, rev. mod. phys. 55, 583 (1983). [13] s r white, density matrix formulation for quantum renormalization groups, phys. rev. lett. 69, 2863 (1992). [14] s r white, density-matrix algorithms for quantum renormalization groups, phys. rev. b 48, 10345 (1993). [15] u schollwöck, the density-matrix renormalization group, rev. mod. phys. 77, 259 (2005). [16] k a hallberg, new trends in density matrix renormalization, adv. phys. 55, 477 (2006). [17] z g soos, a parvej, m kumar, numerical study of incommensurate and decoupled phases of spin-1/2 chains with isotropic exchange j1 , j2 between first and second neighbors, j. phys. condens. mat. 28, 175603 (2016). [18] u schollwöck, the density-matrix renormalization group in the age of matrix product states, ann. phys. 326, 96 (2011). [19] p pippan, s r white, h g evertz, efficient matrix-product state method for periodic boundary conditions, phys. rev. b 81, 081103 (2011). [20] e dagotto, t m rice, surprises on the way from oneto two-dimensional quantum magnets: the ladder materials, science 271, 618 (1996). [21] s östlund, s rommer, thermodynamic limit of density matrix renormalization, phys. 
rev. lett. 75, 3537 (1995). [22] f verstraete, d porras, j i cirac, density matrix renormalization group and periodic boundary conditions: a quantum information perspective, phys. rev. lett. 93, 227205 (2004). [23] f verstraete, j i cirac, j i latorre, e rico, m m wolf, renormalization-group transformations on quantum states, phys. rev. lett. 94, 140601 (2005). [24] s k pati, s ramasesha, z shuai, j l brédas, dynamical nonlinear optical coefficients from the symmetrized density-matrix renormalization-group method, phys. rev. b 59, 14827 (1999). [25] k a hallberg, density-matrix algorithm for the calculation of dynamical properties of lowdimensional systems, phys. rev. b 52, r9827 (1995). [26] i affleck, d gepner, h j schulz, t ziman, critical behaviour of spin-s heisenberg antiferromagnetic chains: analytic and numerical results, j. phys. a math. gen. 22, 511 (1989). [27] d rossini, v giovannetti, r fazio, spinsupersolid phase in heisenberg chains: a characterization via matrix product states with periodic boundary conditions, phys. rev. b 83, 140411 (2011). [28] d rossini, v giovannetti, r fazio, stiffness in 1d matrix product states with periodic boundary conditions, j. stat. mech. 2011, p05021 (2011). [29] a w sandvik, computational studies of quantum spin systems, aip conf. proc. 1297, 135 (2010). 080006-7 papers in physics, vol. 7, art. 070007 (2015) received: 9 december 2014, accepted: 13 april 2015 edited by: l. a. pugnaloni reviewed by: f. vivanco, dpto. de f́ısica, universidad de santiago de chile, chile. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070007 www.papersinphysics.org issn 1852-4249 density distribution of particles upon jamming after an avalanche in a 2d silo r. o. uñac,1∗ j. l. sales,2 m. v. gargiulo,2 a. m. vidales1† we present a complete analysis of the density distribution of particles in a two dimensional silo after discharge. simulations through a pseudo-dynamic algorithm are performed for filling and subsequent discharge of a plane silo. particles are monosized hard disks deposited in the container and subjected to a tapping process for compaction. then, a hole of a given size is open at the bottom of the silo and the discharge is triggered. after a clogging at the opening is produced, and equilibrium is restored, the final distribution of the remaining particles at the silo is analyzed by dividing the space into cells with different geometrical arrangements to visualize the way in which the density depression near the opening is propagated throughout the system. the different behavior as a function of the compaction degree is discussed. i. introduction numerous studies have investigated the flow of granular materials (such as glass beads, aggregates or minerals, among others) through hoppers of various geometries [1–8]. commonly, the silo is initially filled with the granular material at a given height. then, the outlet of the silo is opened and the mass flow rate is recorded as a function of time. these experiments provide useful information on the relation between the flow rate and different geometrical and physical properties of the system, such as ∗e-mail: runiac@unsl.edu.ar †e-mail: avidales@unsl.edu.ar 1 departamento de f́ısica, instituto de f́ısica aplicada (unsl-conicet), universidad nacional de san luis, ejército de los andes 950, d5700hhw san luis, argentina. 2 departamento de geof́ısica y astronomı́a. facultad de ciencias exactas f́ısicas y naturales. 
universidad nacional de san juan, mitre 396 (e), j5402cwh san juan, argentina. the size and shape of the particles and the outlet, the presence of friction forces and density fluctuations [9]. empirically, the flow rate is determined by the known beverloo’s equation which depends, among other variables, on the apparent density in the immediate neighborhood of the outlet region of the silo. furthermore, in the derivation of the beverloo’s equation, it is assumed that the region primarily affected by the discharge is near the outlet and has an effective diameter of the order of the width of it, sometimes referred to as beverloo’s diameter [2]. simulations have provided details about the granular flow that are not accessible to experiments such as the influence of frictional parameters between particles, particle shape, stress chains and the statistics of the arches formed during a jamming process [3, 4, 10–12]. when the size of the flowing particles is in the order of the width of the aperture, jamming can take place, and the particles stop flowing unless additional energy is provided to the system. many practical problems are caused by jamming at the 070007-1 papers in physics, vol. 7, art. 070007 (2015) / r. o. uñac et al. outlet of hoppers used in production lines, where it is necessary to maintain a constant flow of material. analogous problems occur in the storage of raw materials in silos, especially after a certain period of accumulation or under manipulation operations that may cause a change in the packing fraction. thus, when it is necessary to empty the silo, the material does not flow, either due to the presence of unwanted moisture or because of the compaction of grains sealing the outlet [13–17]. these problems are caused by particle arches that form at the outlet. authors in ref. [9] report data suggesting temporal oscillations in the packing fraction near the outlet. those oscillations had a frequency around 2 hz. this result is important because it is directly related to the likelihood of jamming in the silo [4, 5]. other authors performed flow experiments using metal disks in a two dimensional hopper in order to study the statistical properties of the arches forming at the outlet [5]. authors in ref. [7] found experimentally a linear variation for the number of particles forming an arch with the outlet size in a 2-d silo. there are many works concerning the study of packing density in flowing granular materials out of a hopper. in particular, those dealing with the presence of density waves and density fluctuations in the bulk of the silo [18, 19]. others focus on the density distribution during discharge and analyze the change of the density between the stagnant zone and the flowing zone [20]. in a recent work [12], authors have studied the jamming occurring in the flow through small apertures for a column of granular disks via a pseudodynamic model. the effect that the preparation of the granular assembly has on the size of the avalanches was investigated. to this end, packing ensembles with different mean packing fractions were created by tapping the system at different intensities. this work succeeded in demonstrating that, for a given outlet size, different mean avalanche sizes are obtained for deposits with the same mean packing fraction that were prepared with very different tap intensities. nevertheless, a complete characterization of the density of particles inside the column, both before and after the discharging process, was left out. 
for all above, a study of the variation of density near the outlet of a silo, before and after the discharge in the presence of a jamming, is important to relate it to the subsequent behavior of the system during the restart of the flow. ii. simulation procedure the implementation and description of a simulation algorithm using a pseudo-dynamic code has been developed in numerous studies since the liminal work of manna and khakhar [12, 21–23]. assuming inelastic massless hard disks that will be deposited in a 2d die simulating a silo, the pseudodynamics will consist in small falls and rolls of the grains until they come to rest by contacting other particles or the system boundaries. we use a container formed by a flat base and two flat vertical walls. no periodic boundary conditions are applied. the deposition algorithm consists in choosing a disk in the system and allowing a free fall of length δ if the disk has no supporting contacts, or a roll of arc-length δ over its supporting disk if the disk has one single supporting contact [12, 21, 22, 24]. disks with two supporting contacts are considered stable and left in their positions. if during a fall of length δ a disk collides with another disk (or the base), the falling disk is put just in contact and this contact is defined as its first supporting contact. analogously, if in the course of a roll of length δ, a disk collides with another disk (or a wall), the rolling disk is put just in contact. if the first supporting contact and the second contact are such that the disk is in a stable position, the second contact is defined as the second supporting contact ; otherwise, the lowest of the two contacting particles is taken as the first supporting contact of the rolling disk and the second supporting contact is left undefined. if, during a roll, a particle reaches a lower position than the supporting particle over which it is rolling, its first supporting contact is left undefined (in this way, the particle will fall vertically in the next step instead of rolling underneath the first contact). a moving disk can change the stability state of other disks supported by it; therefore, this information is updated after each move. the deposition is over once each particle in the system has both supporting contacts defined or is in contact with the base (particles at the base are supported by a single contact). then, the coordinates of the centers of the disks and the corresponding labels of the two supporting particles, wall or base are saved 070007-2 papers in physics, vol. 7, art. 070007 (2015) / r. o. uñac et al. for analysis. an important point in these simulations is the effect that the parameter δ has in the results since particles do not move simultaneously but one at a time. one might expect that in the limit δ → 0, we should recover a fairly ”realistic” dynamics for a fully inelastic non-slipping disk dragged downwards at constant velocity. this should represent particles deposited in a viscous medium or carried by a conveyor belt. we chose δ = 0.0062d (with d the particle diameter) since we have observed that for smaller values of δ, results are indistinguishable from those obtained here [12, 24]. it is worth saying that the pseudo-dynamic algorithm used here allows the final configurations obtained after each tapping to be completely static, given that each disk is supported by other two disks as required by the equilibrium conditions in the model. 
in this way, one can follow the history of formation of the packing and, thus, the occurrence of the arches. this leads to a straightforward definition of the arches in the system, as we will see below. on the other hand, this algorithm is faster in the generation of a given ensemble of particles and, the subsequent tapping process, than the dem one. the ones above are the most important advantages of the algorithm that justify our choice. because arch formation has been identified as a potential cause for segregation in non-convecting systems [25–27], we have centered previous research on detecting arches and analyzing their behavior and distribution in piles and jammed silos [23, 28]. indeed, identification of arches is a rather complex task and the results presented in those works were novel and original at that time. we recommend to those readers interested in the characterization of the arches formed before and after the discharge of a 2d silo filled with disks to address our previous work in ref. [12]. arches are sets of mutually stabilizing particles in a static granular sample. in our pseudo-dynamic simulations we first identify all mutually stable particles and then find the arches as chains of particles connected through these mutual stability contacts. two disks a and b are said mutually stable if a is the left supporting particle of b and b is the right supporting particle of a, or vice versa. since the pseudo-dynamics rest on defining which disk is a support for another disk during the deposition, this information is available in our simulations. chains of mutually stable particles can, thus, be found straightforwardly. these chains can have, in principle, any size starting from two particles. details on the properties of arches found in pseudodynamic simulations can be found in previous works [7, 24, 28]. iii. filling and emptying the silo as explained above, the aim of the present work is to analyze the density patterns of particles inside a silo after its discharge and as a function of the compaction degree before the avalanche event. we first need to prepare packings at reproducible initial packing fractions. to achieve this, we have chosen a well known technique to generate reproducible ensembles of packings [12]. thus, we use a simulated tapping protocol (see below) to generate sets of initial configurations that have well defined mean packing fractions. the simulations are carried out in a rectangular box of width 24.78d containing 1500 equal-sized disks of diameter d. initially, disks are placed at random in the simulation box (with no overlaps) and deposited using the pseudo-dynamic algorithm. once all the grains come to rest, the system is expanded in the vertical direction and randomly shaken to simulate a vertical tap. then, a new deposition cycle begins. after many taps of given amplitude, the system achieves a steady state where all characterizing parameters fluctuate around equilibrium values independently of the previous history of the granular bed. the existence of such ”equilibrium” states has been previously reported in experiments [29]. the tapping of the system is simulated by multiplying the vertical coordinate of each particle by a factor a (with a > 1). then, the particles are subjected to several (about 20) monte carlo loops where positions are changed by displacing particles a random length ∆r uniformly distributed in the range 0 < ∆r < a − 1. new configurations that correspond to overlaps are rejected. 
this disordering phase is crucial to avoid particles falling back again into the same positions. moreover, the upper limit for ∆r (i.e., a − 1) is deliberately chosen such that a larger tap promotes larger random changes in the particle positions. for each value of a studied, 10^3 taps are carried out for equilibration, followed by 5 × 10^3 taps for production.
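a compact sketch of one such tap, written by us from the description above, is given below; the isotropic direction of the random displacement and the treatment of the walls are assumptions of the sketch, since the text only specifies the length of the displacement and the rejection of overlapping configurations.

import numpy as np

def tap(positions, diameter, box_width, a=1.1, mc_loops=20, rng=None):
    # one simulated tap: vertical expansion by the factor a, followed by a short
    # monte carlo disordering phase with displacements of length dr in (0, a - 1)
    rng = np.random.default_rng() if rng is None else rng
    pos = positions.copy()
    pos[:, 1] *= a                                   # expand the packing vertically
    radius = diameter / 2.0
    for _ in range(mc_loops):
        for i in rng.permutation(len(pos)):
            step = rng.uniform(0.0, a - 1.0)         # random displacement length
            angle = rng.uniform(0.0, 2.0 * np.pi)    # assumed isotropic direction
            trial = pos[i] + step * np.array([np.cos(angle), np.sin(angle)])
            if not (radius <= trial[0] <= box_width - radius and trial[1] >= radius):
                continue                             # keep the disk inside the container (assumption)
            d2 = np.sum((pos - trial) ** 2, axis=1)
            d2[i] = np.inf                           # ignore the moving disk itself
            if np.all(d2 >= diameter ** 2):          # reject configurations with overlaps
                pos[i] = trial
    return pos                                       # the pseudo-dynamic deposition is then re-run

# usage (lengths in units of the disk diameter d = 1, box width 24.78d as in the text):
# shaken = tap(positions, diameter=1.0, box_width=24.78, a=1.1)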
we choose two particular arrangements for the sampling cells, i.e., circular ring sectors centered at the base, the maximum radius for the ring being five times the width of the base, and radial angular sectors starting at the center of the base, where the opening of the sector is ∆θ, with the inclination θ measured from the horizontal axis. figure 2 (a) and (b) illustrate the cases. on the other hand, and to especially focus on the region near the outlet, we define in addition three different cell configurations. they are shown in fig. 2 (c)-(e). the details are depicted in the figure caption. in all the sampling cell analysis, the criterion for density calculation was the same. the particles whose centers fall into a given sector are counted, and that number is then divided by the sector area, 070007-4 papers in physics, vol. 7, art. 070007 (2015) / r. o. uñac et al. figure 2: sketch of the geometry of the cells used to measure the density of particles inside the silo. (a) circular ring sectors of thickness ρ, centered at the middle point of the base. (b) radial angular sectors starting at the center of the base, spanning all the packing. (c) sectors as those in part (a) but delimited by two symmetrical lines at 60◦ respect to the horizontal and a maximum radius of half width of the base. (d) sectors as in part (b) but limited by a semicircle with radius of half width of the base. (e) column of rectangular sampling sectors, with height ρ, and width equal to the outlet size. giving thus the particle density in the sector. we measure the density corresponding to the initial packing and for the packing array after the discharge for each one of the independent configurations and obtain the mean values of the density over those configurations presenting clogging. this was repeated for for all the geometries used. v. results and discussion in fig. 3 we present the results for the particle density as a function of the distance of the ring to the center (see fig. 2 (a)). the thickness of each ring is equal to 1.61 a.u., i.e., a particle diameter. in part (a) of the figure we plot the behavior of the density for the initial packings, before the discharge of the silo is performed. as expected, a constant behavior is observed as the radius increases. as also reported elsewhere [28], the average packing density is larger for a = 1.1 than for a = 1.5 or 2.0. the oscilations observed for a = 1.1 are due to the presence of order in the packing structure [24]. this order is virtually absent for higher tapping intensities. the decrease of the density at small distances from the center, especially for a = 1.1 and 2.0, is a purely 0 20 40 60 80 100 0.0 0.1 0.2 0.3 0.4 0.5 0.1 0.2 0.3 0.4 0.5 (b) r d is ks d en si ty (a) figure 3: particle density vs. the distance of the ring to the center, using the cell depicted in fig. 2(a). here ρ = d. (a) initial packing behavior. (b) final stage after discharge. a = 1.1 (red squares), 1.5 (blue up triangles) and 2.0 (green circles). geometric effect related to the small area covered by the smaller rings and the particular disposition of the disks at the base. on the other hand, the decrease observed at large distances for a = 1.1 is because the initial arrangement of particles presents a free surface which is tilted. after the discharge (fig. 3 (b)), the disk density steeply drops inside the region close to the outlet, putting in evidence the size of the concave hole formed after the avalanche. besides, the density slightly falls with height, for all amplitudes. 
to highlight this effect, we draw three lines corresponding to the mean densities at the initial state (fig. 3 (a)). this effect is due to the lower density of packing near the walls and it will be explained later. here again, the oscillations for a = 1.1 are associated to the order in the packing structure. a similar analysis can be done taking a different thickness for the rings, giving results qualitatively equal to the ones shown in fig. 3. in fig. 4 we present the results for particle density as a function of the inclination angle of the sector respect to the horizontal for three values of a. part (a) corresponds to the initial packing structure and part (b) shows the state after discharge. the opening angle in each sector is 1◦. the horizontal axis represents the angle of the most inclined side of the sector, as indicated in fig. 2 (b). the peaks for a = 1.1 in both plots are associated with the 070007-5 papers in physics, vol. 7, art. 070007 (2015) / r. o. uñac et al. 0 20 40 60 80 100 120 140 160 180 0.2 0.3 0.4 0.5 0.6 0.3 0.4 0.5 0.6 (b) (º) d is ks d en si ty (a) figure 4: particle density vs. the inclination angle of the sector respect to the horizontal, for three values of a. (a) initial packing behavior. (b) final stage after discharge. a = 1.1 (red squares), 1.5 (blue up triangles) and 2.0 (green circles). the three horizontal lines indicate the initial average densities for the three amplitudes. ordered structure, showing important increments of the density for 60◦ and 120◦ and, less important, for 30◦ and 150◦. when increasing the angular sector width, ∆θ, peak amplitude decreases, virtually disappearing, keeping the qualitative behavior shown in fig. 4. the average density for a = 1.1 is slightly higher, as expected. after discharge, a dip at the central part of the curves appears as an evident consequence of the created hole, and showing the presence of a central region around the hole in which the density is reduced. this reduction is more pronounced for a = 1.1 because the mean avalanche size for that tapping intensity is greater [12] and the affected region is approximately between 60◦ and 120◦ for all amplitudes. at the bottom part of the figure, we plotted three lines indicating the initial average densities for the three amplitudes to better visualize the density change. although useful to have an overview for density behavior, information is lost when averaging over sectors spanning all packing configuration. for that reason, and to analyze in more detail the region close to the outlet of the bidimensional silo, we implement three new arrangements for the sectors to perform the density analysis, as indicated in fig. 2 (c)-(e). fig. 2 (c) shows a scheme for sectors similar to those in part (a) but delimited by two symmetrical lines at 60◦ respect to the horizontal and a maximum radius of 20 a.u. (half-width of the container). part (d) sketches the same sectors as in part (b) but limited by a semicircle with radius 20 a.u.. finally, part (e) in the same figure, shows a column of rectangular sampling sectors, each one with height ρ, and width equal to the outlet size. in fig. 5 we present the results for a = 1.1 for the density averaged over sectors as in fig. 2 (d). the upper part shows the density for the initial packing structure as a function of the inclination angle of the sector with ∆θ = 1◦ (squares) and ∆θ = 10◦ (circles). for comparison, the middle part of the figure also shows the initial density but for the sectors without the boundary semicircle. 
by circumscribing the density calculation to a semicircle centered at the outlet and whose radius is the half-width of the system, the ordered structure of the bottom part of the initial packing is more evident (peak occurrence) inside sectors from 60◦ to 120◦. analyzing the final state (bottom part of fig. 5), a visible decrease in density around the outlet and also a large disorder is observed, especially between 60◦ and 120◦. outside this range, the structure of the packing seems virtually unchanged, thus revealing the non-avalanche area. as before, the lines indicate the initial average density for the packing. in fig. 6 we show the results for a = 1.5 with ∆θ = 1◦ (squares) and ∆θ = 10◦ (circles) . it is clear from part (a) of the figure that density oscillations are much smaller than those for a = 1.1, even for ∆θ = 1◦. this proves the lack of order in the packing structure, except for those sectors close to the base of the packing. after discharge, the density does not change substantially throughout the bulk, except for a slight decrease in the vicinity of the outlet orifice. this is shown in fig. 6 (b) and it is more evident for ∆θ = 10◦. a slight density modulation can also be observed. as a partial conclusion, we can say that the density of the packing after discharge retains the look of the original grain disposition, which is consistent with the disorder obtained for that tapping amplitude, i.e., lowering of the packing fraction [12]. the results and conclusions for a = 2.0 are quite similar to those for a = 1.5. in fig. 7 the results for density are plotted av070007-6 papers in physics, vol. 7, art. 070007 (2015) / r. o. uñac et al. 0.3 0.4 0.5 0.6 0 20 40 60 80 100 120 140 160 180 0.2 0.3 0.4 0.5 0.6 0.3 0.4 0.5 0.6 (a) (c) (º) d is ks d en si ty (b) figure 5: density average vs. inclination angle for a = 1.1. squares correspond to ∆θ = 1◦ and circles for ∆θ = 10◦. upper: initial packing structure. middle: initial density but for the sectors without the boundary semicircle, for comparison. bottom: final state after the discharge for the system in the upper part. the horizontal line indicates the initial average density for ∆θ = 10◦. eraging over sectors as in fig. 2 (c), for a = 1.1, 1.5 and 2.0, and ρ = 2 a.u.. as in previous cases, the initial packing shows the periodicity related to the ordered structure for small ρ (not shown here) and small a. as ρ increases, fluctuations become smaller and a practically constant value for packing density can be observed as a function of the distance to the bottom center of the silo. after discharge (fig. 7 (b)), a sudden decrease in density is observed up to a height equivalent to the estimated beverloo’s diameter, i.e., in our present case, 4 a.u. two regions can be distinguished. one corresponding to a radius of 2 a.u., where the density is zero, and the other, where the density increases rapidly, almost reaching the initial bulk value. for a = 1.5, the fluctuations are only present for small r, near the base. after discharge, the 0 20 40 60 80 100 120 140 160 180 0.2 0.3 0.4 0.5 0.6 0.3 0.4 0.5 0.6 (b) (º) d is ks d en si ty (a) figure 6: density vs. inclination angle as in fig. 5 (a) and (c), but for a = 1.5, ∆θ = 1◦ (squares) and ∆θ = 10◦. the horizontal line indicates the initial average density for ∆θ = 10◦. behavior is similar to the case a = 1.1, but with the presence of much less fluctuations. for all amplitudes, the depression in density is confined inside the cone subtended by the lines at 60◦. compare fig. 7 with fig. 
4 to see that the sectors corresponding to the stagnant zone keep the initial disk density. a similar result is obtained for a = 2.0. finally, we performed the density analysis on the column made by sectors with a given thickness ρ, as indicated in fig. 2 (e). figure 8 shows the results for a = 1.1, 1.5 and 2.0, and ρ = 2 a.u.. in part (a) of the figure, which corresponds to the initial packing structure, the density of disks remains constant with height for different a. fluctuations are again related to the ordered structure and decreases for greater a. in the case a = 1.5, as known, the initial configuration is more disordered than for a = 1.1 (less fluctuations even for small ρ). the mean density of the packing coincides with the corresponding values in previous figures. 070007-7 papers in physics, vol. 7, art. 070007 (2015) / r. o. uñac et al. 0 2 4 6 8 10 12 14 16 18 20 0.0 0.2 0.4 0.6 0.2 0.4 0.6 (b) r d is ks d en si ty (a) figure 7: results for the density of particles averaged over sectors as in fig. 2 (c), for a = 1.1, 1.5 and 2.0, and ρ = 2 a.u.. (a) initial packing. (b) after the discharge. a = 1.1 (red squares), 1.5 (blue up triangles) and 2.0 (green circles). the three horizontal lines indicate the initial average densities for the three amplitudes. after the avalanche (fig. 8 (b)), a change in density is observed up to a height of about 25 to 30 a.u.. up to the order of 40 a.u., fluctuations decrease significantly, while for greater heights (far from the outlet) only a slight decrease of those fluctuations is observed. this is probably related to the formation of arches in the structure that prevents the occurrence of internal avalanches at greater heights. figure 1 and fig. 9 illustrate this point. those figures show the snapshots before and after the discharge for a = 1.1 and a = 1.5, respectively. there, the arches formed by the particles are indicated with thick segments. in parts (b), a loose packing with the presence of arches is the resulting structure after the avalanche [12]. the density drop in fig. 8 (b) coincides with that in fig. 7 (b) and its extent is related to the region mainly affected by the discharge, which is of the 0 10 20 30 40 50 60 70 80 90 100 0.0 0.2 0.4 0.6 0.2 0.4 0.6 (b) h d is ks d en si ty (a) figure 8: density results on the column vs. height h belonging to sectors with thickness ρ = 2 a.u. (1.24d). a = 1.1 (red squares), 1.5 (blue up triangles) and 2.0 (green circles). the three horizontal lines indicate the initial average densities for the three amplitudes. order of one beverloo’s diameter (4 a.u.). besides, a decrease of about 5% to 10% in the average bulk density is observed, in coincidence with fig. 3 (b). the analysis of the column for a = 1.05 presents some interesting features comparing to the preceding results. figure 10 shows the results for that amplitude. first, we observe that the density fluctuations vs. height in the arrangements (both initial and final) are higher, reinforcing the fact that fluctuations increase with decreasing a. this is related with the higher order structure. on the other hand, up to a height of 25 to 30 a.u. (which involves not only the empty vault area but the rarefaction caused by the avalanche), the density does not fluctuate for a = 1.5 and 2.0, while it does fluctuate for a = 1.05 and 1.1. this is because these latter arrangements are more compact and ordered, allowing propagation of the decreased density with a certain periodicity (see fig. 1). as 070007-8 papers in physics, vol. 7, art. 
070007 (2015) / r. o. uñac et al. ( b )( a )(a) (b) figure 9: snapshots of the particles inside the silo, (a) before the discharge and (b) after it. the case corresponds to a = 1.5 and d = 2.5. thin blue lines indicate contacts between particles and thick red lines indicate the arches. before, the radius of the empty vault coincides with the beverloo assumption. it is noteworthy that the decrease in the average density in the bulk after the discharge is only noticed when analyzing the data in a column or in the circular sectors of fig. 2 (a), not in the other cases. this would indicate that the main contribution to the avalanche is done by the particles just above the hole. vi. conclusions according to the results shown so far, it can be said that there is a clear area of rarefaction in the packing column after the avalanche discharge of the silo. this area is focused on the outlet opening. this lower density region has been analyzed with dif0 10 20 30 40 50 60 70 80 90 100 0.0 0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0 (b) h d is ks d en si ty (a) figure 10: density results as in fig. 8 but a = 1.05. the horizontal line indicates the initial average density. ferent geometries that provide a structural macroscopic view of it. the presence of arches before and after the discharge allows relating the packing structure with the size of the density depression [12]. the higher the order in the initial structure, the greater the extent of the rarefaction area after discharge (fig. 7 (b) and fig. 8 (b)), i.e., a more open (less ordered) initial structure induces a less spreading of the lower density region. regarding the shape of the rarefaction region, the decrease in density is more pronounced upward and sideways to an angle of 60◦ (120◦) (which defines the stagnation zone). this is evidenced by comparing the density profiles for the columns (fig. 8 (b)) and those for the circular sectors limited by straight lines at 60◦ and 120◦ (fig. 7 (b)) with respect to the case shown in fig. 5 (c). after discharge in the case of fig. 7 (b), a sudden decrease in density is observed up to a height equivalent to the estimated beverloo’s diameter. two regions can be distinguished: one corresponding to 070007-9 papers in physics, vol. 7, art. 070007 (2015) / r. o. uñac et al. a radius of 2 a.u., where the density is zero, and the other where the density increases rapidly, almost reaching the initial bulk value. the results for the case d = 2.75 are qualitatively the same as the one analyzed above, for that reason they are not presented here. it is important to remember that here we consider a monosized distribution. taking into account that in real applications the size distribution of particles is usually not monodisperse, it is a future challenge to see how our present results are modified when considering a given dispersion in the size distribution of particles. acknowledgements this work was supported by conicet (argentina) through grant pip 353 and by the secretary of science and technology of universidad nacional de san luis, grant p-3-10114. [1] a w jenike, gravity flow of bulk solids, university of utah engineering experiment station, bull. 108 (1961), and bulletin 123 (1964). [2] w a beverloo, h a leniger, j van de velde, the flow of granular solids through orifices, chem. eng. sci. 15, 260 (1961). [3] j wu, j binbo, j chen, y yang, multi-scale study of particle flow in silos, advanced pow. tech. 20, 62 (2009). [4] g h ristow, outflow rate and wall stress for two-dimensional hoppers, phys. 
a 235, 319 (1997). [5] k to, p-y lai, jamming pattern in a twodimensional hopper, phys. rev. e 66, 011308 (2002). [6] s-c yang, s-s hsiau, the simulation and experimental study of granular materials discharged from a silo with the placement of inserts, pow. tech. 120, 244 (2001). [7] a garcimart́ın, i zuriguel, l a pugnaloni, a janda, shape of jamming arches in twodimensional deposits of granular materials, phys. rev. e 82, 031306 (2010). [8] r o uñac, o a benegas, a m vidales and i ippolito, experimental study of discharge rate fluctuations in a silo with different hopper geometries, pow. tech. 225, 214 (2012). [9] r l brown, j c richards, profile of flow of granulates through apertures, trans. inst. chem. eng. 38, 243 (1960). [10] p w cleary, m l sawley, dem modelling of industrial granular flows: 3d case studies and the effect of particle shape on hopper discharge, appl. math. model. 26, 89 (2002). [11] a anand, j s curtis, c r wassgren, b c hancock, w r ketterhagen, predicting discharge dynamics from a rectangular hopper using the discrete element method (dem), chem. eng. sci. 63, 5821 (2008). [12] r o uñac, a m vidales and l. a. pugnaloni, the effect of packing fraction on the jamming of granular flow through small apertures, j. stat. mech. p4008 (2012). [13] c mankoc, a janda, r arévalo, j m pastor, i zuriguel, a garcimart́ın, d maza, the flow rate of granular materials through an orifice, gran. matt. 9, 407 (2007). [14] a p huntington, n m rooney, chemical engineering tripos part 2, research project report, university of cambridge (1971). [15] f c franklin, l n johanson, flow of granular material through a circular orifice, chem. eng. sci. 4, 119 (1955). [16] s humby, u tüzün, a b yu, prediction of hopper discharge rates of binary granular mixtures, chem. eng. sci. 53, 483 (1998). [17] a mehta, g c barker, j m luck, cooperativity in sandpiles: statistics of bridge geometries, j. stat. mech. p10014 (2004). [18] g h ristow, h j herrmann, density patterns in two-dimensional hoppers, phys. rev. e 50, r5 (1994). [19] a medina, j a córdova, e luna, c treviño, velocity field measurements in granular gravity flow in a near 2d silo, phys. lett. a 250 111 (1998). 070007-10 papers in physics, vol. 7, art. 070007 (2015) / r. o. uñac et al. [20] l babout, k. grudzien, e maire, p j withers, influence of wall roughness and packing density on stagnant zone formation during funnel flow discharge from a silo: an x-ray imaging study, chem. eng. sci. 97, 210 (2013). [21] ss manna, d v khakhar, internal avalanches in a granular medium, phys. rev. e 58, r6935 (1998). [22] manna s s, self-organization in a granular medium by internal avalanches, phase transit. 75, 529 (2002). [23] r o uñac, j g benito, a m vidales, l a pugnaloni, arching during the segregation of twodimensional tapped granular systems: mixtures versus intruders, eur. phys. j.e 37, 117 (2014). [24] l a pugnaloni, m g valluzzi and l g valluzzi, arching in tapped deposits of hard disks, phys. rev. e 73, 051302 (2006). [25] a kudrolli, size separation in vibrated granular material, rep. prog. phys. 67, 209 (2004). [26] j duran, j rajchenbach, e clément, arching effect model for particle size segregation, phys. rev. lett. 70, 2431 (1993). [27] j duran, t mazozi, e clément, j rajchenbach, size segregation in a two-dimensional sandpile: convection and arching effects, phys.rev. e 50, 5138 (1994). [28] r o uñac, a m vidales, l a pugnaloni, simple model for wet granular beds subjected to tapping, gran. matt. 11, 371 (2009). 
[29] p ribière, p richard, p philippe, d bideau, r delannay, on the existence of stationary states during granular compaction, eur. phys. j. e 22, 249 (2007). 070007-11 papers in physics, vol. 9, art. 090005 (2017) received: 31 march 2017, accepted: 06 june 2017 edited by: d. domı́nguez reviewed by: a. feiguin, northeastern university, boston, united states. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.090005 www.papersinphysics.org issn 1852-4249 an efficient impurity-solver for the dynamical mean field theory algorithm y. núñez fernández,1∗ k. hallberg1 one of the most reliable and widely used methods to calculate electronic structure of strongly correlated models is the dynamical mean field theory (dmft) developed over two decades ago. it is a non-perturbative algorithm which, in its simplest version, takes into account strong local interactions by mapping the original lattice model on to a single impurity model. this model has to be solved using some many-body technique. several methods have been used, the most reliable and promising of which is the density matrix renormalization technique. in this paper, we present an optimized implementation of this method based on using the star geometry and correction-vector algorithms to solve the related impurity hamiltonian and obtain dynamical properties on the real frequency axis. we show results for the half-filled and doped one-band hubbard models on a square lattice. i. introduction materials with strongly correlated electrons have attracted researchers in the last decades. the fact that most of them show interesting emergent phenomena like superconductivity, ferroelectricity, magnetism, metal-insulator transitions, among other properties, has triggered a great deal of research. the presence of strongly interacting local orbitals that causes strong interactions among electrons makes these materials very difficult to treat theoretically. very successful methods to calculate electronic structure of weakly correlated materials, such as the density functional theory (dft) [1], lead to wrong results when used in some of these systems. the dft-based local density approxima∗e-mail: yurielnf@gmail.com 1 centro atómico bariloche and instituto balseiro, cnea, conicet, avda. e. bustillo 9500, 8400 san carlos de bariloche, ŕıo negro, argentina tion (lda) [2] and its generalizations are unable to describe accurately the strong electron correlations. also, other analytical methods based on perturbations are no longer valid in this case so other methods had to be envisaged and developed. more than two decades ago, the dynamical mean field theory (dmft) was developed to study these materials. this method and its successive improvements [3–8] have been successful in incorporating the electronic correlations and more reliable calculations were done. the combination of the dmft with lda allowed for band structure calculations of a large variety of correlated materials (for reviews, see refs. [9, 10]), where the dmft accounts more reliably for the local correlations [11, 12]. the dmft relies on the mapping of the correlated lattice onto an interacting impurity for which the fermionic environment has to be determined self-consistently until convergence of the local green’s function and the local self-energy is reached. this approach is exact for the infinitely 090005-1 papers in physics, vol. 9, art. 090005 (2017) / y. núñez fernández et al. 
coordinated system (infinite dimensions), for the non-interacting model and in the atomic limit. therefore, the possibility of obtaining reliable dmft solutions of lattice hamiltonians relies directly on the ability to solve (complex) quantum impurity models.

since the development of the dmft, several quantum impurity solvers have been proposed and used successfully; among these, we can mention the iterated perturbation theory (ipt) [13, 14], exact diagonalization (ed) [15], the hirsch-fye quantum monte carlo (hfqmc) [16], the continuous-time quantum monte carlo (ctqmc) [17–20], non-crossing approximations (nca) [21], and the numerical renormalization group (nrg) [22, 23]. all of these methods imply certain approximations; for a more detailed description, see [24]. some years ago, we proposed the density matrix renormalization group (dmrg) as a reliable impurity solver [25–27], which allows one to surmount some of the problems existing in other solvers, giving, for example, the possibility of calculating dynamical properties directly on the real frequency axis. other related methods followed, such as those in [28, 29]. in this way, more accurate results can be obtained than, for example, with algorithms based on monte carlo techniques. the scope of this paper is to detail the implementation of this method and to show recent applications and potential uses.

ii. dmft in the square lattice

we will consider the hubbard model on a square lattice:

$$H = t \sum_{\langle ij \rangle \sigma} c^{\dagger}_{i\sigma} c_{j\sigma} + U \sum_{i} n_{i\uparrow} n_{i\downarrow} - \mu \sum_{i} n_{i}, \qquad (1)$$

where $c_{i\sigma}$ ($c^{\dagger}_{i\sigma}$) annihilates (creates) an electron with spin $\sigma = \uparrow,\downarrow$ at site $i$, $n_{i\sigma} = c^{\dagger}_{i\sigma} c_{i\sigma}$ is the density operator, $n_i = n_{i\downarrow} + n_{i\uparrow}$, $U$ is the coulomb repulsion, $\mu$ is the chemical potential, and $\langle ij \rangle$ denotes nearest-neighbor sites. changing to the bloch basis $d^{\dagger}_{k}$, the non-interacting part becomes

$$H_0 = \sum_{k,\sigma} t(k)\, d^{\dagger}_{k\sigma} d_{k\sigma}, \qquad (2)$$

with $t(k) = 2t(\cos k_x + \cos k_y) - \mu$. the green's function for (1) is hence given by

$$G(k,\omega) = \left[\omega - t(k) - \Sigma(k,\omega)\right]^{-1}, \qquad (3)$$

where $\Sigma(k,\omega)$ is the self-energy. the dmft makes a local approximation of $\Sigma(k,\omega)$, that is, $\Sigma(k,\omega) \approx \Sigma(\omega)$. this locality of the magnitudes allows us to map the lattice problem onto an auxiliary impurity problem that has the same local magnitudes $G(\omega)$ and $\Sigma(\omega)$. the impurity is coupled to a non-interacting bath, which should be determined iteratively. the hamiltonian can be written as

$$H_{imp} = H_{loc} + H_{b}, \qquad (4)$$

where $H_{loc}$ is the local part of (1),

$$H_{loc} = -\mu\, n_0 + U\, n_{0\uparrow} n_{0\downarrow}, \qquad (5)$$

and the non-interacting part $H_b$ representing the bath is

$$H_b = \sum_{i\sigma} \lambda_i\, b^{\dagger}_{i\sigma} b_{i\sigma} + \sum_{i\sigma} v_i \left[ b^{\dagger}_{i\sigma} c_{0\sigma} + \mathrm{h.c.} \right], \qquad (6)$$

where $b^{\dagger}_{i\sigma}$ represents the creation operator for bath site $i$ and spin $\sigma$, and the label "0" corresponds to the interacting site. the algorithm is summarized as follows:

(i) start with $\Sigma(\omega) = 0$.

(ii) calculate the green's function for the local interacting lattice site:

$$G(\omega) = \frac{1}{N}\sum_{k} G(k,\omega) = \frac{1}{N}\sum_{k} \left[\omega - t(k) - \Sigma(\omega)\right]^{-1}. \qquad (7)$$

(iii) calculate the hybridization

$$\Gamma(\omega) = \omega + \mu - \Sigma(\omega) - \left[G(\omega)\right]^{-1}. \qquad (8)$$

(iv) find a hamiltonian representation $H_{imp}$ with hybridization $\Gamma_d(\omega)$ that approximates $\Gamma(\omega)$. the hybridization $\Gamma_d(\omega)$ is characterized by the parameters $v_i$ and $\lambda_i$ of $H_{imp}$ through

$$\Gamma_d(\omega) = \sum_{i} \frac{v_i^2}{\omega - \lambda_i}. \qquad (9)$$

(v) calculate the green's function $G_{imp}(\omega)$ at the impurity of the hamiltonian $H_{imp}$ using dmrg.

(vi) obtain the self-energy

$$\Sigma(\omega) = \omega + \mu - \left[G_{imp}(\omega)\right]^{-1} - \Gamma_d(\omega). \qquad (10)$$

return to (ii) until convergence.

at step (iv), we should find the parameters $v_i$ and $\lambda_i$ by fitting the calculated hybridization $\Gamma(\omega)$ with expression (9).
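to make the self-consistency loop (i)–(vi) concrete, the following python/numpy sketch spells out one possible implementation under strong simplifications: the k sum of eq. (7) is taken over a finite grid of the square-lattice brillouin zone with a small imaginary part added to ω, and the bath fit of step (iv) and the impurity solver of step (v) are passed in as placeholder functions (in the paper, the former is a fit of eq. (9) and the latter is the dmrg correction-vector solver discussed below). all function and variable names here are hypothetical, not taken from the authors' code.

```python
import numpy as np

def local_green(omega, sigma, t=0.25, mu=0.0, nk=64, eta=0.1):
    """eq. (7): G(w) = (1/N) sum_k [w + i*eta - t(k) - Sigma(w)]^(-1),
    evaluated on an nk x nk grid; t = 0.25 sets the unit of energy and
    eta is a small broadening added for numerical convenience."""
    k = 2 * np.pi * np.arange(nk) / nk
    kx, ky = np.meshgrid(k, k)
    tk = 2 * t * (np.cos(kx) + np.cos(ky)) - mu          # dispersion, eq. (2)
    z = omega[:, None, None] + 1j * eta - tk[None, :, :] - sigma[:, None, None]
    return (1.0 / z).mean(axis=(1, 2))

def hybridization(omega, sigma, g, mu=0.0):
    """eq. (8): Gamma(w) = w + mu - Sigma(w) - 1/G(w)."""
    return omega + mu - sigma - 1.0 / g

def dmft_loop(omega, fit_bath, solve_impurity, mu=0.0, tol=1e-4, max_iter=50):
    """steps (i)-(vi); fit_bath and solve_impurity are placeholders for the
    fit of eq. (9) and for the impurity solver (dmrg in the paper)."""
    sigma = np.zeros_like(omega, dtype=complex)          # (i)
    for _ in range(max_iter):
        g = local_green(omega, sigma, mu=mu)             # (ii)
        gamma = hybridization(omega, sigma, g, mu=mu)    # (iii)
        v, lam = fit_bath(omega, gamma)                  # (iv), eq. (9)
        gamma_d = np.sum(v**2 / (omega[:, None] + 1j * 1e-2 - lam), axis=1)
        g_imp = solve_impurity(omega, v, lam)            # (v)
        sigma_new = omega + mu - 1.0 / g_imp - gamma_d   # (vi), eq. (10)
        if np.max(np.abs(sigma_new - sigma)) < tol:
            break
        sigma = sigma_new
    return sigma_new, g_imp
```

the small imaginary shift used when evaluating eq. (9) in the sketch plays the same role as the lorentzian broadening η discussed below.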
at half-filling, because of the electron-hole symmetry, we have γ(ω) = γ(−ω) and hence λ−i = −λi, and v−i = vi, where the bath index i goes from −p to p, and it does not include i = 0 for an even number of bath sites 2p. almost all of the computational time is spent at step (v), where the dynamics of a single impurity anderson model (siam) (see fig. 1) is calculated. we use the correction-vector for dmrg following [30]. the one-dimensional representation of the problem (needed for a dmrg calculation) is as showed in fig. 1, except that for the spin degree of freedom we duplicate the graph, generating two identical chains, one for each spin. moreover, it should be noticed that this is not a local or shortrange 1d hamiltonian (usually called chain geometry, where the dmrg is supposed to work very well). however, we refer to [31, 32] where strong evidence of better performance of the dmrg for this kind of geometry (star geometry) compared to chain geometry is presented. the correction-vector for dmrg consists of targeting not only the ground state |e0〉 of the system but also the correction-vector |vi〉 associated to the frequency ωi (and its neighborhood), that is: (ωi + iη −himp −e0) |vi〉 = c † 0 |e0〉 , (11) where a lorentzian broadering η is introduced to deal with the poles of a finite-length siam. for a better matching between the ω windows (with width approximately η), we target the correction vectors of the extremes of the window. once the dmrg is converged, the green’s function is evaluated for a finer mesh (around 0.2 of the original figure 1: schematic representation of the impurity problem for the dmft. the circles (square) represent the non-interacting (interacting) sites, and the lines correspond to the hoppings. top: star geometry drawn in two ways. bottom: 1d representation as used for dmrg calculations. window) [30]. in this way, a suitable renormalized representation of the operators is obtained to calculate the properties of the excitations around ωi, particularly the green’s function. in what follows, we present results for a paradigmatic correlated model using the method described above. iii. results we have used this method to calculate the density of states (dos) of the hamiltonian (eq. 1) on a square lattice with unit of energy t = 0.25, for several dopings, given by the chemical potential. we consider a discarded weight of 10−11 in the dmrg procedure for which a maximum of around m = 128 states were kept, even for the largest systems (50 sites). for these large systems, the ground state takes around 20 minutes to converge and each frequency window, between 5 and 20 minutes. this is an indication of the good efficiency of the method. the metal-insulator mott’s transition at halffilling is showed in fig. 2. the transition occurs between u = 3 and u = 4. in fig. 3, we observe that the metallic character of the bands remains robust under doping for a given value of the interaction, showing a weight transfer between the bands due to the correlations. the metallic character is also seen in the variation of the filling with µ. 090005-3 papers in physics, vol. 9, art. 090005 (2017) / y. núñez fernández et al. 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 −4 −3 −2 −1 0 1 2 3 4 − 1 /π i m [g (ω + i η )] u=1 u=2 u=3 u=4 −30 −25 −20 −15 −10 −5 0 −4 −3 −2 −1 0 1 2 3 4 im [σ (ω + ι η )] ω u=1 u=2 u=3 u=4 figure 2: top: density of states for u = 1, 2, 3, 4 at half-filling. we use a bath with 30-50 sites per spin and a lorentzian broadening η = 0.12. the fermi energy is located at ω = 0. 
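as a toy illustration of the correction-vector construction of ref. [30] — not of the dmrg procedure itself, nor of the star geometry — the sketch below solves the corresponding linear system for a small dense hermitian matrix standing in for $H_{imp}$, written with the ground-state energy subtracted from the hamiltonian, and checks the result against the lehmann sum over all eigenstates. the matrix, the operator and every name here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# a small random hermitian matrix stands in for the many-body H_imp of a
# tiny siam; A stands in for the operator whose excitations are targeted.
dim = 60
H = rng.normal(size=(dim, dim))
H = 0.5 * (H + H.T)
A = rng.normal(size=(dim, dim))

evals, evecs = np.linalg.eigh(H)
e0, gs = evals[0], evecs[:, 0]

eta = 0.12                         # lorentzian broadening, as in the text
omegas = np.linspace(-4.0, 4.0, 200)

g_cv, g_lehmann = [], []
for w in omegas:
    # correction vector: (w + i*eta - (H - e0)) |v> = A |gs>, cf. eq. (11)
    M = (w + 1j * eta + e0) * np.eye(dim) - H
    v = np.linalg.solve(M, A @ gs)
    g_cv.append(gs.conj() @ A.conj().T @ v)
    # lehmann sum over all eigenstates, for comparison
    amp = evecs.conj().T @ (A @ gs)
    g_lehmann.append(np.sum(np.abs(amp)**2 / (w + 1j * eta - (evals - e0))))

assert np.allclose(g_cv, g_lehmann)    # both routes agree
```

in the actual calculation, the analogous linear system is solved within the renormalized dmrg basis for each frequency window.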
bottom: imaginary part of the self-energy. figure 4 shows our results for a larger value of the interaction u, for which we find a regime of doping having an insulating character. however, for a large enough doping (obtained for a large negative value of the chemical potential), the systems turn metallic and acquire a large density of states at the fermi energy. while the system is insulating, changing the chemical potential only results in a rigid shift of the density of states. the small finite values of the dos at the fermi energy for the insulating cases are due to the lorentzian broadening η, see eq. (11). iv. conclusions we have presented here an efficient algorithm to calculate dynamical properties of correlated sys 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 −3 −2 −1 0 1 2 3 4 − 1 /π i m [g (ω + i η )] ω µ=−0.0 µ=−0.5 µ=−1.0 0.8 0.85 0.9 0.95 1 −1 −0.8 −0.6 −0.4 −0.2 0 fi ll in g µ figure 3: top: density of states for u = 3, same parameters as in fig. 2, and several chemical potentials (µ = 0 corresponds to the half-filled case). bottom: filling vs chemical potential showing a metallic behavior. tems such as the electronic structure for any doping. it is based on the dynamical mean field theory method where we use the density matrix renormalization group (dmrg) as the impurity solver. by using the star geometry for the hybridization function (which reduces the entanglement enhancing the performance of the dmrg for larger bath sizes) together with the correction vector technique(which accurately calculates the dynamical response functions within the dmrg) we were able to obtain reliable real axis response functions, in particular, the density of states, for any doping, for the hubbard model on a square lattice. this improvement will allow for the calculation of dynamical properties on the real energy axis for complex and more realistic correlated systems. 090005-4 papers in physics, vol. 9, art. 090005 (2017) / y. núñez fernández et al. 0 0.1 0.2 0.3 0.4 0.5 −3 −2 −1 0 1 2 3 4 5 − 1 /π i m [g (ω + i η )] ω µ=−0.0 µ=−0.5 µ=−1.0 µ=−1.5 0.8 0.85 0.9 0.95 1 −1.6 −1.4 −1.2 −1 −0.8 −0.6 −0.4 −0.2 0 fi ll in g µ figure 4: top: density of states for u = 4, same parameters as in fig. 2, and several chemical potentials (µ = 0 corresponds to the half-filled case). bottom: filling vs chemical potential showing the transition from a metal to an insulator. acknowledgements we thank daniel garćıa for useful discussions. [1] p hohenberg, w kohn, inhomogeneous electron gas, phys. rev. 136, b864 (1964). [2] r o jones, o gunnarsson, the density functional formalism, its applications and prospects, rev. mod. phys. 61, 689 (1989). [3] g kotliar, d vollhardt, strongly correlated materials: insights from dynamical mean-field theory, physics today 57, 53 (2004). [4] a georges, g kotliar, w krauth, m j rozenberg, dynamical mean-field theory of strongly correlated fermion systems and the limit of infinite dimensions, rev. mod. phys. 68, 13 (1996). [5] g kotliar, s y savrasov, g pálsson, g biroli, cellular dynamical mean field approach to strongly correlated systems, phys. rev. lett. 87, 186401 (2001). [6] t maier, m jarrell, t pruschke, m h hettler, quantum cluster theories, rev. mod. phys. 77, 1027 (2005). [7] m h hettler, a n tahvildar-zadeh, m jarrell, t pruschke, h r krishnamurthy, nonlocal dynamical correlations of strongly interacting electron systems, phys. rev. b 58, r7475 (1998). 
[8] d sénéchal, d perez, m pioro-ladrière, spectral weight of the hubbard model through cluster perturbation theory, phys. rev. lett. 84, 522 (2000). [9] m imada, t miyake, electronic structure calculation by first principles for strongly correlated electron systems, j. phys. soc. jpn. 79, 112001 (2010). [10] k held, electronic structure calculations using dynamical mean-field theory, adv. in phys. 56, 829 (2007). [11] v i anisimov, a i poteryaev, m a korotin, a o anokhin, g kotliar, first-principles calculations of the electronic structure and spectra of strongly correlated systems: dynamical mean-field theory, j. phys. condens. mat. 9, 7359 (1997). [12] a i lichtenstein, m i katsnelson, ab initio calculations of quasiparticle band structure in correlated systems: lda++ approach, phys. rev. b 57, 6884 (1998). [13] a georges, g kotliar, hubbard model in infinite dimensions, phys. rev. b 45, 6479 (1992). [14] m j rozenberg, g kotliar, x y zhang, motthubbard transition in infinite dimensions. ii, phys. rev. b 49, 10181 (1994). 090005-5 papers in physics, vol. 9, art. 090005 (2017) / y. núñez fernández et al. [15] m caffarel, w krauth, exact diagonalization approach to correlated fermions in infinite dimensions: mott transition and superconductivity, phys. rev. lett. 72, 1545 (1994). [16] j e hirsch, r m fye, monte carlo method for magnetic impurities in metals, phys. rev. lett. 56, 2521 (1986). [17] a n rubtsov, v v savkin, a i lichtenstein, continuous-time quantum monte carlo method for fermions, phys. rev. lett. 72, 035122 (2005). [18] p werner, a comanac, l de medici, m troyer, a j millis, continuous-time solver for quantum impurity models, phys. rev. lett. 97, 076405 (2006). [19] h park, k haule, g kotliar, cluster dynamical mean field theory of the mott transition, phys. rev. lett. 101, 186403 (2008). [20] e gull, a j millis, a i lichtenstein, a n rubtsov, m troyer, p werner, continuoustime monte carlo methods for quantum impurity models, rev. mod. phys. 83, 349 (2011). [21] t pruschke, d l cox, m jarrell, hubbard model at infinite dimensions: thermodynamic and transport properties, phys. rev. lett. 47, 3553 (1993). [22] k g wilson, the renormalization group: critical phenomena and the kondo problem, rev. mod. phys. 47, 773 (1975). [23] r bulla, zero temperature metal-insulator transition in the infinite-dimensional hubbard model, phys. rev. lett. 83, 136 (1999); r bulla, a c hewson, t pruschke, numerical renormalization group calculations for the self-energy of the impurity anderson model, j. phys. condens. mat. 10, 8365 (1998). [24] k hallberg, d j garćıa, p cornaglia, j facio, y núñez-fernández, state-of-the-art techniques for calculating spectral functions in models for correlated materials, epl 112, 17001 (2015). [25] d j garćıa, k hallberg, m j rozenberg, dynamical mean field theory with the density matrix renormalization group, phys. rev. lett. 93, 246403 (2004). [26] d j garćıa, e miranda, k hallberg, m j rozenberg, mott transition in the hubbard model away from particle-hole symmetry, phys. rev. b 75, 121102 (2007); e miranda, d j garćıa, k hallberg, m j rozenberg, the metal-insulator transition in the paramagnetic hubbard model, physica b: cond. mat. 403, 1465 (2008); d j garćıa, e miranda, k hallberg, m j rozenberg, metal-insulator transition in correlated systems: a new numerical approach, physica b: cond. mat. 398, 407 (2007); s nishimoto, f gebhard, e jeckelmann, dynamical density-matrix renormalization group for the mott-hubbard insulator in high dimensions, j. phys. 
condens. mat. 16, 7063 (2004); m karski, c raas, g uhrig, electron spectra close to a metal-to-insulator transition, phys. rev. b 72, 113110 (2005); m karski, c raas, g uhrig, single-particle dynamics in the vicinity of the mott-hubbard metal-to-insulator transition, phys. rev. b 75, 075116 (2008); c raas, p grete, g uhrig, emergent collective modes and kinks in electronic dispersions, phys. rev. lett. 102, 076406 (2009). [27] y núñez fernández, d garćıa, k hallberg, the two orbital hubbard model in a square lattice: a dmft + dmrg approach, j. phys.: conf. ser. 568, 042009 (2014). [28] m ganahl et al, efficient dmft impurity solver using real-time dynamics with matrix product states, phys. rev. b 92, 155132 (2015). [29] f wolf, j justiniano, i mcculloch, u schollwöck, spectral functions and time evolution from the chebyshev recursion, phys. rev. b 91, 115144 (2015). [30] t d kühner, s r white, dynamical correlation functions using the density matrix renormalization group, phys. rev. b 60, 335 (1999). 090005-6 papers in physics, vol. 9, art. 090005 (2017) / y. núñez fernández et al. [31] a holzner, a weichselbaum, j von delft, matrix product state approach for a two-lead multilevel anderson impurity model, phys. rev. b 81, 125126 (2010). [32] f alexander wolf, i mcculloch, u schollwöck, solving nonequilibrium dynamical mean-field theory using matrix product states, phys. rev. b 90, 235131 (2014). 090005-7 papers in physics, vol. 7, art. 070012 (2015) received: 20 november 2014, accepted: 29 june 2015 edited by: c. a. condat, g. j. sibona reviewed by: a. de luca, laboratoire de physique theorique, ens & institut philippe meyer, paris, france. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.070012 www.papersinphysics.org issn 1852-4249 role of energy uncertainties in ergodicity breaking induced by competing interactions and disorder. a dynamical assessment through the loschmidt echo. pablo r. zangara,1, 2∗ patricia r. levstein,1, 2 horacio m. pastawski1, 2† a local excitation in a quantum many-particle system evolves deterministically. a timereversal procedure, involving the invertion of the signs of every energy and interaction, should produce an excitation revival: the loschmidt echo (le). if somewhat imperfect, only a fraction of the excitation will refocus. we use such a procedure to show how noninverted weak disorder and interactions, when assisted by the natural reversible dynamics, fully degrade the le. these perturbations enhance diffusion and evenly distribute the excitation throughout the system. such a dynamical paradigm, called ergodicity, breaks down when either the disorder or the interactions are too strong. these extreme regimes give rise to the well known anderson localization and mott insulating phases, where quantum diffusion becomes restricted. accordingly, regardless of the kinetic energy terms, the excitation remains mainly localized and out-of-equilibrium, and the system behaves non-ergodically. the le constitutes a fair dynamical witness for the whole phase diagram since it evidences a surprising topography in which ergodic and non-ergodic phases interpenetrate each other. furthermore, we provide an estimation for the critical lines separating the ergodic and non-ergodic phases around the mott and anderson transitions. the energy uncertainties introduced by disorder and interaction shift these thresholds towards stronger perturbations. 
remarkably, the estimations of the critical lines are in good agreement with the phase diagram derived from the le dynamics. i. introduction according to classical mechanics, a system composed by n particles in d dimensions is described as a point x in a (2dn)-dimensional phase space. if the system is conservative, the energy is the primary conserved quantity, and the phase space is ∗email: zangara@famaf.unc.edu.ar †email: horacio@famaf.unc.edu.ar 1 instituto de f́ısica enrique gaviola (conicet-unc), argentina. 2 facultad de matemática, astronomı́a y f́ısica, universidad nacional de córdoba, 5000 córdoba, argentina. restricted to a hypersurface s of 2dn − 1 dimensions usually called energy shell. fully integrable systems are further constrained, and their solutions turn out to be regular and non-dense periodic orbits contained in s. if integrability is broken, the orbits become irregular and cover s densely. this means that the actual trajectory x(t) will uniformly visit every configuration within s, provided that enough time has elapsed. this last observation embodies the concept of ergodicity : an observable can be equivalently evaluated by averaging it for different configurations in s or by its time-average along a single trajectory x(t). in such a sense, ergodicity sets the equivalence between the gibbs’ description 070012-1 papers in physics, vol. 7, art. 070012 (2015) / p. r. zangara et al. in terms of ensembles and boltzmann kinetic approach to thermostatistics. therefore, the ergodic hypothesis has become the cornerstone of classical statistical mechanics [1, 2]. almost 60 years ago, e. fermi, j. pasta and s. ulam (fpu) [3] tried to study when and how the integrability breakdown can lead to an ergodic behavior within a deterministic evolution. they considered a string of harmonic oscillators perturbed by anharmonic forces in order to verify that these non-linearities can lead to energy equipartition as a manifestation of ergodicity. even though ulam himself stated “the motivation then was to observe the rates of mixing and thermalization...” [4], the results were not the expected ones: “thermalization” dynamics did not show up at all. nowadays, their striking results are well understood in terms of the theory of chaos [5]. in this context, chaos is defined as an exponential sensitivity to changes in the initial condition. in fact, the onset of dynamical chaos [6, 7] can be identified with the transition from non-ergodic to ergodic behavior. therefore, within classical physics, the emergence of ergodicity can be satisfactorily explained [8]. the previous physical picture cannot be directly extended to quantum mechanics. indeed, any closed quantum system involves a discrete energy spectrum and evolves quasi-periodically in the hilbert space, which becomes the quantum analogue to the classical phase space. nevertheless, thermalization and ergodicity in isolated quantum systems could still be defined for a set of relevant observables [9, 10]. since the sensitivity to initial conditions does not apply to quantum systems, the quantum signature of dynamical chaos had to be found as an instability of an evolution towards perturbations in the hamiltonian [11]. because this definition encompasses the classical one, it builds a bridge between classical and quantum chaos. moreover, since it also implies an instability towards perturbations in a time reversal procedure, it can be experimentally evaluated [12] as the amount of excitation recovered or loschmidt echo (le) [13, 15]. 
such a revival is degraded by the presence of uncontrolled environmental degrees of freedom as in the usual picture of decoherence for open quantum systems [14]. strikingly, in closed systems with enough internal complexity, even simple perturbations seem to degrade the le in a time scale given by the reverted dynamics, revealing how a mixing dynamics drives irreversibility [16, 17]. within the last years, a new generation of experiments on relaxation and equilibration dynamics of (almost perfectly) closed quantum many-particle systems has become accessible employing optical lattices loaded with cold atoms [18, 19]. these became the driving force behind the recent theoretical efforts to grasp quantum thermalization [20]. attempting a step beyond the fpu problem, the current aim is to study simple quantum models that could go parametrically from an ergodic to a non-ergodic quantum dynamics. moreover, a fundamental question is whether such a transition occurs as a smooth crossover or a sharp threshold. a promising candidate for these studies would be a system showing many-body localization (mbl) [21, 22]. this dynamical phenomenon occurs when an excitation in a disordered quantum system evolves in presence of interactions. in fact, the mbl is a quantum dynamical phase transition between extended and localized many-body states that results from the competition between interactions [23] and anderson disorder [24]. if the many-body states are extended, then one may expect that the system is ergodic enough to behave as its own heat bath. in such case, single energy eigenstates would yield expectation values for few-body observables that coincide with those evaluated in the microcanonical thermal ensemble [25, 26]. quite on the contrary, if the many-body states are localized, any initial out-of-equilibrium condition would remain almost frozen. in this case, self-thermalization is precluded. therefore, the mbl would evidence the sought threshold between ergodic and non-ergodic behavior. in this article, we address the competition between interactions and disorder in a onedimensional (1d) spin system. it is already known that such models evidence the mbl transition, at least for particular parametric regimes [27–31]. our approach to tackle this problem involves the evaluation of the le, here defined as the amount of a local excitation recovered after an imperfect time reversal procedure. this involves the inversion of the sign of the kinetic energy terms in the many-spin hamiltonian [14]. moreover, the le is evaluated as an autocorrelation function that could become a suitable order parameter [32]. thus, the le is a natural observable that allows us to identify when the ergodicity of an excitation dynamics is broken 070012-2 papers in physics, vol. 7, art. 070012 (2015) / p. r. zangara et al. as interactions and disorder become strong enough. when weak, these “perturbations” favor the excitation spreading, but limit le recovery as they are not reversed. from the actual le time-dependence in the infinite-temperature regime, we extract a dynamical phase diagram that shows a non-trivial interplay between interactions and disorder. the near-zero temperature regime has already been addressed in the literature [33, 34] and there are conjectures about the global topography of the phase diagram [35]. in analogy with this last case, we address the nature of two critical lines that separate the ergodic phase from two different non-ergodic phases: the mott insulator and the mbl phase. 
the appearance of either the mott insulator or the mbl phase can be well estimated in terms of the relevant energy scales. thus, in order to estimate the critical lines, we compute the energy uncertainties that weak disorder and weak interactions would impose on the states involved in the mott transition and in the mbl transition, respectively. quite remarkably, these estimations show good agreement with the dynamical le diagram. our approach allows the identification of ergodic and non-ergodic phases whose non-trivial structure may guide future theoretical and experimental investigations.

ii. loschmidt echo formulation

we consider a 1d spin system that evolves according to a hamiltonian $\hat H = \hat H_0 + \hat\Sigma$. here, $\hat H_0$ stands for a nearest-neighbor xy hamiltonian [36, 37] (in the recent literature on strongly correlated systems within the condensed matter community, the notation xx is employed instead of xy):

$$\hat H_0 = \sum_{i=1}^{N} J \left[ S^x_i S^x_{i+1} + S^y_i S^y_{i+1} \right] \qquad (1)$$
$$\;\;= \sum_{i=1}^{N} \frac{J}{2} \left[ S^-_i S^+_{i+1} + S^+_i S^-_{i+1} \right], \qquad (2)$$

which, because of the periodic boundary conditions, can be thought of as arranged in a ring. here, unless explicitly stated, $N = 12$. notice that $\hat H_0$ can be mapped into two independent non-interacting fermion systems by the wigner-jordan transformation [38]. therefore, it encloses fully integrable single-particle dynamics. the integrability of the model is broken by the ising interactions and the on-site disorder enclosed in $\hat\Sigma$,

$$\hat\Sigma = \sum_{i=1}^{N} \Delta\, S^z_i S^z_{i+1} + \sum_{i=1}^{N} h_i\, S^z_i, \qquad (3)$$

where $\Delta$ is the magnitude of the homogeneous interaction and the $h_i$ are fields randomly distributed in the range $[-W, W]$. in order to enable the comparison with the standard anderson localization literature, we stress that $W$ here is half of the standard strength commonly used for the anderson disorder [39].

the initial out-of-equilibrium condition is given by an infinite-temperature state in which a local excitation (polarization) is injected at site 1:

$$|\psi_{neq}\rangle = |\uparrow_1\rangle \otimes \left[ \sum_{r=1}^{2^{N-1}} \frac{1}{\sqrt{2^{N-1}}}\, e^{i\varphi_r} |\beta_r\rangle \right], \qquad (4)$$

where $\varphi_r$ is a random phase and $\{|\beta_r\rangle\}$ are state vectors in the computational ising basis of the $N-1$ remaining spins. the state defined in eq. (4) is a random superposition over the whole hilbert space, and can successfully mimic the dynamics of ensemble calculations [40]. additionally, notice that $\hat\Sigma$ perturbs the quantum phase of each of the ising states participating in the superposition.

two evolution operators are built from the hamiltonian operators (1) and (3), according to the relative sign between $\hat H_0$ and $\hat\Sigma$. these are $\hat U_+(t_r) = \exp[-\frac{i}{\hbar}(\hat H_0 + \hat\Sigma)\, t_r]$ and $\hat U_-(t_r) = \exp[-\frac{i}{\hbar}(-\hat H_0 + \hat\Sigma)\, t_r]$. in this scenario, the le is defined as the revival of the local polarization at site 1:

$$M_{1,1}(2t) = 2\,\langle \psi_{neq} |\, \hat U^{\dagger}_+(t)\, \hat U^{\dagger}_-(t)\, \hat S^z_1\, \hat U_-(t)\, \hat U_+(t)\, | \psi_{neq} \rangle. \qquad (5)$$

it is important to stress that eq. (5) constitutes, at least in principle, an actual experimental observable of the kind evaluated since the early le experiments [12, 16, 17]; see also ref. [41]. moreover, both the local excitation and its detection are well-established techniques within solid-state nmr [42]. nevertheless, changing the signs of specific hamiltonian terms is a more subtle task. while the standard dipole-dipole interaction can be reverted [43], the planar xy interaction requires much more sophisticated pulse sequences, even for its forward implementation [37].
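for concreteness, a minimal exact-diagonalization sketch of eqs. (1)–(5) is given below, written in python with $\hbar = 1$ and $J$ as the unit of energy, for a hypothetical ring of n = 8 spins (rather than the $N = 12$ of the paper) so that dense matrix exponentials remain cheap; it is meant only to make the forward-backward protocol explicit, not to reproduce the averaging and system sizes used for the phase diagram.

```python
import numpy as np
from scipy.linalg import expm

def spin_ops(n):
    """single-site spin-1/2 operators s^x, s^y, s^z embedded in the n-spin space."""
    s = {"x": 0.5 * np.array([[0, 1], [1, 0]], dtype=complex),
         "y": 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex),
         "z": 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)}
    ops = []
    for i in range(n):
        site = {}
        for name, m in s.items():
            full = np.array([[1.0 + 0j]])
            for j in range(n):
                full = np.kron(full, m if j == i else np.eye(2, dtype=complex))
            site[name] = full
        ops.append(site)
    return ops

def loschmidt_echo(n=8, J=1.0, delta=0.5, W=0.5, t=6.0, seed=0):
    """M_{1,1}(2t) of eq. (5) for one disorder / random-phase realization."""
    rng = np.random.default_rng(seed)
    S = spin_ops(n)
    H0 = sum(J * (S[i]["x"] @ S[(i + 1) % n]["x"]
                  + S[i]["y"] @ S[(i + 1) % n]["y"]) for i in range(n))   # eq. (1)
    h = rng.uniform(-W, W, size=n)
    Sig = sum(delta * S[i]["z"] @ S[(i + 1) % n]["z"]
              + h[i] * S[i]["z"] for i in range(n))                       # eq. (3)
    # initial state of eq. (4): spin 1 up, random phases on the other n-1 spins
    rest = np.exp(1j * rng.uniform(0, 2 * np.pi, 2 ** (n - 1))) / np.sqrt(2 ** (n - 1))
    psi = np.kron(np.array([1.0, 0.0], dtype=complex), rest)
    Up = expm(-1j * (H0 + Sig) * t)        # forward evolution
    Um = expm(-1j * (-H0 + Sig) * t)       # imperfectly reversed evolution
    phi = Um @ (Up @ psi)
    return float(np.real(2 * np.vdot(phi, S[0]["z"] @ phi)))             # eq. (5)

print(loschmidt_echo())
```

the time-averaged quantity $\bar M_{1,1}$ introduced next, eq. (6), can be estimated by sampling this routine on a grid of times below the heisenberg time and averaging, together with the averages over disorder realizations and over the random phases described later in the text; the experimental subtleties of actually reversing $\hat H_0$, discussed above, are of course not captured by such a numerical sketch.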
in particular, the mapping of the xy interaction into a double-quantum hamiltonian, strictly valid in 1d systems [44], could provide a novel approach to the problem of localization as it recently did for 3d systems [45]. this could become a pathway towards an experimental realization much related to the problem considered here. in order to analyze the ergodicity of our observable, we evaluate the mean le, m̄1,1: m̄1,1(t) = 1 t ∫ t 0 m1,1(t)dt. (6) the standard analysis of localization implies the computation of limt→∞ m̄1,1(t). however, since the le is evaluated within a finite system, dynamical recurrences known as mesoscopic echoes show up at the (single-particle) heisenberg time th of ĥ0. as extensively discussed in refs. [36, 38, 46], th can be estimated as th ∼ 2 √ 2n ~ j . (7) this estimation can be interpreted as the time needed by a local excitation to wind around a ring of length l = n × a at an average speed vm/ √ 2, with a maximum group velocity vm = a × 12j/~. since these recurrences are spurious to our analysis of the limiting case n → ∞, we restrict our analysis to t < th . let us provide some specific details that should allow the reproduction of our numerical computation. we evaluate eq. (6) ranging both ∆ and w within the interval [0, 5j], considering increments of 0.2j in both magnitudes. the relevant parameter regions were explored in more detail by employing steps of 0.1j. for each parameter set, 10 realizations of disorder were averaged, each of them with second moment 〈 h2 〉 = w 2/3. for the noninteracting case ∆ = 0, i.e., pure anderson disorder, we computed 500 disorder realizations. with the purpose of keeping the statistical fluctuations negligibly small, an extra average over 10 realizations of |ψneq〉 is performed by tossing the phases ϕr in the whole [0, 2π) range. iii. a dynamical phase diagram 0 1 2 3 4 5 0 1 2 3 4 5 wc( ) c(w) anderson localization many-body localization glassy ergodic in te ra ct io n [j ] disorder w [j] 0.42 0.54 0.65 0.77 0.88 1.0 decoherent figure 1: dynamical phase diagram: m̄1,1(t) level plot at t = 12~/j as a function of the interaction strength ∆ and disorder w . figure 1 displays the dynamical phase diagram for the le. it is given by a level plot of m̄1,1 at t = 12~/j, as function of the interaction ∆ and disorder strength w . within the diagram, five dynamical regions are identified according to the predominant mechanism: decoherent, ergodic, glassy, anderson localization, and many-body localization. if both ∆ and w are weak, the system is almost reversible, the dynamics is controlled by single particle propagations and therefore m̄1,1 remains near 1. this means that despite of the slight phase perturbations, the local excitation can be driven back by the reversal of ĥ0. thus, the parametric region at the bottom left corner may be associated with decoherence, i.e., a sort of spin wave behavior weakly perturbed by the imperfect control of the internal degrees of freedom [37, 47]. if either ∆ or w are further increased, the propagation of a local excitation, ruled by ĥ0, suffers the effects of σ̂ as multiple scattering events with the disordered potential and with other spins. thus, the excitation enters in a diffusive regime where it rapidly spreads all over the spin system. as these scattering processes cannot be undone by the reversal procedure, the spreading becomes irreversible. consistently, this bluish region is associated with an ergodic behavior for the polarization. in this 070012-4 papers in physics, vol. 7, art. 
070012 (2015) / p. r. zangara et al. regime, the polarization becomes evenly distributed within the system, i.e., 2 〈 ŝzj 〉 = 1/n for all j. the ideal limit m̄1,1(t → ∞) → 1/n is verified up to a numerical offset that comes from the transient decay of the le. if w = 0, a strong increase in ∆ leads to a predominance of the ising interaction, which freezes the polarization dynamics. since the quantum diffusion induced by ĥ0 results drastically constrained, m̄1,1 remains trivially high. we interpret such behavior as a glassy dynamics with long relaxation times. in fact, this sort of localization corresponds to the mott insulating phase of an impurity band [23]. additionally, the color contrast around ∆ & 2j suggests that the glassy-ergodic transition remains abrupt even for nonzero disorder (w . 1.0j). this indicates a parameter region where the interaction-disorder competition leads to a sharp transition between the glassy and the ergodic phases. however, since transient phenomena become very slow, a reliable finite size scaling of this regime would require excessively long times to capture how a vitreous dynamics is affected by disorder. a dimensional argument provides a hint on the nature of the critical line that separates the ergodic and glassy phases. in fact, the mott transition typically occurs when the interaction strength ∆ is comparable to the bandwidth b = 2j. such a particular interaction strength is singled out in fig. 1 by a full black circle. adding a weak disorder introduces an energy uncertainty δe on the energy levels that would widen b. in order to estimate it, we resort to its corresponding time scale τ, which in turn can be evaluated according to the fermi golden rule (fgr). with such a purpose, we consider a localized excitation that can “escape” either to its right or to its left side, where two semi-infinite linear chains are symmetrically coupled. then, 1 τ = 2 2π ~ ( w 2 3 ) n1(ε). (8) here, as stated above, w 2/3 stands for second moment of the disorder distribution. the factor 2 accounts for the two alternative decays (right and left). additionally, n1(ε) is the local density of states (ldos) of a semi-infinite linear chain with hopping element j/2, n1(ε) = 2 πj √ 1 − (ε j )2 . (9) the energy levels acquire a lorentzian broadening which, evaluated at the spectral center ε = 0, results δe = ~ 2τ = 4 3 w 2 j √ 1 − (ε j )2∣∣∣∣∣ ε=0 = 4 3 w 2 j . (10) since half of the states lie beyond the range b + 2δe, one may attempt an estimation of the critical line for the mott transition as, ∆c(w) ∼ b + 2δe ∼ 2j + 8 3 w 2 j , (11) which is displayed in fig. 1 as a dashed line. a similar functional dependence as the one discussed here for the glassy-ergodic interphase was conjectured by kimball for the interacting ground state diagram [35]. additionally, it is worthy to mention that a naive expectation about the morphology of the phase diagram with two competing magnitudes would be a semi-circular shape. this is precisely the case of magnetic field and temperature as in the phase diagram of a type i superconductor. thus, one of the highly non-trivial implications of the reentrance of the ergodic phase at large ∆ in our diagram is to debunk such an expectation. if ∆ = 0, the picture for w > 0 is the standard anderson localization problem. here, a reliable estimate of the localization length is only possible when it is smaller than the finite size of the system. 
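the dimensional estimate of eqs. (8)–(11) reduces to a one-line formula. the short sketch below, in units of $J = 1$, evaluates it only to make the quadratic widening of the glassy region explicit; the analogous estimate for the mbl line, eq. (15) further below, follows the same pattern with the roles of disorder and interaction interchanged.

```python
import numpy as np

def mott_critical_line(W, J=1.0):
    """eqs. (8)-(11): fgr broadening dE = (4/3) W^2 / J at the band center,
    and the estimate Delta_c(W) ~ B + 2 dE with bandwidth B = 2 J."""
    dE = 4.0 * W**2 / (3.0 * J)
    return 2.0 * J + 2.0 * dE          # = 2 J + (8/3) W^2 / J, eq. (11)

print(np.round(mott_critical_line(np.linspace(0.0, 1.0, 6)), 3))
```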
not being this the case of very weak disorder, the le degrades smoothly as a function of time with a dynamics that cannot be distinguished from a diffusive one. when the disorder is strong enough, the localization length becomes comparable with the lattice size and thus the initial local excitation does not spread significantly. strictly speaking, while disordered 1d systems are always localized, there are two mechanisms contributing to localization. one of them is the “strong localization”, i.e., the convergence, term by term, of a perturbation theory for the local green’s function. the other is the “weak localization”, originated in the interferences between long perturbation pathways. this last one was an idea conceptually difficult to grasp, both theoretically and numerically, until the appearance of the scaling theory 070012-5 papers in physics, vol. 7, art. 070012 (2015) / p. r. zangara et al. of conductance by the “gang of four” [48]. while weak localization is particularly relevant in 1d and 2d systems, when these have a finite size the dynamics remains diffusive and thus closely assimilable to an ergodic one. in our problem, as soon as ∆ & 0, the many-body interaction increases the effective dimensionality of the available hilbert space, and thus it competes with the anderson localization. regardless of the precise behavior near ∆ = 0, such an interplay between ∆ and w is the responsible for the onset of a localization transition at some wc(∆) > 0, much as in a high dimensional lattice. the ergodic-localized mbl transition can be observed when increasing w for a fixed ∆ > 0. in particular, we notice that localization by disorder is weakened when 1.0j . ∆ . 2.0j, since the ergodic region seems to unfold for larger w . again, since we consider a finite system, our observable describes a smooth crossover from the ergodic to a localized phase, where the excitation does not diffuse considerably. in fact, this corresponds to the actual mbl phase transition [27–31], which is genuinely sharp in the thermodynamic limit. according to eq. (7), t . th ∝ n, and thus increasing n in our simulations (e.g., 10, 12 and 14) enables an integration over a larger time t. in fact, when ∆ ∼ 1.0j, we verified that both sides of the mbl transition m̄1,1 behave as expected from physical grounds. indeed, in the ergodic phase it has the asymptotic behavior m̄1,1 ∼ 1/n, while in the localized phase of strong w it saturates at m̄1,1 ∼ 1/λ, regardless of n. the compatibility with a finite size scaling analysis is confirmed by the fact that ∂m̄1,1/∂w increases with n. however, our accessible range for n is not complete enough to provide for a scaling of m̄1,1(t) that could yield precise critical values for the mbl transition. in analogy to the case of the mott transition, a dimensional argument can be performed to estimate the critical line wc(∆). even though there is no actual phase transition in the 1d non-interacting case ∆ = 0, as the interactions appear we expect them to break down the 1d constrains. thus, we consider as a singular point the high dimensional estimate that occurs when the disorder strength is comparable to the bandwidth [49] wc(∆)|∆=0 = (e/2)b. (12) the particular disorder strength in eq. (12) is indicated in fig. 1 as an open circle, since it does not correspond to an actual critical point of the 1d problem. again, adding interactions would introduce an energy uncertainty that widens the band accordingly. 
in this case, the uncertainty δe is associated to the lifetime introduced by the ising interactions. the corresponding fgr evaluation for such a time-scale is explicitly performed in ref. [50] and it yields: 1 τ = 2 2π ~ ∆2 4 3π2j . (13) as above, the extra 2 factor stands for the contributions of two semi-infinite linear chains. the factor 4/(3π2j) stands for the corresponding ldos evaluated at ε = 0. then, δe = ~ 2τ = 8 3π ∆2 j . (14) this uncertainty adds to the bandwidth and hence it leads to the dimensional estimation of the critical line of the mbl transition, wc(∆) ∼ e 2 (b + 2δe) ∼ 2.71 2 ( 2j + 16 3π ∆2 j ) , (15) which is plotted in fig. 1 as a dashed line starting in the open circle. iv. conclusion we simulated the dynamics of a local loschmidt echo in a spin system in the presence of interactions and disorder, for a wide regime of these competing magnitudes. the computation yields a phase diagram that evidences the parametric region where ergodicity manifests. non-ergodic behaviors were classified and discussed in terms of glassy dynamics, standard anderson localization and the manybody localization. based on the evaluation of energy uncertainties introduced by weak interactions and weak disorder, we estimated the critical lines that separate these phases. the agreement between the estimated critical lines and the le diagram is considerably good. in spite of the fact that the local nature of the le observable constitutes a limitation to perform 070012-6 papers in physics, vol. 7, art. 070012 (2015) / p. r. zangara et al. a reliable finite size scaling procedure, our strategy seems promising to analyze different underlying topologies and different ways of breaking down integrability. last, but not least, in state-of-theart nmr [51, 52], the high temperature correlation functions, like the le, are privileged witnesses for the onset of phase transitions [53] that could hint the appearance of many-body localization [45, 51, 54]. acknowledgements prz and hmp wish to dedicate this paper to the memory of their coauthor patricia rebeca levstein who did not live to see the final version of this paper. we are grateful to a. iucci, a. d. dente and c. bederián for their cooperation at various stages of this work. this work benefited from fruitful discussions with t. giamarchi and comments by f. pastawski. we acknowledge support from conicet, anpcyt, secyt-unc and mincyt-cor. the calculations were done on graphical processing units under an nvidia professor partnership program led by o. reula. [1] j l lebowitz, boltzmann’s entropy and time’s arrow, phys. today 46, 32 (1993). [2] j l lebowitz, statistical mechanics: a selective review of two central issues, rev. mod. phys. supplement 71, 346 (1999). [3] e fermi, j pasta, s ulam, studies of nonlinear problems, lasl report la1940 5, 977 (1955). [4] e fermi, collected pppers: united states 1939-1954, vol. 2, university of chicago press (1965). [5] g p berman, f m izrailev, the fermi–pasta– ulam problem: fifty years of progress, chaos 15, 015104 (2005). [6] b v chirikov, resonance processes in magnetic traps, j. nucl. energy c 1, 253 (1960). [7] f m izrailev, b v chirikov, statistical properties of a nonlinear string, sov. phys. dokl. 11, 30 (1966). [8] g m zaslavsky, chaotic dynamics and the origin of statistical laws, phys. today 52, 39 (1999). [9] s goldstein, j l lebowitz, r tumulka, n zangh̀ı, long-time behavior of macroscopic quantum systems. 
commentary accompanying the english translation of john von neumann's 1929 article on the quantum ergodic theorem, eur. phys. j. h 35, 173 (2010). [10] j von neumann, proof of the ergodic theorem and the h-theorem in quantum mechanics. translation of: beweis des ergodensatzes und des h-theorems in der neuen mechanik, eur. phys. j. h 35, 201 (2010). [11] r a jalabert, h m pastawski, environment-independent decoherence rate in classically chaotic systems, phys. rev. lett. 86, 2490 (2001). [12] p r levstein, g usaj, h m pastawski, attenuation of polarization echoes in nuclear magnetic resonance: a study of the emergence of dynamical irreversibility in many-body quantum systems, j. chem. phys. 108, 2718 (1998). [13] a goussev, r a jalabert, h m pastawski, d wisniacki, loschmidt echo, scholarpedia 7, 11687 (2012). [14] p r zangara, a d dente, p r levstein, h m pastawski, loschmidt echo as a robust decoherence quantifier for many-body systems, phys. rev. a 86, 012322 (2012). [15] ph jacquod, c petitjean, decoherence, entanglement and irreversibility in quantum dynamical systems with few degrees of freedom, adv. in phys. 58, 67 (2009). [16] h m pastawski, p r levstein, g usaj, j raya, j hirschinger, a nuclear magnetic resonance answer to the boltzmann-loschmidt controversy?, physica a 283, 166 (2000). [17] g usaj, h m pastawski, p r levstein, gaussian to exponential crossover in the attenuation of polarization echoes in nmr, mol. phys. 95, 1229 (1998). [18] t kinoshita, t wenger, d s weiss, a quantum newton's cradle, nature 440, 900 (2006). [19] s trotzky, y-a chen, a flesch, i p mcculloch, u schollwöck, j eisert, i bloch, probing the relaxation towards equilibrium in an isolated strongly correlated one-dimensional bose gas, nat. phys. 8, 325 (2012). [20] a polkovnikov, k sengupta, a silva, m vengalattore, colloquium: nonequilibrium dynamics of closed interacting quantum systems, rev. mod. phys. 83, 863 (2011). [21] d m basko, i l aleiner, b l altshuler, metal insulator transition in a weakly interacting many-electron system with localized single-particle states, ann. phys. new york 321, 1126 (2006). [22] i l aleiner, b l altshuler, g v shlyapnikov, a finite-temperature phase transition for disordered weakly interacting bosons in one dimension, nat. phys. 6, 900 (2010). [23] n f mott, metal-insulator transition, rev. mod. phys. 40, 677 (1968). [24] p w anderson, local moments and localized states, rev. mod. phys. 50, 191 (1978). [25] s popescu, a j short, a winter, entanglement and the foundations of statistical mechanics, nat. phys. 2, 754 (2006). [26] m rigol, v dunjko, m olshanii, thermalization and its mechanism for generic isolated quantum systems, nature 452, 854 (2008). [27] v oganesyan, d a huse, localization of interacting fermions at high temperature, phys. rev. b 75, 155111 (2007). [28] m žnidarič, t prosen, p prelovšek, many-body localization in the heisenberg xxz magnet in a random field, phys. rev. b 77, 064426 (2008). [29] a pal, d a huse, many-body localization phase transition, phys. rev. b 82, 174411 (2010). [30] j h bardarson, f pollmann, j e moore, unbounded growth of entanglement in models of many-body localization, phys. rev. lett. 109, 017202 (2012). [31] a de luca, a scardicchio, ergodicity breaking in a model showing many-body localization, europhys. lett. 101, 37003 (2013).
[32] d pekker, g refael, e altman, e demler, v oganesyan, the hilbert-glass transition: new universality of temperature-tuned many-body dynamical quantum criticality, phys. rev. x 4, 011052 (2014). [33] t giamarchi, h j schulz, anderson localization and interactions in one-dimensional metals, phys. rev. b 37, 325 (1988). [34] c a doty, d s fisher, effects of quenched disorder on spin-1/2 quantum xxz chains, phys. rev. b 45, 2167 (1992). [35] j kimball, comments on the interplay between anderson localisation and electron-electron interactions, j. phys. c solid state 14, l1061 (1981). [36] h m pastawski, g usaj, p r levstein, quantum interference phenomena in the local polarization dynamics of mesoscopic systems: an nmr observation, chem. phys. lett. 261, 329 (1996). [37] z l mádi, b brutscher, t schulte-herbrüggen, r brüschweiler, r r ernst, time-resolved observation of spin waves in a linear chain of nuclear spins, chem. phys. lett. 268, 300 (1997). [38] e p danieli, h m pastawski, p r levstein, spin projection chromatography, chem. phys. lett. 384, 306 (2004). [39] b kramer, a mackinnon, localization: theory and experiment, rep. prog. phys. 56, 1469 (1993). [40] g a álvarez, e p danieli, p r levstein, h m pastawski, quantum parallelism as a tool for ensemble spin dynamics calculations, phys. rev. lett. 101, 120503 (2008). [41] s zhang, b h meier, r r ernst, polarization echoes in nmr, phys. rev. lett. 69, 2149 (1992). [42] p cappellaro, implementation of state transfer hamiltonians in spin chains with magnetic resonance techniques, in: quantum state transfer and network engineering, eds. g m nikolopoulos, i jex, pag. 183–222, springer berlin heidelberg (2014). [43] w-k rhim, a pines, j s waugh, time-reversal experiments in dipolar-coupled spin systems, phys. rev. b 3, 684 (1971). [44] e rufeil-fiori, c m sánchez, f y oliva, h m pastawski, p r levstein, effective one-body dynamics in multiple-quantum nmr experiments, phys. rev. a 79, 032324 (2009). [45] g a álvarez, d suter, r kaiser, experimental observation of a phase transition in the evolution of many-body systems with dipolar interactions, arxiv:1409.4562 (2014). [46] h m pastawski, p r levstein, g usaj, quantum dynamical echoes in the spin diffusion in mesoscopic systems, phys. rev. lett. 75, 4310 (1995). [47] l j fernández-alcázar, h m pastawski, decoherent time-dependent transport beyond the landauer-büttiker formulation: a quantum-drift alternative to quantum jumps, phys. rev. a 91, 022117 (2015). [48] e abrahams, p w anderson, d c licciardello, t v ramakrishnan, scaling theory of localization: absence of quantum diffusion in two dimensions, phys. rev. lett. 42, 673 (1979). [49] j m ziman, localization of electrons in ordered and disordered systems ii. bound bands, j. phys. c solid state 2, 1230 (1969). [50] e p danieli, g a álvarez, p r levstein, h m pastawski, quantum dynamical phase transition in a system with many-body interactions, solid state commun. 141, 422 (2007). [51] m b franzoni, p r levstein, manifestations of the absence of spin diffusion in multipulse nmr experiments on diluted dipolar solids, phys. rev. b 72, 235410 (2005). [52] s w morgan, v oganesyan, g s boutis, multispin correlations and pseudothermalization of the transient density matrix in solid-state nmr: free induction decay and magic echo, phys. rev. b 86, 214410 (2012).
[53] j zhang, f m cucchietti, c m chandrashekar, m laforest, c a ryan, m ditty, a hubbard, j k gamble, r laflamme, direct observation of quantum criticality in ising spin chains, phys. rev. a 79, 012305 (2009). [54] g a álvarez, d suter, nmr quantum simulation of localization effects induced by decoherence, phys. rev. lett. 104, 230403 (2010).

papers in physics, vol. 5, art. 050003 (2013) received: 6 april 2013, accepted: 3 june 2013 edited by: g. mindlin licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050003 www.papersinphysics.org issn 1852-4249

invited review: epidemics on social networks

m. n. kuperman1,2∗ ∗e-mail: kuperman@cab.cnea.gov.ar 1 consejo nacional de investigaciones científicas y técnicas, argentina. 2 centro atómico bariloche and instituto balseiro, 8400 s. c. de bariloche, argentina

since its first formulations almost a century ago, mathematical models for disease spreading have contributed to understanding, evaluating and controlling epidemic processes. they promoted a dramatic change in how epidemiologists thought of the propagation of infectious diseases. in the last decade, when the traditional epidemiological models seemed to be exhausted, new types of models were developed. these new models incorporated concepts from graph theory to describe and model the underlying social structure. many of these works merely produced a more detailed extension of the previous results, but some others triggered a completely new paradigm in the mathematical study of epidemic processes. in this review, we will introduce the basic concepts of epidemiology, epidemic modeling and networks, to finally provide a brief description of the most relevant results in the field.

i. introduction

with the development of more precise and powerful tools, the mathematical modeling of infectious diseases has become a crucial tool for making decisions associated with public health policies. the scenario was completely different at the beginning of the last century, when the first mathematical models started to be formulated. the rather myopic comprehension of epidemiological processes was evidenced during the most dramatic epidemiological event of the last century, the 1918 flu pandemic. the lack of a mathematical understanding of the evolution of epidemics gave rise to an inaccurate analysis of the epidemiological situation and a subsequently mistaken assertion of the success of the immunization strategy. during the influenza pandemic of 1892, a viral disease, richard pfeiffer isolated bacteria from the lungs and sputum of patients. he installed, among the medical community, the idea that these bacteria were the cause of influenza. at that moment, the bacterium was called pfeiffer's bacillus or bacillus influenzae, and its present name still keeps a reminiscence of pfeiffer's wrong hypothesis: haemophilus influenzae. though there were some dissenters, the hypothesis linking influenza to this pathogen was widely accepted from then on. among the supporters of pfeiffer's hypothesis was william park, at the new york city health department, who, in view of the fast progression of the flu in the usa, developed a vaccine and antiserum against haemophilus influenzae in october 1918. shortly afterwards, the philadelphia municipal laboratory released thousands of doses of a vaccine constituted by a mix of killed streptococcal, pneumococcal, and h. influenzae bacteria. several other attempts to develop similar vaccines followed this initiative.
however, none of these vaccines prevented viral influenza infection. the present consensus is that they were not even protective against the secondary bacterial infections associated with influenza, because the vaccine developers at that time could not identify, isolate, and produce all the disease-causing strains of bacteria. nevertheless, a wrong evaluation of the evolution of the disease and a lack of epidemiological knowledge led to the conclusion that the vaccine was effective. figure 1: weekly "spanish influenza" death rates (per 10^5) as a function of the week, in baltimore (circles) and san francisco (squares) from 1918 to 1919. data taken from ref. [1]. if we look at fig. 1, corresponding to the weekly influenza death rates in a couple of u.s. cities taken from ref. [1], we observe a remarkable decay after vaccination, in week 43. this decay was inaccurately attributed to the effect of vaccination, as it actually corresponds to the normal and expected development of an epidemic without immunization. the inaccurate association between h. influenzae and influenza persisted until 1933, when the viral etiology of the flu was established. but pfeiffer's influenza bacillus, finally named haemophilus influenzae, accounts in its denomination for this persistent mistake. the formulation of mathematical models in epidemiology has a tradition of more than one century. one of the first successful examples of the mathematical explanation of epidemiological situations is associated with the study of malaria. ronald ross was working at the indian medical service during the last years of the 19th century when he discovered and described the life cycle of the malaria parasite in mosquitoes and developed a mathematical model to analyze the dynamics of the transmission of the disease [2–4]. his model linked the density of mosquitoes and the incidence of malaria among the human population. once he had identified the anopheles mosquitoes as the vector for malaria transmission, ross conjectured that malaria could be eradicated if the ratio between the number of mosquitoes and the size of the human population was brought below a threshold value. he based his analysis on a simple mathematical model. ross' model was based on a set of deterministic coupled differential equations. he divided the human population into two groups, the susceptible, with proportion sh, and the infected, with proportion ih. after recovery, any formerly infected individual returned to the susceptible class. this is called an sis model. the mosquito population was also divided into two groups (with proportions sm and im), with no recovery from infection. considering equations for the fraction of the population in each state, we have s + i = 1 for both humans and mosquitoes and the model is reduced to a set of two coupled equations, dih/dt = abf im (1 − ih) − r ih, dim/dt = ac ih (1 − im) − µm im, (1) where a is the man-biting rate, b is the proportion of bites that produce infection in humans, c is the proportion of bites by which one susceptible mosquito becomes infected, f is the ratio between the number of female mosquitoes and humans, r is the average recovery rate of humans and µm is the mosquito mortality rate. one of the parameters that quantify the intensity of the epidemic propagation is the basic reproductive rate r0, which measures the average number of cases produced by an initial case throughout its infectious period.
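as a rough numerical illustration of eq. (1), the following python sketch integrates the two coupled equations with a simple forward-euler scheme; the parameter values, the step size and the helper function ross_step are illustrative assumptions and are not taken from ross' original work.

```python
# illustrative parameters (not ross' values): biting rate, infectivities,
# mosquito-to-human ratio, human recovery rate and mosquito mortality rate
a, b, c, f = 0.3, 0.5, 0.5, 2.0
r, mu_m = 0.05, 0.1

def ross_step(i_h, i_m, dt=0.1):
    """one forward-euler step of eq. (1) for the infected fractions i_h, i_m."""
    di_h = a * b * f * i_m * (1.0 - i_h) - r * i_h
    di_m = a * c * i_h * (1.0 - i_m) - mu_m * i_m
    return i_h + dt * di_h, i_m + dt * di_m

i_h, i_m = 0.01, 0.0          # a small initial prevalence among humans
for _ in range(20000):
    i_h, i_m = ross_step(i_h, i_m)

print(f"endemic prevalence: humans {i_h:.3f}, mosquitoes {i_m:.3f}")
# dimensionless combination f*a^2*b*c/(r*mu_m), cf. the expression for r0 in eq. (2) below
print(f"basic reproductive rate: {f * a**2 * b * c / (r * mu_m):.1f}")
```

with these (assumed) values the infection settles into an endemic state; lowering the mosquito density or the biting rate, or increasing the mosquito mortality, drives the printed reproductive rate below 1 and the prevalence decays to zero, which is precisely ross' eradication argument.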
r0 depends on several factors. among them, we can mention the survival time of an infected individual, the necessary dose for infection, the duration of infectiousness in the host, etc. r0 allows one to determine whether or not an infectious disease can spread through a population: an infection can spread in a population only if r0 > 1 and can be maintained in an endemic state when r0 = 1 [5]. in the case of malaria, r0 is defined as the number of secondary cases of malaria arising from a single case in a susceptible population. for the model described by eq. (1), r0 = ma²bc/(rµm). (2) it is clear that the choice of the parameters affects r0. the main result is that it is possible to reduce r0 by increasing the mosquito mortality and reducing the biting rate. for his work on malaria, ross was awarded the nobel prize in 1902. ross' pioneering work was later extended to include other ingredients and enhance the predictive power of the original epidemiological model [5–11]. some years after ross had proposed his model, a couple of seminal works established the basis of the current trends in mathematical epidemiology. both models consider the population divided into three epidemiological groups or compartments: susceptible (s), infected (i) and recovered (r). on the one hand, kermack and mckendrick [12] proposed an sir model that expanded ross' set of differential equations. the model did not consider the existence of a vector, but a direct transmission from an infected individual to a susceptible one. a particular case of the original model, in which there is no age dependence of the transmission and recovery rates, is the classical sir model that will be explained later. on the other hand, reed and frost [13] developed a discrete and stochastic sir epidemic model to describe the relationship between susceptible, infected and recovered immune individuals in a population. it is a chain binomial model of epidemic spread that was intended mainly for teaching purposes, but that is the starting point of many modern epidemiological studies. the model can be mapped into a recurrence equation that defines what will happen at a given moment depending on what has happened in the previous one, it+1 = st [1 − (1 − ρ)^it], (3) where it is the number of cases at time t, st is the number of susceptible individuals at time t and ρ is the probability of contagion. the basic assumption of these sir models, which is present in almost any epidemiological work, is that the infection is spread directly from infectious individuals to susceptible ones after a certain type of interaction between them. in turn, these newly infected individuals will develop the infection and become infectious. after a defined period of time, the infected individuals heal and remain permanently immune. the interaction between any two individuals of the population is considered as a stochastic process with a defined probability of occurrence, which most deterministic models translate into a contact rate. given a closed population and the number of individuals in each state, the calculation of the evolution of the epidemics is straightforward. the epidemic event is over when no infective individuals remain.
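the recurrence (3) translates into a working simulation in a few lines. the sketch below is a minimal, illustrative implementation: the deterministic update uses the expected value st[1 − (1 − ρ)^it], while the stochastic variant draws the new cases from a binomial distribution, in the spirit of the chain binomial formulation; the population size, ρ and the function name reed_frost are assumptions made here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def reed_frost(s0=200, i0=1, rho=0.01, stochastic=True):
    """iterate eq. (3) until no infective individuals remain."""
    s, i, total_cases = s0, i0, i0
    while i > 0:
        p_inf = 1.0 - (1.0 - rho) ** i              # prob. that a susceptible is infected
        new_i = rng.binomial(s, p_inf) if stochastic else round(s * p_inf)
        s, i = s - new_i, new_i
        total_cases += new_i
    return s, total_cases

s_final, cases = reed_frost()
print(f"remaining susceptibles: {s_final}, total cases: {cases}")
```

running the stochastic variant several times already hints at the point made in the next paragraphs: identical parameters can produce both minor outbreaks and large epidemics, something the deterministic update cannot show.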
while many classic deterministic epidemiological models were successful at describing the dynamics of an infectious disease in a population, it was noted that many of the involved processes could be better described by stochastic considerations, and thus a new family of stochastic models was developed [14–19]. sometimes, deterministic models introduce collateral mistakes due to the continuous character of the involved quantities. an example of such a case is discussed in ref. [20]. in ref. [21], the authors proposed a deterministic model to describe the prevalence of rabies among foxes in england. they predicted a sharply decaying prevalence of rabies down to negligible levels, followed by an unexpected new outbreak of infected foxes. the spontaneous outbreak after the apparent disappearance of rabies is due to a fictitious, very low endemic level of infected foxes, as explained in ref. [20]. this is one among several examples of how stochastic models contributed to a better understanding and explanation of some observed phenomena but, like their predecessors, they considered a mean field scheme in the set of differential equations. traditional epidemiological models have successfully described the generalities of the time evolution of epidemics, the differential effect on each age group, and some other relevant aspects of an epidemiological event. but all of them are based on a fully mixed approximation, proposing that each individual has the same probability of getting in touch with any other individual in the population. the real underlying pattern of social contacts shows that each individual has a finite set of acquaintances that serve as channels to promote the contagion. while the fully mixed approximation allows for writing down a set of differential equations and a further exploitation of a powerful set of analytic tools, a better description of the structure of the social network provides the models with the capacity to compute the epidemic dynamics at the population scale from the individual-level behavior of infections, with a more accurate representation of the actual contact pattern. this, in turn, reflects some emergent behavior that cannot be reproduced with a system based on a set of differential equations under the fully mixed assumption. one of the most representative examples of this behavior is the so-called herd immunity, a form of immunity that occurs when the vaccination of a significant portion of the population is enough to block the advance of the infection to other, non-vaccinated individuals. additionally, some network models also allow for an analytic study of the described process. it is not surprising, then, that during the last decade a new tendency in epidemiological modeling emerged together with the inclusion of complex networks as the underlying social topology in any epidemic event. this new approach contributes to a further understanding of the dynamics of an epidemic and unveils the crucial effect of the social architecture on the propagation of any infectious disease. in the following section, we will introduce some generalities about traditional epidemiological models. in section iii, we will present the most commonly used complex networks when formulating an epidemiological model. in section iv, we will describe the most relevant results obtained by modeling epidemiological processes using complex networks to describe the social topology.
next, we will introduce the concept of herd protection or immunity and discuss some of the works that treat this phenomenon.

ii. basic epidemiological models

two main groups can be singled out among the deterministic models for the spread of infectious diseases which are transmitted through person-to-person contact: the sir and the sis. the names of these models are related to the different groups considered as components of the population, or epidemiological compartments: s corresponds to susceptible, i to infected and r to removed. the s group represents the portion of the population that has not been affected by the disease but may be infected in case of contact with a sick person. the i group corresponds to those individuals already infected and who are also responsible for the transmission of the disease to the susceptible group. the removed group r includes those individuals recovered from the disease who have temporary or permanent immunity or, eventually, those who have died from the illness and not from other causes. these models may or may not include the vital dynamics associated with birth and death processes. their inclusion depends on the length of time over which the spread of the disease is studied.

i. the sir model

as mentioned before, in 1927, kermack and mckendrick [12] developed a mathematical model in which they considered a constant population divided into three epidemiological groups: susceptible, infected and recovered. the equations of the sir model are ds/dt = −βsi, di/dt = βsi − γi, dr/dt = γi, (4) where the involved quantities are the proportions of individuals in each group. as the population is constant, s(t) + i(t) + r(t) = 1. (5) the sir model is used when the disease under study confers permanent immunity to infected individuals after recovery or, in extreme cases, kills them. after the contagious period, the infected individual recovers and is included in the r group. these models are suitable to describe the behavior of epidemics produced by viral agent diseases (measles, chickenpox, mumps, hiv, poliomyelitis) [22]. the model formulated through eq. (4) assumes that all the individuals in the population have the same probability of contracting the disease with a rate β, the contact rate. the number of infected increases proportionally to both the number of infected and susceptible. the rate of recovery or removal is proportional to the number of infected only. γ represents the mean recovery rate (1/γ is the mean infective period). it is assumed that the incubation time is negligible and that the rates of infection and recovery are much faster than the characteristic times associated with births and deaths. usually, the initial conditions are set as s(0) > 0, i(0) > 0 and r(0) = 0. (6) it is straightforward to show that di/dt|t=0 = i(0)(βs(0) − γ), (7) and that the sign of the derivative depends on the value of sc = γ/β. when s(t) > sc, the derivative is positive and the number of infected individuals increases. when s(t) goes below this threshold, the epidemic starts to fade out. a rather non-intuitive result can be obtained from eq. (4). we can write, with ρ = γ/β, ds/dr = −s/ρ ⇒ s = s0 exp[−r/ρ] ≥ s0 exp[−n/ρ] > 0 ⇒ 0 < s(∞) ≤ n. (8) the epidemic stops when i(t) = 0, so we can set i(∞) = 0 and r(∞) = n − s(∞). from (8), s(∞) = s0 exp[−r(∞)/ρ] = s0 exp[−(n − s(∞))/ρ]. (9) the last equation is a transcendental expression with a positive root s(∞).
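as a quick numerical check of eqs. (4)-(9), the sketch below integrates the sir equations with a crude euler scheme and compares the asymptotic fraction of susceptibles with the root of the transcendental relation (9), obtained by fixed-point iteration; the rates β and γ are illustrative choices.

```python
import numpy as np

beta, gamma = 0.5, 0.1
rho = gamma / beta                       # the parameter rho = gamma/beta of eq. (8)
s, i = 0.99, 0.01                        # proportions, so n = 1 and r(0) = 0
dt = 0.01
for _ in range(200000):                  # integrate eq. (4) until i is negligible
    ds = -beta * s * i
    di = beta * s * i - gamma * i
    s, i = s + dt * ds, i + dt * di
print(f"s(inf) from the integration of eq. (4): {s:.4f}")

# fixed-point iteration of eq. (9): s_inf = s0 exp[-(n - s_inf)/rho], with n = 1
s_inf = 0.5
for _ in range(200):
    s_inf = 0.99 * np.exp(-(1.0 - s_inf) / rho)
print(f"s(inf) from the transcendental eq. (9): {s_inf:.4f}")
```

the two numbers should agree up to the integration error, showing that a finite fraction of the population escapes infection, which is the content of the next paragraph.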
taking (9), we can calculate the total number of individuals infected throughout the whole epidemic process, itotal = i0 + s0 − s(∞). (10) as i(t) → 0 and s(t) → s(∞) > 0, we conclude that when the epidemic ends, there is a portion of the population that has not been affected. the previous model can be extended to include vital dynamics [23], delay equations [24], age-structured populations, migration [25], and diffusion. figure 2: temporal behavior of the proportion of individuals in each of the three compartments (susceptible, infected, recovered) of the sir model, as a function of time (arb. units). in any case, all these generalizations only introduce some slight changes in the steady states of the system or, in the case of spatially extended models, travelling waves [26]. figure 2 displays the typical behavior of the density of individuals in each of the epidemiological compartments described by eq. (4). compare this with the pattern shown in fig. 1.

ii. the sis model

the sis model assumes that the disease does not confer immunity to infected individuals after recovery. thus, after the infective period, the infected individual recovers and is again included in the s group. therefore, the model presents only two epidemiological compartments, s and i. this model is suitable to describe the behavior of epidemics produced by bacterial agent diseases (meningitis, plague, venereal diseases) and by protozoan agent diseases (malaria) [22]. we can write the equations for a general sis model, assuming again that the population is constant, ds/dt = −βsi + γi, di/dt = βsi − γi. (11) as the relation s + i = 1 holds, eq. (11) can be reduced to a single equation, di/dt = (β − γ)i − βi². (12) the solution of this equation is i(t) = (1 − γ/β) c exp[(β − γ)t] / (1 + c exp[(β − γ)t]), (13) where c is defined by the initial conditions as c = βi0/(β(1 − i0) − γ). (14) if i0 is small and β > γ, the solution is a logistic growth that saturates before the whole population is infected; the stationary value is is = (β − γ)/β. it can be shown that r0 = β/γ. this sets the condition for the epidemic to persist.

iii. other models

the literature on epidemiological models includes several generalizations of the previous ones to adapt the description to the particularities of a specific infectious disease [27]. one possibility is to increase the number of compartments to describe different stages of the state of an individual during the epidemic spread. among these models, we can mention the sirs, a simple extension of the sir that does not confer permanent immunity to recovered individuals: after some time, they rejoin the susceptible group, ds/dt = −βsi + λr, di/dt = βsi − γi, dr/dt = γi − λr. (15) other models include more epidemiological groups or compartments, such as the seis and seir models, which take into consideration the exposed or latent period of the disease by defining an additional compartment e. there are several diseases in which there is a vertical transient immunity transmission from a mother to her newborn. then, each individual is born with a passive immunity acquired from the mother. to indicate this, an additional group p is added. the range of possibilities is rather extended, and this is reflected in the title of ref. [27]: "a thousand and one epidemic models". figure 3: transfer diagram for a seirs model. taken from ref. [27]. there are a lot of possibilities to define the compartment structure.
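as one concrete example of an extended compartment structure, the following sketch integrates the sirs equations (15) with a forward-euler scheme until the endemic state is approached; all rates, including the loss-of-immunity parameter λ, are illustrative assumptions.

```python
beta, gamma, lam = 0.5, 0.1, 0.05        # contact, recovery and immunity-loss rates
s, i, r = 0.99, 0.01, 0.0
dt = 0.01
for _ in range(500000):                  # long integration of eq. (15)
    ds = -beta * s * i + lam * r
    di = beta * s * i - gamma * i
    dr = gamma * i - lam * r
    s, i, r = s + dt * ds, i + dt * di, r + dt * dr
print(f"endemic state: s = {s:.3f}, i = {i:.3f}, r = {r:.3f}")
# unlike the sir case, the infection does not die out: s approaches gamma/beta
```

adding further compartments (e, p, ...) amounts to adding the corresponding rate terms to this loop, which is why the transfer diagrams discussed next are a convenient bookkeeping device.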
usually, this structure is represented as a transfer chart indicating the flow between the compartments and the external contributions. figure 3 shows an example of a diagram for a seirs model, taken from ref. [27]. horizontal incidence refers to a contagion due to contact between a susceptible and an infectious individual; vertical incidence accounts for the possibility that the offspring of infected parents are born infected, as with aids, hepatitis b, chlamydia, etc. many of the previous models have been expanded by including stochastic terms. one of the most relevant differences between the deterministic and stochastic models is their asymptotic behavior. a stochastic model can show a solution converging to the disease-free state when the deterministic counterpart predicts an endemic equilibrium. the results obtained from the stochastic models are generally expressed in terms of the probability of an outbreak and of its size and duration distribution [14–19].

iii. complex networks

a graph or network is a mathematical representation of a set of objects that may be connected to each other through links. the interconnected objects are represented by the nodes (or vertices) of the graph, while the connecting links are associated with the edges of the graph. networks can be characterized by several topological properties, some of which will be introduced later. social links are preponderantly non-directional (symmetric), though there are some cases of directed social networks. the set of nodes attached to a given node through these links is called its neighborhood. the size of the neighborhood is the degree of the node. while the study of graph theory dates back to the pioneering works of erdös and rényi in the 1950s [28], its gradual colonization of modern epidemiological models only started a decade ago. the attention of modelers was drawn to graph theory when some authors started to point out that the social structure could be mimicked by networks constructed under very simple premises [30, 34]. since then, a huge collection of computer-generated networks has been studied in the context of disease transmission. the underlying rationale for the use of networks is that they can represent how individuals are distributed in social and geographical space and how the contacts between them are promoted, reinforced or inhibited, according to the rules of social dynamics. when the population is fully mixed, each individual has the same probability of coming into contact with any other individual. this assumption makes it possible to calculate the effective contact rate β as the product of the transmission rate of the disease, the effective number of contacts per unit time and the proportion of these contacts that propagate the infection. the formulation of a mean field model is then straightforward. however, in real systems, the acquaintances of each individual are reduced to a portion of the whole population. each person has a set of contacts that shapes the local topology of the neighborhood. the whole social architecture, the network of contacts, can be represented with a graph. in the limiting case when the mean degree of the nodes in a network is close to the total number of nodes, the difference between a structured population and a fully mixed one fades out. the differences are noticeable when the network is diluted, i.e., when the mean degree of the nodes is small compared with the size of the network.
this will be a necessary condition for all the networks used to model disease propagation. in the following paragraphs, we will introduce the most common families of networks used for epidemiological modeling. lattices. when incorporating a network into a model, the simplest case is to consider a grid or lattice. in a square d-dimensional lattice, each node is connected to 2d neighbors. individuals are regularly located and connected with adjacent neighbors; therefore, contacts are localized in space. figure 4 shows, among others, an example of a two-dimensional square lattice. figure 4: scheme of four kinds of networks: (a) lattice, (b) scale free, (c) exponential, (d) small world. small-world networks. the concept of small world was introduced by milgram in 1967 in order to describe the topological properties of social communities and relationships [29]. some years ago, watts and strogatz introduced a model for constructing networks displaying topological features that mimic the social architecture revealed by milgram. in this model of small world (sw) networks, a single parameter p, running from 0 to 1, characterizes the degree of disorder of the network, ranging from a regular lattice to a completely random graph [30]. the construction of these networks starts from a regular, one-dimensional, periodic lattice of n elements and coordination number 2k. each of the sites is visited, rewiring k of its links with probability p. values of p within the interval [0,1] produce a continuous spectrum of small world networks. note that p is the fraction of modified regular links. a schematic representation of this family of networks is shown in fig. 5. figure 5: representation of several small world networks constructed according to the algorithm presented in ref. [30]. as the disorder degree increases, the number of shortcuts grows, replacing some of the original (ordered network) links. to characterize the topological properties of sw networks, two magnitudes are calculated. the first one, l(p), measures the mean topological distance between any pair of elements in the network, that is, the shortest path between two vertices, averaged over all pairs of vertices. thus, an ordered lattice has l(0) ∼ n/k while, for a random network, l(1) ∼ ln(n)/ln(k). the second one, c(p), measures the mean clustering of an element's neighborhood. c(p) is defined in the following way: let us consider the element i, having ki neighbors connected to it. we denote by ci(p) the number of neighbors of element i that are neighbors among themselves, normalized to the value that this would have if all of them were connected to one another, namely ki(ki − 1)/2. now, c(p) is the average, over the system, of the local clusterization ci(p). ordered lattices are highly clustered, with c(0) ∼ 3/4, and random lattices are characterized by c(1) ∼ k/n. between these extremes, small worlds are characterized by a short length between elements, like random networks, and high clusterization, like ordered ones. figure 6: mean values of the clustering coefficient c and the path length l as a function of the disorder parameter p. note the fast decay of l and the presence of a region where the value adopted by l is similar to the one corresponding to total disorder, while the value adopted by c is close to the one corresponding to the ordered case.
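the behavior summarized in fig. 6 can be checked numerically; the sketch below, which relies on the networkx library (an assumption of this illustration), measures the average clustering c(p) and the average shortest path length l(p) of watts-strogatz networks for several values of the disorder parameter p.

```python
import networkx as nx

n, k = 1000, 10                          # n nodes, each linked to its k nearest neighbors
for p in [0.0, 0.001, 0.01, 0.1, 1.0]:
    g = nx.connected_watts_strogatz_graph(n, k, p, seed=1)
    c = nx.average_clustering(g)
    l = nx.average_shortest_path_length(g)
    print(f"p = {p:6.3f}   c(p) = {c:.3f}   l(p) = {l:.2f}")
# l(p) drops to its random-graph value already for very small p, while c(p) stays
# close to the ordered value over a wide range: the hallmark of the sw regime
```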
other procedures for developing similar social networks have been proposed in ref. [31] where, instead of rewiring existing links to create shortcuts, the procedure adds links connecting two randomly chosen nodes with probability p. in fig. 7, we show an example analogous to the one shown in fig. 5. figure 7: representation of several small world networks constructed according to the algorithm presented in ref. [31]. as the disorder degree increases, the number of shortcuts, as well as the total number of links, grows. random networks. there are different families of networks with a random genesis but displaying a wide spectrum of complex topologies. in random networks, the spatial position of individuals is irrelevant and the links are randomly distributed. the iconic erdös-rényi (er) random graphs are built from a set of nodes that are randomly connected with probability p, independently of any other existing connection. the degree distribution, i.e., the distribution of the number of links associated with each node, is binomial and, when the number of nodes is large, it can be approximated by a poisson distribution [32]. in ref. [33], the authors propose a formalism based on the generating function that permits the construction of random networks with arbitrary degree distributions. the mechanism of construction also allows for further analytic studies on these networks. in particular, networks can be chosen to have a power law degree distribution. this case will be presented in the next paragraphs. scale-free networks. as mentioned before, one of the most revealing measures of a network is its degree distribution, i.e., the distribution of the number of connections of the nodes. in most real networks, it is far from being homogeneous, with highly connected individuals on one extreme and almost isolated nodes on the other. scale-free networks provide a means of achieving such extreme levels of heterogeneity. scale-free networks are constructed by adding new individuals to a core, with a connection mechanism that imitates the underlying process that rules the choice of social contacts. figure 8: examples of (a) er and (b) ba networks. the figure also displays the connectivity distribution p(k), which follows a binomial distribution for the er networks and a power law for the ba networks. the barabási-albert (ba) model algorithm, one of the triggers of the present huge interest in scale-free networks, uses a preferential attachment mechanism [34]. the algorithm starts from a small nucleus of connected nodes. at each step, a new node is added to the network and connected to m existing nodes. the probability pi of choosing node i is proportional to the number of links that the existing node already has, pi = ki / Σj kj, where ki is the degree of node i. that means that the new nodes have a preference to attach themselves to the most "popular" nodes. one salient feature of these networks is that their degree distribution is scale-free, following a power law of the form p(k) ∼ k^−3. a sketch of the typical topology of the last two networks is shown in fig. 8. while the degree distribution of the er network has a clear peak and is close to homogeneous, the topology of the ba network is dominated by the presence of hubs, highly connected nodes. the figure also displays the typical degree distribution p(k) for each case.
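the contrast between the two degree distributions shown in fig. 8 can be reproduced qualitatively with the sketch below, which builds an er and a ba network of the same mean degree with networkx; the sizes and the parameter m are illustrative assumptions.

```python
import networkx as nx
import numpy as np

n, m = 5000, 3                                     # ba growth: each new node brings m links
g_ba = nx.barabasi_albert_graph(n, m, seed=2)
mean_k = 2 * g_ba.number_of_edges() / n
g_er = nx.erdos_renyi_graph(n, mean_k / (n - 1), seed=2)

for name, g in [("er", g_er), ("ba", g_ba)]:
    degrees = np.array([d for _, d in g.degree()])
    print(f"{name}: <k> = {degrees.mean():.2f}, k_max = {degrees.max()}")
# the er maximum degree stays close to <k>, while the ba network develops hubs with
# much larger degrees, the fingerprint of the power law p(k) ~ k^(-3)
```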
over the last years, many other attachment mechanisms have been proposed to obtain scale-free networks with other adjusted properties, such as the clustering coefficient or higher moments of the degree distribution [35–38]. coevolutive or adaptive topology. when one of the former examples of networks is chosen as a model for the social fabric, there is an implicit assumption: the underlying social topology is frozen. however, this situation does not reflect the observed fact that, in real populations, social and migratory phenomena, sanitary isolation or other processes can lead to a dynamic configuration of contacts, with some links being eliminated and others being created. if the time span of the epidemic is long enough, the social network will change, and these changes will not be reflected if the topology remains fixed. this is particularly important in small groups. the social dynamics, including the epidemic process, can shape the topology of the network, creating a feedback mechanism that can favor or work against the propagation of an infectious disease. for this reason, some models consider a coevolving network, with dynamic links that change the aspect of the network while the epidemic occurs.

iv. epidemiological models on networks

in this section, we will discuss several models based on the use of complex networks to mimic the social architecture. the discussion will be organized according to the topology of these underlying networks. lattices. lattices were the first attempt to represent the underlying topology of the social contacts and thus to analyze the possible effect of interactions at the individual level. these models moved away from the paradigmatic fully mixed assumption and focused on looking for those phenomena that a mean field model could not explain. still, lattices cannot fully capture the role of inhomogeneities. as the individuals are located on a regular grid, mostly two dimensional, the neighborhood of each node is reduced to the adjacent nodes, inducing only short range or localized interactions. a typical model considers that the nodes can be in any of the epidemiological states or compartments. the dynamics of the epidemic evolves through a contact process [39] and the evolution rules do not differ much from traditional cellular automata models [40]. disease transmission is modeled as a stochastic process. each infected node has a probability pi of infecting a neighboring susceptible node. once infected, the individuals may recover from infection with a probability pr; i.e., the infective stage typically lasts 1/pr. from the infective phase, the individuals can move back to the susceptible compartment or to the recovered phase, depending on whether the models are sis or sir. usually, a localized infectious focus is introduced among the population. the transient shows a local and slow development of the disease that, at the initial stage, involves the growth of a cluster, with the infection propagating at its boundary like a traveling wave. after the initial transient, sis, sir and sirs models behave in different ways. the initially local dynamics, which can or cannot propagate to the whole system, is what introduces a completely new behavior in this spatially extended model. in ref. [41], the author argued that the infective clusters behave like the clusters in the directed percolation model.
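a minimal version of such a lattice model is sketched below: a stochastic sis dynamics on a two-dimensional square lattice with periodic boundaries and a single initial infectious focus. the lattice size, the probabilities p_i and p_r and the number of steps are illustrative assumptions, not the values used in the works quoted in this section.

```python
import numpy as np

rng = np.random.default_rng(3)
L, p_i, p_r, steps = 64, 0.3, 0.2, 400

infected = np.zeros((L, L), dtype=bool)
infected[L // 2, L // 2] = True                    # localized infectious focus

for _ in range(steps):
    # number of infected neighbors of each site (periodic boundary conditions)
    n_inf = sum(np.roll(infected, shift, axis)
                for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)])
    p_site = 1.0 - (1.0 - p_i) ** n_inf            # prob. of being infected by any neighbor
    new_infections = (~infected) & (rng.random((L, L)) < p_site)
    recoveries = infected & (rng.random((L, L)) < p_r)
    infected = (infected | new_infections) & ~recoveries
print(f"asymptotic infected fraction: {infected.mean():.3f}")
```

sweeping p_i at fixed p_r in this toy version already produces the kind of threshold behavior described next.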
figure 9 shows an example of the behavior of the asymptotic value of infected individuals under sis dynamics in a two-dimensional square lattice. the figure reflects the results found in ref. [42]. the parameter f is associated with the infectivity of infectious individuals, closely related to the contact rate. we observe the inset displaying the scaling of the data with a power-like curve a|f − fc|^α, with α ≈ 0.5 [42]. as mentioned before, kermack and mckendrick [12] proved the existence of a propagation threshold for a disease invading a susceptible population. the lattice-based sir models introduce a different threshold. the simulations show that epidemics can either remain localized around the initial focus or turn into a pandemic, affecting the entire population. the most dramatic examples of real pandemics are the black plague between 1300 and 1500 and the spanish flu in 1918-1919. both left a wake of death and terror while crossing the european continent. the predicted new threshold establishes a limit below which the pandemic behavior is not achieved. some works about epidemic propagation on lattices are analogous to forest fire models [43], with the characteristic feature that the frequency distributions of the epidemic sizes and durations obey a power law. figure 9: sis model. asymptotic value of infected individuals as a function of the infectivity of infectious individuals. the inset displays the scaling of the data with a power-like curve a|f − fc|^α, with α ≈ 0.5. adapted from ref. [42]. figure 10: sir model. asymptotic value of susceptible individuals as a function of the infectivity of infectious individuals. the inset displays the scaling of the data with a power-like curve a|f − fc|^α, with α ≈ 0.5. adapted from ref. [42]. in refs. [44, 45], the authors exploit these analogies to explain the observed behavior of measles, whooping cough and mumps in the faroe islands. the observed data display a power-like behavior. random networks. most of the models based on random graphs predate the renewed interest in complex networks. a simple but effective idea for the study of the dynamics of diseases on random networks is the contact process proposed in ref. [46], which produces a branching phenomenon while the infection propagates. in ref. [47], the authors use an er network with an approximately poisson degree distribution. a feature common to all these models is that the rate of the initial transient growth is smaller than that corresponding to similar models in fully mixed populations. this effect can be easily understood by noting that, on the one hand, the degree of a given initially infected node is typically small, so it has a limited number of susceptible contacts. on the other hand, there is a self-limiting process due to the fact that the infection propagation itself depletes the local availability of susceptible targets. a different analytical approach to random networks is presented in ref. [48]. the author shows that a family of variants of the sir model can be solved exactly on random networks built by a generating function method, appealing to the formalism of percolation models. the author analyzes the propagation of a disease in networks with arbitrary degree distributions and heterogeneous infectiveness times and transmission probabilities. the results include the particular case of scale-free networks, which will be discussed later. small-world networks.
as mentioned above, regular networks can exhibit high clustering but long path lengths. at the other extreme, random networks have a lot of shortcuts between distant individuals, but a negligible clustering. both features affect the propagative behavior of any modeled disease. the spread of infectious diseases on sw networks has been analyzed in several works. the interest was triggered by the fact that even a small number of random connections added to a regular lattice, following for example the algorithm described in ref. [30], produces unexpected macroscopic effects. by sharing topological properties of random and ordered networks, sw networks can display complex propagative patterns. on the one hand, the high level of clustering means that most infection occurs locally. on the other hand, shortcuts are vehicles for the fast spread of the epidemic to the entire population. in ref. [51], the authors study an si model and show that shortcuts can dramatically increase the possibility of an epidemic event. the analysis is based on bond percolation concepts. while the result could be easily anticipated due to the long-range propagative properties of shortcuts, the authors find an important analytic result. it was a study of an sirs model that showed, for the first time, evidence of a dramatic change in the behavior of an epidemic due to changes in the underlying social topology [52]. by specifically analyzing the effect of clustering on the dynamics of an epidemic, the authors show that an sirs model on an sw network presents two distinct types of behavior. as the rewiring parameter p increases, the system transits from an endemic state, with a low level of infection, to periodic oscillations in the number of infected individuals, reflecting an underlying synchronization phenomenon. the transition from one regime to the other is sharp and occurs at a finite value of p. the reason behind this phenomenon is still unknown. figure 11 shows the temporal behavior of the number of infected individuals for three values of the rewiring parameter p, as found in ref. [52]. figure 11: asymptotic behavior of the number of infected individuals i(t) in three sw networks with different degrees of disorder (p = 0.01, 0.2 and 0.9). the emergence of a synchronized pattern is evident in the bottom graph. it would not be responsible to affirm that sw networks reflect all real social structures. however, they capture essential aspects of such organization that play central roles in the propagation of diseases, namely the clustering coefficient and the short social distance between individuals. understanding that there are certain limitations, sw networks help to mimic different social organizations that range from rural populations to big cities. there are more sophisticated models of networks with topologies that are more closely related to real social organizations at large scale. these networks are characterized by a truncated power law distribution of the degree of the nodes and by values of clustering and mean distance corresponding to the small world regime. scale-free networks. scale-free networks captured the attention of epidemiologists due to the close resemblance between their extreme degree distribution and the pattern of social contacts in real populations. a power law degree distribution presents individuals with many contacts, who play the role of super-spreaders.
a higher number of contacts implies a greater risk of infection and, correspondingly, a higher "success" as an infectious agent. some scale-free networks present positive assortativity, which translates into the fact that highly connected nodes are connected among themselves. these local structures can be used to model the existence of core groups of high-risk individuals that help to maintain sexually transmitted diseases in a population dominated by long-term monogamous relationships [53]. models of disease spread through scale-free networks showed that the infection is concentrated among the individuals with the highest degree [48, 54]. one of the most surprising results is the one found in ref. [54]. there, the authors show that, no matter the values taken by the relevant epidemiological parameters, there is no epidemic threshold. once installed in a scale-free network, the disease will always propagate, independently of r0. remember that, when analyzed under the fully mixed assumption, the studied sis model has a threshold. the authors perform analytic and numerical calculations of the propagation of the disease to show the lack of thresholds. later, in ref. [55], it was pointed out that networks with a divergent second moment of the degree distribution will show no epidemic threshold. the ba network fulfills this condition. in refs. [56, 57], the authors analyze the structure of different networks of sexual encounters, finding that they have a contact pattern closely related to a power law. they also discuss the implications of such a structure for the propagation of venereal diseases. co-evolutionary networks. co-evolutionary or adaptive networks take into account the intrinsic dynamics of the social links. on some occasions, the characteristic times associated with changes in social connections are comparable with the time scales of an epidemic process. at other times, the presence of an infectious core induces changes in social links. consider, for example, a case where susceptible individuals, after learning about the existence of infectious individuals, try to avoid them, or another case where health policies promote the isolation of infectious individuals [58]. the behavior of models based on adaptive networks is determined by the interplay of two different dynamics that sometimes have competing effects. on the one hand, we have the dynamics of the disease propagation; on the other hand, the network dynamics that operates to block the advance of the infection. the latter is dominated by the rewiring rate of the network, which affects the fraction of susceptible individuals connected to infective ones. the most obvious choice is to eliminate the infectious contacts of a susceptible individual by deleting them or replacing them with non-infectious ones. the net effect is an effective reduction of the infection rate. while static networks typically predict either a single attracting endemic state or a disease-free state, adaptive networks show a new phenomenon, a bistable situation shared by both states. the bistability appears for small rewiring rates [58–61]. in ref. [61], the authors consider a contact switching dynamics. each link connecting a susceptible agent with an infective one is broken at a rate r. the susceptible node is then connected to a new neighbor, randomly chosen among the entire population. the authors show that reconnection can completely prevent an epidemic, eliminating the disease.
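the following sketch implements a toy version of such an adaptive dynamics: a discrete-time sis process on an er network in which, with probability r per step, each susceptible-infective link is broken and the susceptible end is reconnected to a randomly chosen node. the network size, the rates and the implementation details are illustrative assumptions and do not reproduce the specific model of ref. [61].

```python
import networkx as nx
import numpy as np

rng = np.random.default_rng(4)
n, k_mean, p_inf, p_rec, r = 1000, 8, 0.08, 0.05, 0.3
g = nx.erdos_renyi_graph(n, k_mean / (n - 1), seed=4)
infected = set(rng.choice(n, size=20, replace=False).tolist())

for _ in range(500):
    # contact switching: susceptible nodes drop links to infective neighbors with prob. r
    for u, v in list(g.edges()):
        s, i = (u, v) if v in infected else (v, u)
        if (i in infected) and (s not in infected) and rng.random() < r:
            g.remove_edge(u, v)
            w = int(rng.integers(n))               # reconnect to a randomly chosen node
            if w != s:
                g.add_edge(s, w)
    # infection and recovery steps
    new_inf = {v for u in infected for v in g.neighbors(u)
               if v not in infected and rng.random() < p_inf}
    recovered = {u for u in infected if rng.random() < p_rec}
    infected = (infected | new_inf) - recovered
print(f"prevalence after the rewiring dynamics: {len(infected) / n:.3f}")
```

comparing runs with r = 0 and r > 0 illustrates the competition between disease propagation and link rewiring discussed in this paragraph.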
the main conclusion is that the mechanism they propose, contact switching, is a robust and effective control strategy. figure 12 displays the results found in ref. [61], where two completely different types of behavior can be distinguished as the rewiring parameter r changes. the crossover from one regime to the other is a second-order phase transition. figure 12: these two panels show the equilibrium fraction of infected individuals as a function of the infectivity of the disease, λ. lines are analytic results, symbols are numerical simulations. adapted from ref. [61].

v. immunization in networks

any epidemiological model can reproduce the fact that the number of individuals in a population who are effectively immune to a given infection depends on the proportion of previously infected individuals and the proportion who have been efficiently vaccinated. for some time, epidemiologists have known about an emerging effect called herd protection (or herd immunity). they discovered the occurrence of a global immunizing effect verified when the vaccination of a significant portion of a population provides protection for individuals who have not developed or cannot develop immunity. herd protection is particularly important for diseases transmitted from person to person. as the infection progresses through the social links, its advance can be disrupted when many individuals are immune and their links to non-immune subjects are no longer valid channels of propagation. the net effect is that the greater the proportion of immune individuals, the smaller the probability that a susceptible individual will come into contact with an infectious one. the vaccinated individuals will neither contract nor transmit the disease, thus establishing a firewall between infected and susceptible individuals. while relying on herd protection is far from being an optimal public health policy, it is still taken into consideration when individuals cannot be vaccinated due, for example, to immune disorders or allergies. the herd protection effect is equivalent to reducing the r0 of a disease. there is a threshold value for the proportion of immune individuals needed in a population for the disease not to persist or propagate. its value depends on the efficacy of the vaccine, but also on the virulence of the disease and the contact rate. if the herd effect reduces the risk of infection among the uninfected enough, then the infection may no longer be sustainable within the population and may be eliminated. in a real population, the emergence of herd immunity is closely related to the social architecture. while many fully mixed models can describe the phenomenon, the real effect is much more accurately reproduced by models based on social networks. one of the most expected results is to quantify how the shape of a social network can affect the level of vaccination required for herd immunity. there is a related phenomenon, not discussed here, that consists in the propagation of real immunity from a vaccinated individual to a non-vaccinated one. this is called contact immunity and has been verified for several vaccines, such as the opv [62]. the models used to quantify the success of immunization of the population propose a targeted immunization of the population.
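the qualitative difference between random and targeted (degree-based) immunization discussed in the following paragraphs can be illustrated with the sketch below, which runs a simple discrete sir process on a ba network for the two vaccination schemes; the network, the epidemic parameters and the coverage level are illustrative assumptions and the code is not a reproduction of the quoted works.

```python
import networkx as nx
import numpy as np

rng = np.random.default_rng(5)
n, m, p_inf, p_rec, coverage = 5000, 3, 0.06, 0.2, 0.15
g = nx.barabasi_albert_graph(n, m, seed=5)

def outbreak_size(g, vaccinated):
    state = {u: "S" for u in g}                    # S, I or R
    for u in vaccinated:
        state[u] = "R"                             # vaccinated = removed from the start
    seeds = [u for u in g if state[u] == "S"][:10]
    for u in seeds:
        state[u] = "I"
    infected, total = set(seeds), len(seeds)
    while infected:
        new = {v for u in infected for v in g.neighbors(u)
               if state[v] == "S" and rng.random() < p_inf}
        for u in list(infected):
            if rng.random() < p_rec:
                state[u] = "R"
                infected.discard(u)
        for v in new:
            state[v] = "I"
        infected |= new
        total += len(new)
    return total / n

n_vacc = int(coverage * n)
random_vacc = rng.choice(n, size=n_vacc, replace=False).tolist()
by_degree = sorted(g, key=g.degree, reverse=True)[:n_vacc]     # the hubs
print(f"attack rate, random immunization:   {outbreak_size(g, random_vacc):.3f}")
print(f"attack rate, targeted immunization: {outbreak_size(g, by_degree):.3f}")
```

with these (assumed) values, one should typically observe a much smaller attack rate for the targeted scheme than for random vaccination at the same coverage, in line with the results reviewed below.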
it is well established that immunization of randomly selected individuals requires immunizing a very large fraction of the population in order to arrest epidemics that spread upon contact between infected individuals. in ref. [63], the authors studied the effects of immunization on an sir epidemiological model evolving on an sw network. in the absence of immunization, the model exhibits a transition from a regime where the disease remains localized to a regime where it spreads over a portion of the system. the effect of immunization reveals itself through two different phenomena. first, there is an overall decrease in the fraction of the population affected by the disease. second, there is a shift of the transition point towards higher values of the disorder. this can be easily understood, as the effective average number of susceptible neighbors per individual decreases. targeted immunization, applied by vaccinating those individuals with the highest degree, produces a substantial improvement in disease control. it is interesting to point out that this improvement occurs even when the degree distribution over small-world networks is relatively uniform, so that the best connected sites do not monopolize a disproportionately high number of links. figure 13 shows an example of the results found in ref. [63], where the authors compare the number of non-vaccinated individuals that become infected for different levels of vaccination ρ and different degrees of disorder p of the sw network, as defined in ref. [30]. figure 13: fraction r of the non-vaccinated population that becomes infected during the disease propagation, as a function of the disorder parameter p, for various levels of random immunization (upper) and targeted immunization (bottom). adapted from ref. [63]. in a scale-free network, the existence of individuals of arbitrarily large degree implies that there is no level of uniform random vaccination that can prevent epidemic propagation; even extremely high densities of randomly immunized individuals cannot prevent a major epidemic outbreak. the discussed susceptibility of these networks to epidemics hinders the implementation of a prevention strategy different from the trivial immunization of the whole population [54, 55, 66]. taking into account the inhomogeneous connectivity properties of scale-free networks can help to develop successful immunization strategies. the obvious choice is to vaccinate individuals according to their connectivity. a selective vaccination can be very efficient, as targeting some of the super-spreaders can be sufficient to prevent an epidemic [55, 67]. the vaccination of a small fraction of these individuals increases quite dramatically the global tolerance of the network to infections. when comparing the uniform and the targeted immunization procedures [67], the results indicate that, while uniform immunization does not produce any observable reduction of the infection prevalence, targeted immunization inhibits the propagation of the infection even at very low immunization levels. these conclusions are particularly relevant when dealing with sexually transmitted diseases, as the number of sexual partners of the individuals follows a distribution pattern close to a power law. targeted immunization of the most highly connected individuals [64, 65, 67] proves to be effective, but requires global information about the architecture of the network, which could be unavailable in many cases. in ref.
[68], the authors proposed a different immunization strategy that does not use information about the degree of the nodes or other global properties of the network, but still achieves the desired pattern of immunization. the authors called it acquaintance immunization, as the targeted individuals are the acquaintances of randomly selected nodes. the procedure consists of choosing a random fraction pi of the nodes, selecting one random acquaintance per node with whom they are in contact, and vaccinating them. the strategy operates at the local level. the fraction pi may be larger than 1, for a node might be chosen more than once, but the fraction of immunized nodes is always less than 1. this strategy allows a low vaccination level to achieve the immunization threshold. the procedure is able to indirectly detect the most connected individuals: as they are acquaintances of many nodes, the probability of their being chosen for vaccination is higher. vi. final remarks the mathematical modeling of the propagation of infectious diseases transcends academic interest. any action aimed at preventing a possible pandemic situation or at optimizing vaccination strategies to achieve critical coverage lies at the core of any public health policy. the understanding of the behavior of epidemics improved sharply during the last century, boosted by the formulation of mathematical models. however, for a long time, many important aspects of epidemic processes remained unexplained or out of the scope of traditional models. perhaps the most important one is the feedback mechanism that develops between the social topology and the advance of an infectious disease. the new types of models developed during the last decade made an important contribution to the field by incorporating a means of describing the effect of the social pattern. while a quantitative analysis of a real situation still demands huge computational resources, the mathematical foundations to develop it are already laid. there is still much to do, but the breakthrough produced by these new models based on complex networks is already undeniable. [1] r h britten, the incidence of epidemic influenza, 1918-1919. a further analysis according to age, sex, and color of records of morbidity and mortality obtained in surveys of 12 localities, pub. health. rep. 47, 303 (1932). [2] r ross, some a priori pathometric equations, br. med. j. 1, 546 (1915). [3] r ross, an application of the theory of probabilities to the study of a priori pathometry i, proc. r. soc. a 92, 204 (1916). [4] r ross, an application of the theory of probabilities to the study of a priori pathometry ii, proc. r. soc. a 93, 212 (1916). [5] r m anderson, r m may, infectious diseases of humans: dynamics and control, oxford university press, london (1991). [6] g macdonald, the epidemiology and control of malaria, oxford university press, london (1957). [7] j l aron, r m may, the population dynamics of malaria, in: population dynamics of infectious disease, eds. r m anderson, chapman and hall, london, pag. 139 (1982). [8] k dietz, mathematical models for transmission and control of malaria, in: principles and practice of malariology, eds. w wernsdorfer, y mcgregor, churchill livingston, edinburgh, pag. 1091 (1988). [9] j l aron, mathematical modeling of immunity to malaria, math. biosci. 90, 385 (1988).
[10] j a n filipe, e m riley, c j darkeley, c j sutherland, a c ghani, determination of the processes driving the acquisition of immunity to malaria using a mathematical transmission model, plos comp. biol. 3, 2569 (2007). [11] d j rodriguez, l torres-sorando, models of infectious diseases in spatially heterogeneous environments, bull. math. biol. 63, 547 (2001). [12] w o kermack, a g mckendrick, a contribution to the mathematical theory of epidemics, proc. r. soc. a 115, 700 (1927). [13] h abbey, an examination of the reed frost theory of epidemics, human biology 24, 201 (1952). [14] n t j bailey, the mathematical theory of infectious diseases and its applications, gri�n, london (1975). [15] f g ball, p donnelly, strong approximations for epidemic models, stoch. proc. appl. 55, 1 (1995). [16] h andersson, t britton, stochastic epidemic models and their statistical analysis, springer verlag, new york (2000). [17] o diekmann, j a p heesterbeek, mathematical epidemiology of infectious diseases, wiley, chichester (2000). [18] v isham, stochastic models for epidemics: current issues and developments, in: celebrating statistics: papers in honor of sir david cox on his 80th birthday, oxford university press, oxford (2005). [19] h c tuckwell, r j williams, some properties of a simple stochastic epidemic model of sir type, math. biosc. 208, 76 (2007). [20] d mollison, dependence of epidemic and population velocities on basic parameters, math. biosc. 107, 255 (1991). 050003-15 papers in physics, vol. 5, art. 050003 (2013) / m. n. kuperman [21] j d murray, e a stanley, d l brown, on the spatial spread of rabies among foxes, proc. royal soc. london b 229, 111 (1986). [22] h w hethcote, three basic epidemiological models, in: applied mathematical ecology, eds. s a levin, t g hallam, l gross, pag. 119, springer, berlin (1989). [23] m n kuperman, h s wio, front propagation in epidemiological models with spatial dependence, physica a 272, 206 (1999). [24] e beretta, y takeuchi, global stability of an sir epidemic model with time delays, j. math. biol. 33, 250 (1995). [25] a franceschetti, a pugliese, threshold behaviour of a sir epidemic model with age structure and immigration, j math biol. 57, 1 (2008). [26] j yang j, s liang, y zhang, travelling waves of a delayed sir epidemic model with nonlinear incidence rate and spatial di�usion, plos one 6, e21128 (2011). [27] h w hethcote, a thousand and one epidemic models, in: frontiers in mathematical biology, eds. s levin, pag. 504, springer, berlin (1994). [28] p erdös, a rényi, on random graphs, publ. math-debrecen 6, 290, (1959). [29] s milgram, the small world problem, psychol. today 2, 60 (1967). [30] d j watts, s h strogatz collective dynamics of 'small-world' networks, nature 393, 409 (1998). [31] m e j newman, d j watts, renormalization group analysis of the small-world network model, physics letters a 263, 341 (1999). [32] m e j newman, networks: an introduction, oxford university press, new york (2010). [33] m e j newman, s h strogatz, d j watts. random graphs with arbitrary degree distributions and their applications, phys. rev. e 64, 026118 (2001). [34] a l barabási, r albert, emergence of scaling in random networks, science 286, 509 (1999). [35] p l krapivsky, g j rodgers, s redner, degree distributions of growing networks, phys. rev. lett. 86, 5401 (2001). [36] k klemm, v m eguíluz, highly clustered scalefree networks, phys. rev. e 65, 036123 (2002). [37] p holme, b j kim, growing scale-free networks with tunable clustering, phys. rev. 
e 65, 026107 (2002). [38] r xulvi-brunet, i m sokolov, changing correlations in networks: assortativity and dissortativity, acta phys. pol. b 36, 1431 (2005). [39] t e harris, contact interactions on a lattice, ann. probab. 2, 969 (1974). [40] s wolfram, statistical mechanics of cellular automata, rev. mod. phys. 55, 601 (1983). [41] p grassberger, on the critical behavior of the general epidemic process and dynamical percolation, math. biosci. 63, 157 (1983). [42] m a fuentes, m n kuperman, cellular automata and epidemiological models with spatial dependence, physica a 267, 471 (1999). [43] p bak, k chen, c tang, a forest-�re model and some thoughts on turbulence, phys. lett. a 147, 297 (1990). [44] c j rhodes, r m anderson, epidemic thresholds and vaccination in a lattice model of disease spread, theor. popul. biol. 52, 101 (1997). [45] c j rhodes, h j jensen, r m anderson, on the critical behaviour of simple epidemics, proc. r. soc. b 264, 1639 (1997). [46] o diekmann, j a p heesterbeek, j a j metz, a deterministic epidemic model taking account of repeated contacts between the same individuals, j. appl. prob. 35, 462 (1998). [47] a barbour, d mollison, epidemics and random graphs in: stochastic processes in epidemic theory, eds. j p gabriel, c lefèvre, p picard, pag. 86, springer, new york (1990). 050003-16 papers in physics, vol. 5, art. 050003 (2013) / m. n. kuperman [48] m e j newman, spread of epidemic disease on networks, phys rev. e 66, 016128 (2002). [49] d mollison, spatial contact models for ecological and epidemic spread, j. roy. stat. soc. 39, 283 (1977). [50] b t grenfell, o n bjornstad, j kappey, travelling waves and spatial hierarchies in measles epidemics, nature 414, 716 (2001). [51] c moore, m e j newman, epidemics and percolation in small-world networks, phys. rev. e 61, 5678 (2000). [52] m n kuperman, g abramson, small world effect in an epidemiological model, phys. rev. lett. 86, 2909 (2001). [53] h w hethcote, j a yorke, gonorrhea transmission dynamics and control, springer lecture notes in biomathematics, springer, berlin (1984). [54] r pastor-satorras, a vespignani epidemic spreading in scale-free networks, phys. rev. lett. 86, 3200 (2001). [55] a l lloyd, r m may, how viruses spread among computers and people, science 292, 1316 (2001). [56] f liljeros, c r edling, l a n amaral, h e stanley, y aberg, the web of human sexual contacts, nature 411, 907 (2001). [57] f liljeros, c r edling, l a n amaral, sexual networks: implications for the transmission of sexually transmitted infections, microbes infect. 5, 189 (2003). [58] t gross, c j d d'lima, b blasius, epidemic dynamics on an adaptive network, phys. rev. lett. 96, 208701 (2006). [59] l b shaw, i b schwartz, fluctuating epidemics on adaptive networks, phys. rev. e 77, 066101 (2008). [60] d h zanette, s risau-gusmán, infection spreading in a population with evolving contacts, j. biol. phys. 34, 135 (2008) [61] s risau-gusmán, d h zanette, contact switching as a control strategy for epidemic outbreaks, j. theor. biol. 257, 52 (2009). [62] m c bonnet, a dutta, world wide experience with inactivated poliovirus vaccine, vaccine 26, 4978 (2008). [63] d h zanette, m kuperman, e�ects of immunization in small-world epidemics, physica a 309, 445 (2002). [64] r albert, h jeong, a l barabási, error and attack tolerance of complex networks, nature 406, 378 (2000). [65] d s callaway, m e j newman, s h strogatz, d j watts, network robustness and fragility: percolation on random graphs, phys. rev. lett. 85, 5468 (2000). 
[66] r m may, a l lloyd, infection dynamics on scale-free networks, phys. rev. e 64, 066112 (2001). [67] r pastor-satorras, a vespignani, immunization of complex networks, phys. rev. e 65, 036104 (2002). [68] n madar, t kalisky, r cohen, d ben-avraham, s havlin, immunization and epidemic dynamics in complex networks, eur. phys. j. b 38, 269 (2004). papers in physics, vol. 9, art. 090009 (2017) received: 27 june 2017, accepted: 9 october 2017 edited by: d. restrepo licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.090009 www.papersinphysics.org issn 1852-4249 an improvement to the measurement of dalitz plot parameters s. ghosh,1∗ a. roy1† ∗e-mail: phd12115113@iiti.ac.in †e-mail: ankhi@iiti.ac.in 1 discipline of physics, school of basic sciences, indian institute of technology indore, khandwa road, simrol, mp-453552, india. precise measurement of the dalitz plot parameters of the η′ → η π+ π− decay gives better insight into the dynamics of this heavy pseudo-scalar meson. various measurements of the dalitz plot parameters of the η′ meson have been performed with the detection of all the final state particles, including the neutral decay η → 2γ. in many experiments, reconstruction of the η meson from the neutral decay modes comes with the disadvantage of poor resolution and low efficiency compared to that of the charged particles. in this article, a study of the dalitz plot parameters keeping the η meson as a missing particle is presented. the method is found to be advantageous in the case of poor photon resolution. the effect of the charged particle resolution on the dalitz variable y is also examined. this work may provide guidance to select a suitable method for the dalitz plot analysis, depending on the detector resolution. i. introduction low energy quantum chromodynamics (qcd) is effectively studied in the framework of chiral perturbation theory (chpt). the lagrangians for the hadronic processes are derived from chpt, and the effective degrees of freedom are, usually, the octet of pseudo-scalar mesons (π, k, η) [1]. the observed mass of the η′ meson is much higher than that of the other pseudo-scalar mesons, which is attributed to the axial u(1) anomaly [2, 3]. this suggests the possible existence of many unsolved problems that could provide inputs to the lagrangian. the study of the production and decay of the η′ meson is, therefore, important to both theory and experiment [4]. some possible dynamics that could be studied in the η′ decay are the new symmetry and the symmetry breaking during interactions [5], gluonic contributions, gauge field configurations with non-zero winding number, and the quark instantons [6, 7]. this has motivated many groups to study both charged and neutral decays of the η′ meson [1, 8]. the dalitz plot plays an important role as a useful tool to study the decay dynamics of a meson decaying into three bodies. as the three-body decay has two degrees of freedom, one can define two linearly independent variables to represent the decay in the phase space. in this article, we shall study the decay η′ → η π+ π−, and for that we define the dalitz plot variables [9] as x = √3 (t_π+ − t_π−)/q and y = [(m_η + 2m_π)/m_π] (t_η/q) − 1. (1) here t and m are, respectively, the kinetic energy (in the rest frame of the η′) and the mass of the particles indicated by the subscripts. then q = t_η + t_π+ + t_π− = m_η′ − m_η − 2m_π is the available energy of the reaction.
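as a small worked example of the definitions in eq. (1), the sketch below evaluates x, y and the available energy q from the kinetic energies of the final-state particles in the η′ rest frame; the masses are approximate pdg values quoted only for illustration, and the function name is an assumption of this sketch, not part of the original analysis.

```python
import numpy as np

# approximate pdg masses in gev, quoted here only for illustration
M_ETAP, M_ETA, M_PI = 0.95778, 0.54786, 0.13957

# available energy of the reaction: q = t_eta + t_pi+ + t_pi- = m_eta' - m_eta - 2 m_pi
Q = M_ETAP - M_ETA - 2.0 * M_PI

def dalitz_xy(t_pip, t_pim, t_eta):
    """dalitz variables of eq. (1) from the kinetic energies (eta' rest frame)."""
    x = np.sqrt(3.0) * (t_pip - t_pim) / Q
    y = (M_ETA + 2.0 * M_PI) / M_PI * (t_eta / Q) - 1.0
    return x, y

# sanity check: an event that shares q equally among the three particles has x = 0
print(dalitz_xy(Q / 3, Q / 3, Q / 3))
```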
the decay dynamics of the η′ mesons is studied in the form of dalitz plot parameters which are obtained by fitting the dalitz plot with the following general parameterization: f(x,y ) = a ( 1 + ay + by 2 + cx + dx2 ) . (2) here, a, b, c, and d are the dalitz plot parameters of the decay and a is the overall normalization constant. the measurement of dalitz plot parameters are specifically important to understand and crosscheck the correct inputs to the theoretical distribution of the lagrangian [10]. it is, therefore, important to measure the dalitz plot parameters precisely with good statistics and resolution. the dalitz plot parameters of the η′ → η π+ π− decay have been measured in the experiments of the ves [11] and the besiii [12] collaborations. gams-4π [13] and ihep-iisn-lanl-lapp collaborations [7] have reported the dalitz plot parameters for the neutral decay mode (η π0 π0) of the η′ meson. the values of the dalitz plot parameters for both of these decay modes of the η′ meson should be the same under the isospin limit. however, this is not observed in the above experiments. this discrepancy may be reconciled by precise measurements of the small dalitz plot parameters with high statistics. the previous experiments have identified the final state particles (π+, π− and γ) of the η′ → (η) π+π− → (2γ) π+π− decay and used the invariant mass method for the dalitz plot analysis. if the photon resolution is poor, the reconstruction efficiency of the η meson will be significantly low and will also affect the dalitz variable y and the parameters related to this variable. here, we will be reporting a method that improves the dalitz plot parameters in the measurements of the η′ decay modes and found particularly advantageous in the case of poor photon resolution. we shall consider the production of η′ meson through photo-production reaction first and then introduce the detector resolution to all the final state particles π+, π−, p and γ of the channel γ + p → η′p → (η) π+π−p → (2γ) π+π− p to simulate the detector environment. the dalitz plot parameters are then calculated using the following methods: (a) exclusive measurement or invariant mass method and (b) missing measurement method. the exclusive measurement method is commonly used, where all the final state particles (π+, π− and γ) are considered and η is reconstructed as the invariant mass (η → 2γ). in the missing measurement method, we have reconstructed the η′ meson as a missing particle using the information of the recoiled proton, the beam photon and the target proton (γbeam + ptarget → η′ precoil). the η ′ information along with π+ and π− is used to reconstruct the η meson as a missing particle. the calculations of the dalitz plot parameters using both these methods are described and compared. in addition to that, a bin width study is also performed with missing analysis to optimize the bin size for the extraction of the dalitz plot parameters. in the missing measurement method, a dependence of the dalitz variable y on the detector resolutions of the charged particles has also been examined. further, to study the effect of the background, method (b) was subjected to a combinatoric background channel and the dalitz plot parameters are then systematically studied with varying background component in the mixture of the signal and the background. the background is then subtracted from every dalitz plot bin and the parameters are reported. ii. 
model to fold the detector resolutions the pluto simulation framework (version 5.42) developed by the hades collaboration was used to generate hadronic physics reactions for this analysis [14]. dalitz plot parameters of the η′ → η π+ π− decay from the besiii [12] experiment were used as input parameters for the generated events. a total of 105 γ + p → η′ p events, each with a photon beam energy of 2.5 gev were generated in the phase space model for this analysis. however, the energy of the η′ meson is arbitrary as the dalitz plot parameters are independent of the initial energy of the η′ meson. the detector effects are absent in the events generated by the pluto event generator. however, in reality, the momentum of the particle passing through the detector is modified according to the detector resolution. generally, the detector response is distributed with a resolution which is dependent on the magnitude of the particle’s momentum. to incorporate the detector response, each component of the momentum 090009-2 papers in physics, vol. 9, art. 090009 (2017) / s. ghosh et al. figure 1: (a) number of events as a function of the momentum, where the solid histogram is for the events with unit momentum before folding and the dashed histogram is for the events after the folding with a gaussian distribution whose σ is 3% of the mean. (b) number of generated vs. reconstructed events in the dalitz variable bins x. (c) number of generated vs. reconstructed events in the dalitz variable bins y. (d) the dalitz variable x vs. y showing the bins completely inside the dalitz plot boundary. of the particles are convoluted with a random number sampled from a gaussian distribution of mean 1 and σ = 3% of the mean, which introduces the momentum dependent resolution to the particle. this effect transforms particles with a single momentum into a particle with a momentum distribution as shown in fig. 1(a) [15]. since we are studying a case where the photon resolution is poor compared to the charged particle resolution, the final state charged particles (π+, π− and p) are folded with 1% resolution [16, 17] in the momentum components to simulate a detectorlike effect, whereas 6% resolution is used for the photons. there are experiments that suffer from poor photon resolution for which 6% is rather an underestimation [18]. dalitz plot variables x and y, after the addition of resolution to the final state particles in momentum components, are compared with the generated dalitz plot variables as shown in fig. 1(b) and (c). it is observed in fig. 1(c) that at higher values of y, the bin migration is more significant. the reason for this is that a higher value of y corresponds to a high energy η meson, which decays to energetic photons with poor resolution. this leads to a enhanced bin migration at higher value of y. folding has the effect of migrating the events from one bin to the neighboring bins of the dalitz plot. it distributes the events in the bins away from the diagonal, creating a homogeneous migration of the events and thus allows to calculate the parameters from a properly binned dalitz plot. both dalitz plot variables, shown in fig. 1, are divided into 18 bins from −1.5 to 1.5. though the migration is observed for the individual dalitz variables shown in fig 1(b) and (c), the maximum number 090009-3 papers in physics, vol. 9, art. 090009 (2017) / s. ghosh et al. figure 2: (a) number of events vs. 
difference of the reconstructed (after folding) and the true value of the dalitz variable x and (b) the dalitz variable y for the exclusive measurement method. (c) number of events vs. difference of the reconstructed and the true value of dalitz variable x and (d) the dalitz variable y for the missing measurement method (b). of events, however, lie on the diagonal and confined well within the resolution of the bins. the general parameterization of eq. (2) was used to parametrize the decay. a least square fitting procedure minuit [19] was used to minimize the χ2 given in eq. (3) to fit each bin of the dalitz plot. χ2 = ∑(nj −f(xj,yj) ∆nj )2 . (3) where nj and ∆nj denotes the number of events and their statistical uncertainties for the dalitz plot bin j (j = 1, 2, ...,n). xj and yj are the central coordinates of each bin, and f(xj,yj) denotes the fitted form of the polynomial. the bins on the boundary of the dalitz plot are removed and only bins which are completely inside the boundary are considered as shown in fig. 1(d). i. method (a): exclusive measurement in this method, measurement of the dalitz plot parameters are performed with all the detected final state particles π+, π− and γ. to calculate the resolution of dalitz variables x and y, first a difference of the reconstructed (after folding) and the true value of the dalitz variable is considered. then, the standard deviation of x and y for a large number of events is calculated using σ = √√√√(∑(x−µ)2 n ) , (4) where x represents each value of the dalitz variables and µ is the mean, which should be zero in this case, and n is the total number of events. as seen from fig. 2(a) and (b), the resolutions of the dalitz variables x and y are respectively 0.019 and 0.099, after the folding in the momentum compo090009-4 papers in physics, vol. 9, art. 090009 (2017) / s. ghosh et al. table 1: dalitz plot parameters with varying number of bins in the dalitz variables x and y for methods (a) and (b). method (a) method (b) par 17 × 17 18 × 18 19 × 19 17 × 17 18 × 18 19 × 19 a −0.067 ±0.008 −0.062 ±0.008 -0.066 ± 0.008 -0.045 ± 0.008 -0.044 ± 0.008 -0.044 ± 0.008 b −0.134 ±0.015 −0.142 ±0.015 -0.145 ± 0.014 -0.058 ± 0.016 -0.063 ± 0.015 -0.081 ± 0.015 c 0.014 ± 0.005 0.014 ± 0.005 0.009 ± 0.005 0.014 ± 0.006 0.014 ± 0.005 0.009 ± 0.006 d −0.098 ±0.009 −0.096 ±0.009 -0.093 ± 0.009 -0.085 ± 0.010 -0.081 ± 0.009 -0.092 ± 0.009 χ2 ndf 1.25 1.04 1.09 1.08 1.14 0.95 figure 3: comparison of the dalitz plot parameters in an exclusive measurement method with varying photon resolution and missing measurement method along with the generated besiii parameters. nents. the resolution of the variables x and y are independent in this method as the former was calculated from π+ and π− but the later was calculated from the photons and, therefore, causing the resolution to be poor. ii. method (b): missing measurement in the missing measurement method, the dalitz variables x and y are calculated without using the information of the final photons from the decay of the η meson. only π+ and π− are folded with the resolutions and the produced η′ information is used to calculate the dalitz variables x and y. the calculated resolutions of the dalitz variables x and y are both 0.016 and are shown in fig. 2(c) and (d). in contrast to the exclusive measurement method, in this method, the resolutions of x and y are same. figure 4: variation of the y resolution with the charged particle resolution. 
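the folding procedure and the resolution definition of eq. (4) can be summarized in a short sketch; the gaussian smearing of the momentum components with a relative σ of 1% (charged tracks) or 3% to 6% (photons) follows the description in the text, while the function names and the use of numpy are illustrative assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def fold_momentum(p, resolution):
    """smear every momentum component with a gaussian of mean 1 and sigma equal
    to the quoted relative resolution (e.g. 0.01 for charged tracks, 0.03 or
    0.06 for photons), mimicking the folding described in the text."""
    p = np.asarray(p, dtype=float)
    return p * rng.normal(loc=1.0, scale=resolution, size=p.shape)

def dalitz_variable_resolution(true_vals, reco_vals):
    """resolution of a dalitz variable, eq. (4): the standard deviation of the
    event-by-event difference (reconstructed - true)."""
    d = np.asarray(reco_vals) - np.asarray(true_vals)
    return np.sqrt(np.mean((d - d.mean()) ** 2))
```

as noted above, in the missing measurement method the resolutions of x and y come out equal because both variables are built from the same folded particles.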
this is due to the fact that both x and y use the information of the same particles (π+ and π−) with added resolution [20]. iii. comparison of the method (a) and (b) to compare the methods described above, we have extracted the dalitz parameters by fitting the dalitz plot with the parameterization defined in eq. (2). all the bins are fitted to the general parameterization with the χ2 minimization. to find an optimized binning, the dalitz plot parameters are calculated using the method (a) and (b) for varying number of bins in x and y axis as shown in table 1. it is found that the best χ2/ndf is obtained for 105 events with a binning of 18 × 18 for both methods. it can be seen that a systematics arises due to varying bin size on the parameters a and b in the invariant mass method. in both the methods a binning of 18 × 18 is thus used to calculate the dalitz plot parameters, which is higher 090009-5 papers in physics, vol. 9, art. 090009 (2017) / s. ghosh et al. than the resolution of the dalitz variables x and y. a comparison of the parameters obtained using the methods (a), with different photon resolution, and (b), with the generated besiii parameters, is shown in fig. 3 along with their errors and χ2/ndf. it can be seen from the figure that with poorer photon resolution the parameters a and b systematically deviate from the generated value. though the values of the parameters c and d are consistent as shown in table 1, it is observed that due to poor photon resolution, deviation of the a and b parameters from the input in the exclusive measurement method is significant compared to the missing 2γ measurement method. even the deviation of parameter b is higher than that of a, as b is the coefficient of a quadratic y. the poor photon resolution directly affects the η momentum resolution, which is reflected in the poor y resolution. the missing measurement method is, therefore, preferable for the dalitz plot analysis in this case of poor photon resolution. since in the missing measurement method the dalitz variable y depends on the charge particle resolution, we have carried out another study to examine the variation of the y resolution with the charged particle resolution. we found that y varies linearly with the charged particle resolution as shown in fig. 4. it can be concluded that if the charge particle resolution is equally poor as the photon resolution, the missing analysis method is still a better choice compared to the exclusive measurement method. in the present experimental condition of 1% charged particle resolution and 6% neutral particle resolution, the exclusive measurement method introduces a poorer y resolution compared to the missing measurement method. iii. background study the missing measurement method comes with a cost of additional combinatoric background for the channel η′ → η π+ π−. this combinatoric background arises from π+ and π− produced in the decays η′ → η π+ π− and η → π+ π− π0, which leaves the π+ and π− in both decays with a similar available energy and hence similar kinematics. we consider a background η′ → η π0 π0 generated with the same dalitz plot parameters because of isospin symmetry [1] as η′ → η π+ π− and η is further figure 5: the number of events vs. the missing mass mx(η ′, π+π−) from the signal and background channel. decayed into η → π+ π− π0. we mixed the signal channel with 5%, 10% and 15% background and then folded it to add resolution. the dalitz plot parameters from these three set are given in table 2. 
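the parameters quoted in tables 1 to 3 are obtained from a bin-by-bin χ2 fit of the general parameterization, eqs. (2) and (3). a minimal sketch of such a fit is given below; it uses scipy's nelder-mead minimizer as a stand-in for the minuit minimizer mentioned in the text, and assumes that the bin centres, counts and uncertainties of the bins lying fully inside the dalitz boundary are already available as arrays.

```python
import numpy as np
from scipy.optimize import minimize

def dalitz_model(x, y, A, a, b, c, d):
    # general parameterization of eq. (2): A (1 + a y + b y^2 + c x + d x^2)
    return A * (1.0 + a * y + b * y**2 + c * x + d * x**2)

def fit_dalitz(xc, yc, n, dn):
    """bin-by-bin chi-square fit of eq. (3). xc, yc: bin-centre coordinates of
    the bins fully inside the dalitz boundary; n, dn: counts and uncertainties."""
    def chi2(pars):
        res = (n - dalitz_model(xc, yc, *pars)) / dn
        return np.sum(res**2)
    start = [float(np.mean(n)), 0.0, 0.0, 0.0, 0.0]      # A, a, b, c, d
    best = minimize(chi2, start, method="Nelder-Mead")
    ndf = len(n) - len(start)
    return best.x, chi2(best.x) / ndf                    # parameters, chi2/ndf
```

the number of fitted bins minus the five free parameters gives the ndf entering the χ2/ndf values listed in the tables.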
the missing mass mx(η ′, π+π−) in fig. 5 clearly shows the contribution of the background channel in the η meson mass region due to misidentification. the dalitz plot parameters a and b related to the dalitz variable y are systematically shifted from their generated values due to the presence of this background. meanwhile, the dalitz parameters c and d do not change since π+ and π− from both channels have similar kinematics, as shown in fig. 6. however, this combinatoric background can be subtracted by determining the number of background events in each bin of the dalitz plot. the contribution of background in each bin is calculated from the generated monte-carlo samples before implementing the resolution to the particles, which introduces a systematics from the migration of background events to the nearby bins in the dalitz plot. the general parameterization of eq. (2) is fitted to the background subtracted dalitz plot and the parameters are given in table 3. the systematics from the migration of background events is small in the missing measurement method, and the extracted parameters after background subtraction show that method (b) is a better choice than the exclusive measurement [method (a)] when charge particle resolution is 1% and photon resolution is 6%. 090009-6 papers in physics, vol. 9, art. 090009 (2017) / s. ghosh et al. table 2: comparison of the dalitz plot parameters extracted from method (b) with different background contributions. input pars method (b) (no bkg) method (b) (5% bkg) method (b) (10% bkg) method (b) (15% bkg) a (−0.047) -0.044 ± 0.008 −0.074 ± 0.008 −0.087 ± 0.008 −0.125 ± 0.008 b (−0.069) -0.063 ± 0.015 −0.078 ± 0.015 −0.130 ± 0.015 −0.119 ± 0.015 c ( 0.019) 0.014 ± 0.005 0.016 ± 0.005 0.020 ± 0.005 0.015 ± 0.005 d (−0.073) -0.081 ± 0.009 −0.070 ± 0.009 −0.064 ± 0.009 −0.062 ± 0.009 χ2 ndf 1.14 0.86 0.94 1.16 table 3: comparison of the dalitz plot parameters extracted from method (a) and the background subtracted dalitz plot obtained from method (b). input pars method (a) method (b) (5% bkg) method (b) (10% bkg) method (b) (15% bkg) a (−0.047) −0.062 ± 0.008 −0.045 ± 0.008 −0.045 ± 0.008 −0.042 ± 0.008 b (−0.069) −0.142 ± 0.015 −0.066 ± 0.016 −0.086 ± 0.016 −0.084 ± 0.017 c (+0.019) 0.014 ± 0.005 +0.018 ± 0.006 +0.022 ± 0.006 +0.019 ± 0.006 d (−0.073) −0.096 ± 0.009 −0.073 ± 0.010 −0.060 ± 0.010 −0.076 ± 0.010 χ2 ndf 1.04 0.93 1.15 1.02 figure 6: comparison of the dalitz plot parameters from generated and missing mass method without any background and with the background added in a proportion of 5%, 10% and 15% of the total number of events. iv. conclusions in this article, we have described two methods for measuring the dalitz variables x and y in the context of the decay η′ → η (2γ) π+ π−. the exclusive measurement method and the invariant mass method, though commonly used for measuring the dalitz variables, are shown here to be unsuitable in the case of poor photon resolution. in contrast, the missing (2γ) analysis method is found to be more suitable when the photon resolution is poor. the resolution in y versus charge particle resolution, in the missing analysis method, suggests that, even in the case of equally poor charged particle resolution and photon resolution the missing analysis method is a better choice compared to the exclusive measurement method. 
the missing analysis method, however, comes with the disadvantage of the combinatoric background and the dalitz plot parameters deviate systematically from their central values depending on the fraction of background present. once the background subtraction is implemented, the missing analysis method performs even better and becomes more attractive. in conclusion, this work provides guidance to select a suitable method for the extraction of the dalitz plot parameters depending on the detector resolution as well as the background present in the signal region and may be extended to the three body decay channels of other mesons when the detector resolution is poor. [1] b borasoy, r nißler, hadronic η and η′ decays, eur. phys. j. a 26, 383 (2005). 090009-7 papers in physics, vol. 9, art. 090009 (2017) / s. ghosh et al. [2] k naito, m oka, m takizawa, t umekawa, ua(1) breaking effects on the light scalar meson spectrum, prog. theor. phys. 109, 969 (2003). [3] s weinberg, the u(1) problem, phys. rev. d 11, 3583 (1975). [4] j bijnens, η and η′ physics, proceedings of the 11th international conference on mesonnucleon physics and the structure of the nucleon, econf c 070910, 104 (2007). [5] n beisert, b borasoy, the η′ → ηππ decay in u(3) chiral perturbation theory, nuc. phys. a 705, 433 (2002). [6] s d bass, gluonic effects in ηand η′-nucleon and nucleus interactions, acta phys. slovaca 56, 245 (2006). [7] d alde et al., matrix element of the η′(958) → ηπ0π0 decay, phys. lett. b 177, 115 (1986). [8] a kupsc, decays of η and η′ mesons: an introduction, int. j. mod. phys. e 18, 1255 (2009). [9] a h fariborz, j schechter, η′ → ηππ decay as a probe of a possible lowest lying scalar nonet, phys. rev. d 60, 034002 (1999). [10] r escribano, p masjuan, j j sans-cillero, chiral dynamics predictions for η′ → ηππ, j. high energy phys. 2011, 094 (2011). [11] v dorofeev et al., study of η′ → ηπ+π− dalitz plot, phys. lett. b 651, 22 (2007). [12] m ablikim et al., measurement of the matrix element for the decay η′ → ηπ+π−, phys. rev. d 83, 012003 (2011). [13] a m blik et al., measurement of the matrix element for the decay η′ → ηπ0π0 with the gams-4π spectrometer, phys. atom. nucl.+ 72, 231 (2009). [14] i fröhlich et al., pluto: a monte carlo simulation tool for hadronic physics, pos(acat2007) 076, (2007). [15] p garg, d k mishra, p k netrakanti, a k mohanty, b mohanty, unfolding of event-byevent net-charge distributions in heavy-ion collisions, j. phys. g: nucl. part. phys. 40, 055103 (2013). [16] m ullrich, w khn, y liang, b spruck, m werner, simulation of the besiii endcap time of flight upgrade, nucl. instrum. meth. a 769, 32 (2015). [17] m battaglieri, r de vita, v kubarovsky, pentaquark at jlab: the g11 experiment in clas, aip conf. proc. 792, 742 (2005). [18] m amarian et al., the clas forward electromagnetic calorimeter, nucl. instrum. meth. a 460, 239 (2001). [19] r brun, f rademakers, root: an object oriented data analysis framework, nucl. instrum. meth. a 389, 81 (1997). [20] s ghosh, dalitz plot analysis of η′ → ηπ+π−, aip conf. proc. 1735, 030018 (2016). 090009-8 papers in physics, vol. 2, art. 020011 (2010) received: 13 july 2010, accepted: 10 january 2011 edited by: v. lakshminarayanan licence: creative commons attribution 3.0 doi: 10.4279/pip.020011 www.papersinphysics.org issn 1852-4249 commentary on “experimental determination of distance and orientation of metallic nanodimers by polarization dependent plasmon coupling” sukhdev roy1∗ the paper by h. e. grecco and o. e. 
martinez [1] describes an experimental technique to determine (a) sub-diffraction distances based on nearfield coupling of metallic nanoparticles, and (b) relative orientations due to symmetry breaking in the scattering cross-section. the novelty is in sequential illumination by two wavelengths to separate out the background from scattering from the nanoparticles, and in the illumination scheme to facilitate rotation of the polarization. the experimental results are shown to be in good agreement with theoretical predictions made earlier by the authors. the authors had earlier theoretically shown that the interparticle separation dependent polarization anisotropy of discrete nanoparticle dimers enables nanoscale distance measurements [2]. their theoretical approach has also been recently experimentally implemented to simultaneously measure distance and orientation changes in discrete dimers of dna linked nanoparticles [3]. in this commentary, i briefly discuss some points to provide a better perspective of the contribution and make suggestions to improve the presentation. these suggestions have been taken on board by the authors and the published version of the paper is substantially improved. noble metal nanoparticles have been intensively ∗e-mail: sukhdevroy@dei.ac.in 1 department of physics and computer science, dayalbagh educational institute, dayalbagh, agra 282110, india studied, both theoretically and experimentally, for their unique optical properties. for a review of applications in biosystems, readers can refer for example to ref. [4]. the excitation of the localized surface plasmon (lsp) resonance by incident electromagnetic radiation is responsible for a very large field enhancement at their surface. this singular and spatially localized optical response has proven to be of main interest for specific applications such as high sensitivity detection and spectroscopy of molecules suitably attached to the nanoparticle surface (fluorescence, raman scattering and biochemical sensing). for a single nanoparticle, the spectral features of the lsp are known to closely depend on its size, shape and dielectric environment. more recent studies have shown that interparticle coupling effects can be used for tailoring lsp resonances with even more flexibility. this has been clearly demonstrated for nanoparticle pairs which are the most basic systems of such interacting objects. gold nanoparticles have stimulated tremendous research interest due to their unique optical properties. it is well established that the optical response of an individual gold nanoparticle can not only be varied by changing the dielectric environment, but more dramatically by changing the nanoparticle geometry itself. this has prompted extensive interest in the synthesis and characterization of the optical response of a wide variety of gold nanoparticle structures such as shells [5], rods [6,7], stars [8] and dumbbells [9], with the prime objective of generating a controlled plasmon resonance at a desired 020011-1 papers in physics, vol. 2, art. 020011 (2010) / s. roy wavelength. dimers [1–3], chains [10] and other arrays of nanoparticles are alternative methods of controlling the position or shape of the plasmon resonance. the optical properties of these arrays can be analyzed approximately in two regimes: far-field where the interparticle gap is large and the near field where the gap is sufficiently small so that the near fields of the particles are coupled. 
in the nearfield regime the plasmon resonance can be tuned deep into the infrared by decreasing the interparticle gap, resulting in a very strong enhancement of the electric field between the particles. the authors address the important problem of experimental determination of distance and orientation of gold nanoparticles by measuring scattering by illumination with polarized light. they show that scattering polarization microscopy of coupled nanoparticles (with radius in the range of 4-20 nm) can provide an alternative method to fluorescence resonance energy transfer (fret) and standard super-resolution techniques. the technique helps in filling the gap for distance and orientation measurements by these techniques, as in this range, the particles are closer than the resolution limit of the microscope and will appear as a single spot on the detector. the distance in which the technique is sensitive scales with the radii of the particles. the authors claim that the proposed technique can be an alternative to fluorescence based techniques when photostability, frame rate or coupling range are insufficient. it is interesting to note that the use of two-color imaging can provide an efficient, faster and reliable way to detect scattering centers that have plasmon resonances. the authors should highlight the importance of the proposed contribution in the introduction section. it should include the importance of (i) spherical metal nanoparticles and plasmon resonances, which are an indispensable tool for examining optical near-fields, imaging and sensing and (ii) distance and orientation measurements, which are very important to understand many biological systems, such as molecular structural dynamics and conformational transitions in proteins. gold nanoparticles, in addition to their enhanced absorption and scattering, and usefulness as contrast agents in cellular and biological imaging, offer good biocompatibility, facile synthesis and conjugation to a variety of biomolecular ligands, antibodies and other targeting moieties, making them suitable for use in biochemical sensing and detection, medical diagnostics and therapeutic applications. since the proposed technique is based on scattering polarization microscopy, it would be in the interest of readers to also mention scanning particle enhanced raman microscopy. since the spectral properties of the overall plasmon resonance of two coupled spherical metal nanoparticles are subject to their material, shape, size, orientation, distance and surrounding medium, modulation of the spectral position and the spectral line width can be used to estimate the distance between two coupled metallic nanoparticles. important related references, such as, reinhard et al., nano lett. 5, 2246 (2005) and olk et al., nano lett. 8, 1174 (2008) can be included to make the paper more comprehensive. the details of the experimental setup should include parameters such as the powers or intensities of the laser beams used, the spot size of the beams at the object plane, accuracy in the angular measurements, beam splitting ratio, bandwidth of the filters used, etc. to facilitate better understanding of the experiment. it should be clarified whether black and white or color recordings were made. a color ccd camera can improve the identification of particles by efficient measurement of scattered intensities and their color. 
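the distance estimate alluded to above, inverting the measured shift of the coupled plasmon resonance to obtain the interparticle gap, can be sketched with the empirical exponential "plasmon ruler" form reported in the literature, Δλ/λ0 ≈ a exp[−(s/d)/τ]. the constants a and τ in the sketch below are placeholders that would have to be taken from a measured or simulated calibration curve for the particular particle size and medium; they are not values from the commented paper.

```python
import numpy as np

def gap_from_shift(delta_lambda, lambda0, diameter, a=0.2, tau=0.2):
    """invert an exponential plasmon-ruler calibration,
    delta_lambda/lambda0 = a * exp(-(s/d)/tau), to estimate the
    surface-to-surface gap s between two coupled particles of diameter d.
    a and tau are placeholder calibration constants, to be replaced by values
    from a measured or simulated curve for the actual particles and medium."""
    frac = delta_lambda / lambda0
    return -tau * diameter * np.log(frac / a)

# made-up numbers: a 27 nm red-shift of a 540 nm resonance for 20 nm particles
print(gap_from_shift(delta_lambda=27.0, lambda0=540.0, diameter=20.0))
```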
a discussion on the photostability of the proposed technique, which has been mentioned as the main advantage over fluorescence and super-resolution based techniques in the results and discussion section, would provide justification and highlight the importance of scattering polarization microscopy. since it is an experimental study, a discussion on the various factors that can lead to uncertainty in the distance and orientation measurement, such as non-uniformity of the particles (both size and shape), noise, accuracy of the measurement of angles etc., is required to clearly assess the limitations and help in future efforts to improve the technique. since relative error depends on distance, it would be good to mention where the best spatial resolution for inter-particle separation occurs for 20 nm au plasmon ruler. the efficient use of two color imaging to detect scattering centers demonstrated in the illumination setup by grecco and mart́ınez in combination with other techniques, provides a robust anisotropy based microscopic tool for further studies. 020011-2 papers in physics, vol. 2, art. 020011 (2010) / s. roy [1] h e grecco, o e mart́ınez, experimental determination of distance and orientation of metallic nanodimers by polarization dependent plasmon coupling, pap. phys. 2, 020010 (2010). [2] h e grecco, o e mart́ınez, distance and orientation measurement in the nanometric scale based on polarization anisotropy of metallic dimers, opt. exp. 14, 8716 (2006). [3] h wang, b m reinhard, monitoring simultaneous distance and orientation changes in discrete dimers of dna linked gold nanoparticles, j. phys. chem. c 113, 11215 (2009). [4] p k jain, x huang, i h el-sayed, m a elsayed, review of some interesting surface plasmon resonance-enhanced properties of noble metal nanoparticles and their applications to biosystems, plasmonics 2, 107 (2007). [5] s j oldenburg, r d averitt, s l westcott, n j halas, nanoengineering of optical resonances, chem. phys. lett. 288, 243 (1998). [6] c j murphy, t k sau, a m gole, c j orendorff, j gao, l gou, s e hunyadi, t li, anisotropic metal nanoparticles: synthesis, assembly, and optical applications, j. phys. chem. b 109, 13857 (2005). [7] l s slaughter, y wu, b a willingham, p nordlander, s link, effects of asymmetry breaking and conductive contact on the plasmon coupling in gold nanorod dimers, acs nano 4, 4657 (2010). [8] c l nehl, h liao, j h hafner, optical properties of star-shaped gold nanoparticles, nano lett. 6, 683 (2006). [9] d k lim, k s jeon, h m kim, j m nam, y d suh, nanogap-engineerable raman-active nanodumbbells for single-molecule detection, nat. mater. 9, 60 (2010). [10] n harris, m d arnold, m g blaber, m j ford, plasmonic resonances of closely coupled gold nanosphere chains, j. phys. chem. c 113, 2784 (2009). 020011-3 papers in physics, vol. 11, art. 110002 (2019) received: 28 october 2018, accepted: 29 april 2019 edited by: a. goñi, a. cantarero, j. s. reparaz licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.110002 www.papersinphysics.org issn 1852-4249 nuclear magnetic resonance on lafeaso0.4h0.6 at 3.7 gpa n. fujiwara,1∗ m. takeuchi,1 t. kuwayama,1 s. nakagawa,1 s. iimura,2 s. matsuishi,3 h. hosono2,3 a prototypical electron-doped iron-based superconductor lafeaso1−xhx undergoes an antiferromagnetic (af) phase for x ≥ 0.49. 
we have performed nuclear magnetic resonance (nmr) measurements on lafeaso0.4h0.6 at 3.7 gpa to investigate the magnetic properties in the vicinity of a pressure-induced quantum critical point (qcp). the linewidth of 1h-nmr spectra broadens at low temperatures below 30 k, suggesting that the spin moments remain ordered at 3.7 gpa. the coexistence of gapped and gapless spin excitations was confirmed in the ordered state from the relaxation time t1 of 75as. the pressure-induced qcp is estimated to be 4.1 gpa from the pressure dependence of the gapped excitation. i. introduction a prototypical electron-doped iron-based pnictide lafeaso1−xhx (0 ≤ x ≤ 0.6) exhibits unique electronic properties in a heavily carrier-doped regime: a superconducting (sc) phase with double-domes structure expands in a wide regime (0.05 < x < 0.49) [1] and an antiferromagnetic (af) phase manifests itself by further h doping (0.49 ≤ x) [2–4]. band calculations show that both fermi surfaces and nesting vectors change by h doping: the two hole pockets present at γ point in the lightly hdoped regime almost disappear in the heavily h∗e-mail: naoki@fujiwara.h.kyoto-u.ac.jp 1 graduate school of human and environmental studies, kyoto university, yoshida-nihonmatsu-cyo, sakyo-ku, kyoto 606-8501, japan. 2 institute for innovative research, tokyo institute of technology, 4259 nagatsuda, midori-ku, yokohama 2268503, japan. 3 materials research center for element strategy, tokyo institute of technology, 4259 nagatsuda, midori-ku, yokohama 226-8503, japan. doped regime [5, 6]. the change in the nesting vectors due to h doping would cause a change in wavevector (q) dependent spin susceptibility χ(q,ω) and would allow for the appearance of two af phases in the lightly and heavily h-doped regimes. the af phase in the heavily h-doped regime is strongly suppressed upon applying pressure [7]. we have performed nuclear magnetic resonance (nmr) measurements on lafeaso0.4h0.6 at 3.7 gpa, and we have found that the spin excitation gap appearing at the af phase vanishes at around 4.1 gpa. we have investigated the magnetic properties in the vicinity of a pressure-induced quantum critical point (qcp)('4.1 gpa). ii. experimental apparatuses and conditions a pressure of 3.7 gpa was applied using a nicralhybrid clamp-type pressure cell as shown in fig. 1 [8]. we have used a mixture of fluorinert fc-70 and fc-77 as the pressure-transmitting medium. a coil wounded around the powder samples and an optical fiber with the ruby powders glued on top 110002-1 papers in physics, vol. 11, art. 110002 (2019) / n. fujiwara et al. figure 1: a nicral-hybrid clamp-type pressure cell [8]. a coil wounded around the powder samples and an optical fiber with the ruby powders were inserted into the sample space. were inserted into the sample space of the pressure cell [8]. the size of the coil was 2.4 mm in diameter and 3.5 mm in length, and the number of windings was 18 turns. the pressure was monitored through ruby fluorescence measurements. the r1 and r2 lines at ambient pressure, 3.0 and 3.7 gpa are shown in fig. 2. the wavelength of the r1 or r2 peak shifts linearly with respect to pressure. the shift of the wavelength ∆λ satisfies the relation p(gpa)=∆λ(nm)/0.365. nmr measurements for the powder samples were acquired using a conventional coherent-pulsed nmr spectrometer. the relaxation rate (1/t1) was measured using a conventional saturation-recovery method for the samples whose feas planes are parallel to the applied field. iii. experimental results i. 
1h-nmr spectra 75as (i = 3/2) nmr spectra broaden due to the nuclear quadrupole interaction, which makes it difficult to investigate the antiferromagnetic (af) state. however, 1h (i = 1/2) is free from the nuclear quadrupole interaction. therefore, the 1h signal is narrow in the paramagnetic state, and the broadening in the af phase directly reflects the magnitude of the spin moments. figure 2: ruby fluorescence spectra at 0.1 mpa, 3.0 gpa and 3.7 gpa. the smaller and larger peaks correspond to the r2 and r1 transitions, respectively. figure 3: 1h-nmr spectra for lafeaso0.4h0.6 measured at 3.7 gpa and 35.1 mhz, between 4.2 k and 60 k. the 9f signal originates from the pressure-transmitting medium, a mixture of fluorinert fc-70 and fc-77. figure 4: the increase in 1h linewidth due to the ordered spin moments, at ambient pressure and at 3.7 gpa. tn represents the antiferromagnetic (af) transition temperature. figure 3 shows 1h-nmr spectra measured at 3.7 gpa and 35.1 mhz. the sharp signal of 9f originates from the pressure-transmitting medium mentioned above. the temperature dependence of the linewidth is shown in fig. 4, together with the data at ambient pressure [2, 4]. the onset of the broadening in fig. 4 corresponds to the af transition temperature (tn). the maximum spin moment is estimated to be 1.80 µb [4]. as seen in fig. 4, tn is about 100 k at ambient pressure and decreases to 30 k at 3.7 gpa. the pressure-induced qcp is therefore expected at a higher pressure. ii. 1/t1t for 75as the relaxation rate divided by temperature, 1/t1t, provides a measure of low-energy spin fluctuations. in general, neglecting the wave-number (q) dependence of the hyperfine coupling constant, 1/t1t is proportional to the imaginary part of the susceptibility: 1/t1t ∝ Σ_q im χ(q,ω)/ω, where ω represents an nmr frequency. 75as is preferred to 1h for t1 measurements, because the feas layers are hardly affected by the random distribution of hydrogen in the lao1−xhx layers. furthermore, owing to the nuclear quadrupole interaction, one can pick up the 75as signals coming from the powders whose feas planes are parallel to the applied field. figure 5 shows 1/t1t for 75as, and the peaks correspond to tn. figure 5: relaxation rate of 75as divided by temperature, 1/t1t, for lafeaso0.4h0.6 at 0.1 mpa, 3.0 gpa and 3.7 gpa. tn represents the af transition temperature. the inset shows the pressure dependence of the spin excitation gap ∆ (see eq. (1)). the values of tn determined from 1/t1t are consistent with those obtained from the linewidth of 1h. at low temperatures just below tn, 1/t1t is expressed as follows: 1/(t1t) ∝ exp(−∆/t), (1) where ∆ represents the spin excitation gap. the pressure dependence of ∆ is shown in the inset to fig. 5. assuming that ∆ depends linearly on pressure, the pressure-induced qcp is estimated to be 4.1 gpa. iv. discussion the activated spin excitation shown in eq. (1) originates from a spin density wave (sdw). however, 1/t1t also shows curie-weiss behavior below tn.
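a minimal sketch of the two numerical steps described above, extracting the gap ∆ from the activated form of eq. (1) and extrapolating ∆(p) linearly to locate the qcp, is given below; the numbers are synthetic and serve only to exercise the functions, they are not the measured data of fig. 5.

```python
import numpy as np

def gap_from_relaxation(T, inv_t1t):
    """extract the spin-excitation gap (in kelvin) from the activated form of
    eq. (1), 1/(t1 t) ~ exp(-gap/t): a straight-line fit of ln(1/t1t) against
    1/t has slope equal to -gap."""
    slope, _ = np.polyfit(1.0 / np.asarray(T), np.log(np.asarray(inv_t1t)), 1)
    return -slope

def qcp_pressure(pressures, gaps):
    """assume gap(p) is linear in pressure and extrapolate to gap = 0."""
    a, b = np.polyfit(pressures, gaps, 1)       # gap ~ a*p + b
    return -b / a

# synthetic numbers, only to exercise the functions (not the data of fig. 5)
T = np.array([10.0, 15.0, 20.0, 25.0])          # k, just below t_n
inv_t1t = 0.5 * np.exp(-120.0 / T)              # synthetic gap of 120 k
print(gap_from_relaxation(T, inv_t1t))          # recovers ~120 k
print(qcp_pressure(np.array([0.0, 3.0, 3.7]), np.array([450.0, 200.0, 60.0])))
```

the curie-weiss component of 1/t1t below tn, mentioned above, is not captured by eq. (1) and deserves separate attention.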
the behavior is not observed at ambient pressure and it is characteristic of the critical behavior near the pressure-induced qcp. the coexistence of the gapped and gapless excitations are specific to this system. in this system, major fermi surfaces are electron pockets with a square-like shape in two dimensional k space. some parts of the electron pockets would contribute to the nesting and the sdw formation. the critical behavior would originate from the other parts of the fermi surfaces. the nesting condition becomes worse and the bandwidth becomes broader with increasing pressure. owing to these effects, the activated behavior shown in eq. (1) would disappear at the pressureinduced qcp. v. conclusions we performed nmr measurements on lafeaso0.4h0.6 at 3.7 gpa to investigate the magnetic properties in the vicinity of the pressureinduced qcp. we have found that the sdw ordered state still remains at 3.7 gpa. the pressure-induced qcp is estimated to be 4.1 gpa from the pressure dependence of the spin excitation gap. the gapless excitation observed as the curie-weiss behavior of 1/t1t coexists with the gapped excitation, implying that each excitation originates from different parts within the fermi surfaces. acknowledgements this work is supported by jsps kakenhi grant number jp18h01181, and a grant from mitsubishi foundation. we thank h. kontani and h. takahashi for discussion. [1] s iimura, s matsuishi, h sato, t hanna, y muraba, s w kim, j e kim, m takata, h hosono, two-dome structure in electron-doped iron arsenide superconductors, nat. commun. 63, 943 (2012). [2] n fujiwara, s tsutsumi, s iimura, s matsuishi, h hosono, y yamakawa, h kontani, detection of antiferromagnetic ordering in heavily doped lafeaso1−xhx pnictide superconductors using nuclear-magneticresonance techniques, phys. rev. lett. 111, 097002 (2013). [3] m hiraishi, s iimura, k m kojima, j yamaura, h hiraka, k ikeda, p miao, y ishikawa, s torii, m miyazaki, i yamauchi, a koda, k ishii, m yoshida, j mizuki, r kadono, r kumai, t kamiyama, t otomo, y murakami, s matsuishi, h hosono, introduction to solid state physics, nat. phys. 10, 300 (2014). [4] r sakurai, n fujiwara, n kawaguchi, y yamakawa, h kontani, s iimura, s matsuishi, h hosono, quantum critical behavior in heavily doped lafeaso1−xhx pnictide superconductors analyzed using nuclear magnetic resonance, phys. rev. b 91, 064509 (2015). [5] y yamakawa, s onari, h kontani, n fujiwara, s iimura, h hosono, phase diagram and superconducting states in lafeaso1−xhx based on the multiorbital extended hubbard model, phys. rev. b 88, 041106(r) (2013). [6] s iimura, s matsuishi, m miyakawa, t taniguchi, k suzuki, h usui, k kuroki, r kajimoto, m nakamura, y inamura, k ikeuchi, s ji, h hosono, switching of intra-orbital spin excitations in electron-doped iron pnictide superconductors, phys. rev. b 88, 060501(r) (2013). [7] n fujiwara, n kawaguchi, s iimura, s matsuishi, h hosono, quantum phase transition under pressure in a heavily hydrogen-doped ironbased superconductor lafeaso, phys. rev. b 96, 140507(r) (2017). [8] n fujiwara, t matsumoto, k k nakazawa, a hisada, y uwatoko, fabrication and efficiency evaluation of a hybrid nicral pressure cell up to 4 gpa, rev. sci. instrum. 78, 073905 (2007). 110002-4 papers in physics, vol. 6, art. 060006 (2014) received: 20 march 2014, accepted: 7 august 2014 edited by: a. 
marti licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060006 www.papersinphysics.org issn 1852-4249 influence of surface tension on two fluids shearing instability rahul banerjee,1∗ s. kanjilal1 using extended layzer’s potential flow model, we investigate the effects of surface tension on the growth of the bubble and spike in combined rayleigh-taylor and kelvin-helmholtz instability. the nonlinear asymptotic solutions are obtained analytically for the velocity and curvature of the bubble and spike tip. we find that the surface tension decreases the velocity but does not affect the curvature, provided surface tension is greater than a critical value. for a certain condition, we observe that surface tension stabilizes the motion. any perturbation, whatever its magnitude, results stable with nonlinear oscillations. the nonlinear oscillations depend on surface tension and relative velocity shear of the two fluids. i. introduction when two different density fluids are divided by an interface, the interface becomes unstable with exponential growth under the action of a constant acceleration acting in the direction perpendicular to the interface from the heavier to lighter fluid or under the action of relative velocity shear of two fluids. these two types of instabilities are known as rayleigh-taylor and kelvin-helmholtz instabilities, respectively. temporal development of the nonlinear structure of the interface consequent to rayleigh-taylor or kelvin-helmholtz instability is currently a topic of interest both from theoretical and experimental points of view. the nonlinear structure is called a bubble if the lighter fluid penetrates across the unperturbed interface into the heavier fluid and it is called a spike if the opposite takes place. the instabilities arise in connection with a wide range of problems ranging from direct or indirect laser driven experiments in the abla∗e-mail:rbanerjee.math@gmail.com 1 st. paul’s cathedral mission college, 33/1, raja rammohan roy, sarani, 700 009 kolkata, india. tion region at compression front during the process of inertial confinement fusion [1, 2] to mixing of plasmas in space plasma systems, such as boundary of planetary magnetosphere, solar wind and cluster of galaxies [3]. in high energy density physics(hedp), formation of supernova remnant or formation of astrophysical jets [4–8] are also seen in these types of instabilities. in high energy density plasma experiments using omega laser [9], kelvin-helmholtz instability growth has recently been observed . there are several methods to describe the nonlinear structure of the interface of two constant density fluids under potential theory and the associated nonlinear dynamics has been studied by many authors [10–13]. layzer [10] described the formation of the structure using an expansion near the tip of the bubble or the spike up to second order in the transverse coordinates in two dimensional motion and this approach was extended in ref. [14] for kelvin-helmholtz instability. it is well known [15] that the surface tension reduces the linear rayleigh-taylor growth rate. the lowering in the growth rate is seen to increase with increase in the wave number k up to a critical wave 060006-1 papers in physics, vol. 6, art. 060006 (2014) / r. banerjee et al. number kc = √ (ρh−ρl)g t , where t denotes surface tension, and ρh and ρl are the densities of the heavier and lighter fluids, respectively. 
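the reduction of the linear growth rate by surface tension and the cut-off at kc quoted above can be illustrated with the standard inviscid dispersion relation γ² = a_t g k (1 − k²/kc²), with a_t = (ρh − ρl)/(ρh + ρl) the atwood number. this is the classical linear result, not the nonlinear layzer-model dynamics developed in this paper, and the numerical values in the sketch below (ρh = 3, ρl = 2, as in fig. 1, and an arbitrary t) are purely illustrative.

```python
import numpy as np

def rt_growth_rate(k, rho_h, rho_l, g, T):
    """classical linear rayleigh-taylor growth rate with surface tension:
    gamma^2 = a_t * g * k * (1 - k^2/kc^2), with kc^2 = (rho_h - rho_l)*g/T."""
    a_t = (rho_h - rho_l) / (rho_h + rho_l)
    kc_sq = (rho_h - rho_l) * g / T
    gamma_sq = a_t * g * k * (1.0 - k**2 / kc_sq)
    return np.sqrt(np.clip(gamma_sq, 0.0, None))   # zero growth for k >= kc

rho_h, rho_l, g, T = 3.0, 2.0, 9.81, 0.05          # illustrative values only
kc = np.sqrt((rho_h - rho_l) * g / T)              # cut-off wavenumber of the text
k = np.linspace(0.05, 1.2, 6) * kc
print(np.round(rt_growth_rate(k, rho_h, rho_l, g, T), 3))
```

the lowering of the growth rate with increasing k, up to its vanishing at k = kc, is the effect referred to in the following paragraph.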
the same effect has been described by mikaelian [16] for rayleigh–taylor instability in fluid layers of finite thickness, and sohn [17] described the effect using the layzer nonlinear potential model. the influence of surface tension on the nonlinear theory was studied in detail by pullin [18] and garnier et al. [19] using numerical methods. the present paper addresses the problem of the time development of the nonlinear interfacial structure caused by combined rayleigh–taylor and kelvin–helmholtz instability in the presence of surface tension. it is shown that the growth rate of the instabilities is affected by the surface tension: the growth rates of the bubble and spike tips are significantly reduced. we observe an oscillatory stabilization of the interface for large surface tension, and this oscillation also depends on the relative velocity shear. section ii deals with the basic hydrodynamical equations together with the geometry involved. here we assume that the fluids are inviscid and that the motion is irrotational. the investigation of the nonlinear aspects of the structure of the two-fluid interface is facilitated by bernoulli's equation together with the pressure balance equation at the interface. the long-time asymptotic behavior of the bubble and spike tips for the combined rayleigh–taylor and kelvin–helmholtz instabilities is derived in subsections iii.i and iii.ii, respectively, where we also discuss the characteristics of the tips obtained analytically and numerically. finally, conclusions are drawn in section iv. ii. basic mathematical model we consider two incompressible fluids separated by an interface located at y = 0 in a two-dimensional x−y plane, the y axis lying normal to the unperturbed fluid interface. the fluid with density ρh is assumed to overlie the fluid with density ρl and gravity is taken along the negative y axis. in the following discussion, we shall denote the properties of the fluid above the interface by the subscript h and below the interface by the subscript l. after perturbation, the nonlinear interface is assumed to take up a parabolic shape, given by

y = η(x,t) = η0(t) + η2(t)(x − η1(t))²   (1)

the perturbed interface forms a bubble if η0(t) > 0, η2(t) < 0 and a spike if η0(t) < 0, η2(t) > 0. the functions η0(t) and η1(t) are related to the position of the tip of the bubble with respect to the unperturbed interface, i.e., at time t the position of the bubble tip is (η1(t), η0(t)), and η2(t) is related to the bubble curvature. in our previous works [14, 20–23], we considered η1(t) = 0 owing to the absence of velocity shear parallel to the unperturbed interface. however, in the presence of streaming motion of the fluids, the tip of the bubble moves parallel to the unperturbed interface with velocity η̇1(t). according to the extended layzer model [10, 11, 14, 20], the velocity potentials describing the motion of the upper (heavier) and lower (lighter) fluids are assumed to be given by

φh(x,y,t) = a1(t) cos(k(x − η1(t))) e^(−k(y − η0(t))) + a2(t) sin(k(x − η1(t))) e^(−k(y − η0(t))) − x uh   (2)

φl(x,y,t) = b0(t) y + b1(t) cos(k(x − η1(t))) e^(k(y − η0(t))) + b2(t) sin(k(x − η1(t))) e^(k(y − η0(t))) − x ul   (3)

where uh and ul are the streaming velocities of the upper and lower fluids, respectively, and k is the perturbation wave number. the evolution of the interface y = η(x,t) is determined by the kinematical and dynamical boundary conditions.
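as a small consistency check (an editorial sketch, not from the paper), the snippet below verifies symbolically that the layzer-type potential of eq. (2) satisfies laplace's equation, as required for incompressible irrotational flow; the time-dependent coefficients are frozen at one instant.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
k = sp.symbols('k', positive=True)
a1, a2, eta0, eta1, eta2, uh = sp.symbols('a1 a2 eta0 eta1 eta2 u_h', real=True)

# upper-fluid potential of eq. (2), coefficients treated as constants at a fixed time
phi_h = (a1*sp.cos(k*(x - eta1))*sp.exp(-k*(y - eta0))
         + a2*sp.sin(k*(x - eta1))*sp.exp(-k*(y - eta0))
         - x*uh)

# incompressibility + irrotationality require Laplace's equation
print(sp.simplify(sp.diff(phi_h, x, 2) + sp.diff(phi_h, y, 2)))   # -> 0

# parabolic interface of eq. (1): half the second derivative at the tip is eta2
eta = eta0 + eta2*(x - eta1)**2
print(sp.diff(eta, x, 2)/2)   # -> eta2
```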
the kinematical boundary conditions are

∂η/∂t − (∂η/∂x)(∂φh/∂x) = −∂φh/∂y   (4)

(∂η/∂x)(∂φh/∂x − ∂φl/∂x) = ∂φh/∂y − ∂φl/∂y   (5)

and the dynamical boundary condition (first integral of the momentum equation) is of the form

−ρh(l) ∂φh(l)/∂t + (1/2)ρh(l)(∇φh(l))² + ρh(l) g y = −ph(l) + fh(l)(t)   (6)

the pressure boundary condition at the two-fluid interface, including surface tension [17, 22], is

ph − pl = t/r   (7)

where t is the surface tension and r is the radius of curvature. plugging condition (7) at the interface y = η(x,t) into eq. (6), we obtain the following equation:

ρh[−∂φh/∂t + (1/2)(∇φh)²] − ρl[−∂φl/∂t + (1/2)(∇φl)²] + g(ρh − ρl)y = −t/r + fh − fl   (8)

we restrict our study to the neighborhood of the peak of the perturbed structure, where |k(x − η1(t))| ≪ 1. thus, we can neglect terms of o(|x − η1|^i) (i ≥ 3) [14]. with this point of view, we have

1/r = 2η2 (1 + 4η2²(x − η1)²)^(−3/2) ≈ 2η2 (1 − 6η2²(x − η1)²)   (9)

we substitute the parameters η, φh and φl into the kinematic and dynamic boundary conditions represented by eqs. (4), (5), (8) and (9), equate the coefficients of (x − η1)^i (i = 0, 1, 2) and neglect terms of o(|x − η1|^i) (i ≥ 3). this yields the following equations:

dξ1/dτ = ξ4   (10)

dξ2/dτ = vh − ξ5(2ξ3 + 1)/(2ξ3)   (11)

dξ3/dτ = −(1/2)(6ξ3 + 1)ξ4   (12)

kb0/√(kg) = −12ξ3ξ4/(6ξ3 − 1)   (13)

k²b1/√(kg) = ξ4(6ξ3 + 1)/(6ξ3 − 1)   (14)

k²b2/√(kg) = [(2ξ3 + 1)ξ5 − 2ξ3(vh − vl)]/(2ξ3 − 1)   (15)

dξ4/dτ = [n1(ξ3,r)/d1(ξ3,r)] ξ4²/(6ξ3 − 1) + [2(1 − r)ξ3(6ξ3 − 1)/d1(ξ3,r)] (1 − 12ξ3² k²/kc²) + [n2(ξ3,r)/d1(ξ3,r)] (6ξ3 − 1)ξ5²/(2ξ3(2ξ3 − 1)²) + [2(4ξ3 − 1)(6ξ3 − 1)/(d1(ξ3,r)(2ξ3 − 1)²)] [(vh − vl)²ξ3 − (vh − vl)(2ξ3 + 1)ξ5]   (16)

dξ5/dτ = −(2ξ3 − 1)rξ4ξ5/(2ξ3 d2(ξ3,r)) + [ξ4(6ξ3 + 1)/(2d2(ξ3,r)(6ξ3 − 1)(2ξ3 − 1))] [4(vh − vl)(4ξ3 − 1) − (ξ5/ξ3)(28ξ3² − 4ξ3 − 1)]   (17)

where r = ρh/ρl; ξ1 = kη0; ξ2 = kη1; ξ3 = η2/k; ξ4 = k²a1/√(kg); ξ5 = k²a2/√(kg); τ = t√(kg); kc² = (ρh − ρl)g/t and vh(l) = kuh(l)/√(kg) are the corresponding dimensionless quantities. the functions n1,2(ξ3,r) and d1,2(ξ3,r) are given by

n1(ξ3,r) = 36(1 − r)ξ3² + 12(4 + r)ξ3 + (7 − r);  d1(ξ3,r) = 12(r − 1)ξ3² + 4(r − 1)ξ3 − (r + 1)   (18)

and

n2(ξ3,r) = 16(1 − r)ξ3³ + 12(1 + r)ξ3² − (1 + r);  d2(ξ3,r) = 2(1 − r)ξ3 + (r + 1)   (19)

the temporal development of the combined rayleigh–taylor and kelvin–helmholtz instability is given by eqs. (10)–(12), (16) and (17).

figure 1: bubble: variation of ξ1, ξ2, ξ3, ξ4 and ξ5 with τ. initial values ξ1 = 0.1, ξ2 = 0, ξ3 = −0.05, ξ4 = 0 and ξ5 = 0, with ρh = 3, ρl = 2, vh = 0.5, vl = 0.1 and k²/kc² = 0 (line), 0.5 (dot), 1 (dash), 3.9 (dash-dot).

iii. numerical results and discussions i. effect of surface tension on bubble growth in this section, we present the effect of surface tension on the nonlinear growth rate of the bubble tip for the combined rayleigh–taylor and kelvin–helmholtz instability. to describe the dynamics of the bubble tip, it is essential to integrate eqs. (10)–(12), (16) and (17) numerically. to obtain the initial conditions of the numerical integration, we assume that the initial interface is given by y = η0(t = 0) cos(kx). the expansion of the cosine function gives (ξ2)initial = 0 and (ξ3)initial = −(1/2)(ξ1)initial, where (ξ1)initial is the arbitrary initial amplitude.
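the following sketch (not part of the original paper) integrates eqs. (10)–(12), (16) and (17), in the form reconstructed above, with scipy, using the parameter values and initial conditions quoted in the caption of fig. 1; it is only meant to illustrate how the asymptotic bubble behavior of the next section can be approached numerically.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(tau, xi, r, vh, vl, k2_over_kc2):
    """right-hand side of eqs. (10)-(12), (16) and (17) as written above."""
    x1, x2, x3, x4, x5 = xi
    dv = vh - vl
    N1 = 36*(1 - r)*x3**2 + 12*(4 + r)*x3 + (7 - r)
    D1 = 12*(r - 1)*x3**2 + 4*(r - 1)*x3 - (r + 1)
    N2 = 16*(1 - r)*x3**3 + 12*(1 + r)*x3**2 - (1 + r)
    D2 = 2*(1 - r)*x3 + (r + 1)
    dx1 = x4                                              # eq. (10)
    dx2 = vh - x5*(2*x3 + 1)/(2*x3)                       # eq. (11)
    dx3 = -0.5*(6*x3 + 1)*x4                              # eq. (12)
    dx4 = (N1/D1*x4**2/(6*x3 - 1)                         # eq. (16)
           + 2*(1 - r)*x3*(6*x3 - 1)/D1*(1 - 12*x3**2*k2_over_kc2)
           + N2/D1*(6*x3 - 1)*x5**2/(2*x3*(2*x3 - 1)**2)
           + 2*(4*x3 - 1)*(6*x3 - 1)/(D1*(2*x3 - 1)**2)
             *(dv**2*x3 - dv*(2*x3 + 1)*x5))
    dx5 = (-(2*x3 - 1)*r*x4*x5/(2*x3*D2)                  # eq. (17)
           + x4*(6*x3 + 1)/(2*D2*(6*x3 - 1)*(2*x3 - 1))
             *(4*dv*(4*x3 - 1) - x5/x3*(28*x3**2 - 4*x3 - 1)))
    return [dx1, dx2, dx3, dx4, dx5]

# fig. 1 parameters: rho_h = 3, rho_l = 2, v_h = 0.5, v_l = 0.1
r, vh, vl = 3.0/2.0, 0.5, 0.1
xi0 = [0.1, 0.0, -0.05, 0.0, 0.0]       # bubble initial values of fig. 1
for ratio in (0.0, 0.5, 1.0):           # k^2/k_c^2
    sol = solve_ivp(rhs, (0.0, 40.0), xi0, args=(r, vh, vl, ratio),
                    rtol=1e-8, atol=1e-10)
    print(f"k2/kc2 = {ratio}: xi3(40) ~ {sol.y[2, -1]:.4f}, xi4(40) ~ {sol.y[3, -1]:.4f}")
```

for k²/kc² = 0 the tip velocity ξ4 settles close to 0.38 in this sketch, consistent with the plateau of fig. 1 and with eq. (21) of the next section.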
since the perturbation starts from rest, we choose (ξ4)initial = (ξ5)initial = 0. the non-dimensionalized time development plots of ξ1, ξ2, ξ3, ξ4 and ξ5 are shown in figs. 1, 2 and 3.

figure 2: bubble: variation of ξ1, ξ2, ξ3, ξ4 and ξ5 with τ. initial values ξ1 = 0.1, ξ2 = 0, ξ3 = −0.05, ξ4 = 0 and ξ5 = 0, with ρh = 3, ρl = 2, vh = 0.5, vl = 0.1 and k²/kc² = 10 (line), 15 (dot), 20 (dash).

before we describe the nature of the bubble tip, consider its asymptotic behavior. as τ → ∞, the asymptotic values of ξ3, ξ4 and ξ5 for the bubble are obtained by setting dξ3/dτ = 0, dξ4/dτ = 0 and dξ5/dτ = 0. note that, if k² < 3(1 + (15/16)(ρl/(ρh − ρl))(∆v)²) kc², where ∆v = vh − vl, the asymptotic values are

[(ξ3)asymp]bubble = −1/6   (20)

[(ξ4)asymp]bubble = √[ (2a/(3(1 + a)))(1 − k²/(3kc²)) + (5/16)((1 − a)/(1 + a))(∆v)² ]   (21)

and

[(ξ5)asymp]bubble = 0   (22)

where a = (ρh − ρl)/(ρh + ρl) is the atwood number.

figure 3: bubble: variation of ξ1, ξ2, ξ3, ξ4 and ξ5 with τ. initial values ξ1 = 0.1, ξ2 = 0, ξ3 = −0.05, ξ4 = 0 and ξ5 = 0, with ρh = 3, ρl = 2, k²/kc² = 20 and vh = 0, vl = 0 (line); vh = 0.5, vl = 0.1 (dot); vh = 1, vl = 0.1 (dash); vh = 1.5, vl = 0.1 (dash-dot).

it is clear from fig. 1 that surface tension significantly suppresses the velocity and growth of the bubble tip, provided the surface tension is smaller than a critical value, t < tbubblec, where

tbubblec = 3((ρh − ρl) + (15/16)ρl(∆v)²) g/k²   (23)

the critical value depends on the magnitude of the relative velocity shear of the two fluids, and the growth and velocity of the tip are reduced whenever t < tbubblec. when there is initially no tangential velocity difference between the two fluids (i.e., vh = vl), the fluids are subject to the rayleigh–taylor instability alone and the critical value becomes 3(ρh − ρl)g/k². these results agree with the argument in ref. [17]. in the absence of surface tension, the asymptotic values coincide with the results obtained in our previous work [14]. further, if t > tbubblec, an oscillatory state emerges even for r > 1. figures 2 and 3 describe this oscillatory state of the motion. the amplitude and the period of oscillation decrease monotonically for large surface tension (fig. 2), while the amplitude of oscillation increases for large relative velocity shear (fig. 3). in this respect, figs. 2 and 3 show that there always exists a self-generated oscillatory transverse velocity component (−ξ5) due to the perturbation, which depends upon the surface tension as well as on the relative velocity shear ∆v at the two-fluid interface. for negative velocity shear (i.e., ∆v < 0), the self-generated oscillatory transverse velocity of the bubble peak acts opposite to the direction of vh and the amplitude of oscillation increases for large surface tension. if t = tbubblec, equilibrium is attained, i.e.,

ξ̇3 = ξ̇4 = ξ̇5 = 0 when ξ3 = −1/6 and ξ4 = ξ5 = 0   (24)

and this equilibrium is approached asymptotically, as shown by the dash-dot line in fig. 1.
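as a worked check of these expressions (an editorial sketch, not from the paper), the snippet below evaluates the threshold of the inequality preceding eq. (20), the asymptotic velocity of eq. (21), and the analogous spike threshold used in the next subsection, for the parameters of the figures.

```python
import numpy as np

def bubble_threshold(rho_h, rho_l, dv):
    """(k/k_c)^2 above which bubble growth is stabilized (condition before eq. (20))."""
    return 3.0*(1.0 + (15.0/16.0)*rho_l/(rho_h - rho_l)*dv**2)

def spike_threshold(rho_h, rho_l, dv):
    """analogous threshold for the spike (used in the next subsection)."""
    return 3.0*(1.0 + (15.0/16.0)*rho_h/(rho_h - rho_l)*dv**2)

def bubble_velocity(rho_h, rho_l, dv, k2_over_kc2):
    """eq. (21): asymptotic dimensionless bubble-tip velocity; zero above threshold."""
    A = (rho_h - rho_l)/(rho_h + rho_l)      # atwood number
    val = (2*A/(3*(1 + A)))*(1 - k2_over_kc2/3) + (5.0/16.0)*(1 - A)/(1 + A)*dv**2
    return np.sqrt(max(val, 0.0))

rho_h, rho_l, dv = 3.0, 2.0, 0.5 - 0.1
print(bubble_threshold(rho_h, rho_l, dv))      # 3.9, the value quoted for figs. 1 and 2
print(spike_threshold(rho_h, rho_l, dv))       # 4.35, the value quoted for fig. 4
print(bubble_velocity(rho_h, rho_l, dv, 0.0))  # ~0.38, plateau of fig. 1
print(bubble_velocity(rho_h, rho_l, dv, 3.9))  # 0: asymptotic growth vanishes at threshold
```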
thus, the combined rayleigh–taylor and kelvin–helmholtz instability is stabilized when

k² > 3(1 + (15/16)(ρl/(ρh − ρl))(∆v)²) kc², i.e., t > tbubblec   (25)

while the instability persists, although with a reduced growth rate, for

k² ≤ 3(1 + (15/16)(ρl/(ρh − ρl))(∆v)²) kc², i.e., t ≤ tbubblec   (26)

according to condition (25), for ρh = 3, ρl = 2, vh = 0.5 and vl = 0.1, the motion is stabilized when k²/kc² > 3.9. these results are exhibited in fig. 2, where k²/kc² > 3.9. for k²/kc² = 3.9, the growth rate of the instability is asymptotically diminished and tends to zero (dash-dot line of fig. 1), while fig. 1 shows the suppression of the growth rate of the instability due to surface tension when k²/kc² < 3.9.

figure 4: spike: variation of ξ1, ξ2, ξ3, ξ4 and ξ5 with τ. initial values ξ1 = −0.1, ξ2 = 0, ξ3 = 0.05, ξ4 = 0 and ξ5 = 0, with ρh = 3, ρl = 2, vh = 0.5, vl = 0.1 and k²/kc² = 0 (line), 0.5 (dot), 1 (dash), 4.35 (dash-dot).

ii. effect of surface tension on spike growth the temporal evolution of the spike is exhibited in figs. 4, 5 and 6; the results follow from the numerical integration of eqs. (10)–(12), (16) and (17) using the transformation ξ1 → −ξ1, ξ3 → −ξ3, g → −g, r → 1/r and vh ⇌ vl. the saturation curvature and velocity of the spike tip are given by

[(ξ3)asymp]spike = 1/6   (27)

[(ξ4)asymp]spike = √[ (2a/(3(1 − a)))(1 − k²/(3kc²)) + (5/16)((1 + a)/(1 − a))(∆v)² ]   (28)

and

[(ξ5)asymp]spike = 0   (29)

provided k² < 3(1 + (15/16)(ρh/(ρh − ρl))(∆v)²) kc².

figure 5: spike: variation of ξ1, ξ2, ξ3, ξ4 and ξ5 with τ. initial values ξ1 = −0.1, ξ2 = 0, ξ3 = 0.05, ξ4 = 0 and ξ5 = 0, with ρh = 3, ρl = 2, vh = 0.5, vl = 0.1 and k²/kc² = 10 (line), 15 (dot), 20 (dash).

figure 4 shows that large surface tension suppresses the growth rate of the spike tip, just as for the bubble. nonlinear oscillation of the spike tip is observed for k² > 3(1 + (15/16)(ρh/(ρh − ρl))(∆v)²) kc², and the equilibrium state arises when equality holds. the pattern of the amplitude and period of oscillation is identical to that of the bubble (figs. 5 and 6). figure 5 shows the oscillatory behavior of the spike structure for different values of the surface tension, while the dependence on the relative velocity shear is demonstrated in fig. 6. iv. conclusion in this paper, we have studied a potential flow model to describe the nature of the nonlinear structure of a two-fluid interface under the combined action of rayleigh–taylor and kelvin–helmholtz instabilities in the presence of surface tension. the analytic expressions for the bubble and spike growth rates at

figure 6: spike: variation of ξ1, ξ2, ξ3, ξ4 and ξ5 with τ.
initial value ξ1 =-0.1, ξ2 =0, ξ3 =0.05, ξ4 = 0, and ξ5 = 0 with ρh = 3, ρl = 2, k2 k2c =20, vh=0, vl = 0 (line), vh=0.5, vl = 0.1 (dot), vh=1, vl = 0.1 (dash), vh=1.5, vl = 0.1 (dash-dot). asymptotic stage are obtained for arbitrary atwood number and velocity shear. surface tension becomes a stabilizing factor of the instability, provided it is larger than a critical value. in this case, oscillatory behavior of motion described by numerical integration of governing equations. the nature of oscillations depends on both surface tension and relative velocity shear of two fluids. on the other hand, below the critical value, surface tension dominates the growth and growth rate of the instability. this result is expected to improve the understanding of the stabilization factor for the astrophysical instability. acknowledgements this work was supported by the university grant commission, government of india under ref. no. psw-43/12-13 (ero). [1] d k bradley, d g barun, s g glendinning, m j edwards, j l milovich, c m sorce, g w collins, s w hann, r h page, r j wallace, very-high-growth-factor planar ablative rayleigh–taylor experiments, phys. plasmas 14, 056313 (2007). [2] k s budil, b a remington, t a peyser, k o mikaelian, p l miller, n c woolsey, w m woodvasey, a m rubenchik, experimental comparison of classical versus ablative rayleigh–taylor instability, phys. rev.lett. 76, 4536 (1996). [3] h hasegawa, m fujimoto, t-d phan, h reme, a balogh, m w dunlop, c hashimoto, r tandokoro, transport of solar wind into earth’s magnetosphere through rolled-up kelvin– helmholtz vortices, nature 430, 755 (2004). [4] r p drake, hydrodynamic instabilities in astrophysics and in laboratory high-energydensity systems, plasma phys. control. fusion 47, b419 (2005). [5] j o kane, h f robey, b a remington, r p drake, j knauer, d d ryutov, h louis, r teyssier, o hurricane, d arnett, r rosner, a calder, interface imprinting by a rippled shock using an intense laser, phys. rev. e 63, 055401r (2001). [6] d d ryutov, b a remington, scaling astrophysical phenomena to high-energy-density laboratory experiments, plasma phys. control. fusion 44, b407 (2002). [7] l spitzer, behavior of matter in space, astrophys. j. 120, 1 (1954). [8] b a remington, r p drake, h takabe, d arnett, a review of astrophysics experiments on intense lasers, phys. plasma 7, 1641 (2000). [9] e c harding, j f hansen, o a hurricane, r p drake, h f robey, c c kuranz, b a remington, m j bono, m j grosskopf, r s gillespie, observation of a kelvin–helmholtz instability in a high-energy-density plasma on the omega laser, phys. rev.lett. 103, 045005 (2009). [10] d layzer, on the instability of superposed fluids in a gravitational field, astrophys. j. 122, 1 (1955). 060006-7 papers in physics, vol. 6, art. 060006 (2014) / r. banerjee et al. [11] v n goncharov, analytical model of nonlinear, single-mode, classical rayleigh–taylor instability at arbitrary atwood numbers, phys. rev. lett. 88, 134502 (2002). [12] sung-ik sohn, simple potential-flow model of rayleigh–taylor and richtmyer–meshkov instabilities for all density ratios, phys. rev. e 67, 026301 (2003). [13] q zhang, analytical solutions of layzer-type approach to unstable interfacial fluid mixing, phys. rev. lett. 81, 3391 (1998). [14] r banerjee, l mandal, m khan, m r gupta, effect of viscosity and shear flow on the nonlinear two fluid interfacial structures, phys. plasmas 19, 122105 (2012). [15] s chandrasekhar, hydrodynamic and hydromagnetic stability, dover, new york (1961). 
[16] k o mikaelian, rayleigh–taylor instability in finite-thickness fluids with viscosity and surface tension, phys. rev. e 54, 3676 (1996). [17] sung-ik sohn, effects of surface tension and viscosity on the growth rates of rayleigh–taylor and richtmyer–meshkov instabilities, phys. rev. e 80, 055302(r) (2009). [18] d i pullin, numerical studies of surface tension effects in nonlinear kelvinhelmholtz and rayleightaylor instability, j. fluid mech. 119, 507 (1982). [19] j garnier, c cherfils-clerouin, a p holstein, statistical analysis of multimode weakly nonlinear rayleigh–taylor instability in the presence of surface tension, phys. rev. e 68, 036401 (2003). [20] r banerjee, l mandal, s roy, m khan, m r gupta, combined effect of viscosity and vorticity on single mode rayleightaylor instability bubble growth, phys. plasmas 18, 022109 (2011). [21] m r gupta, r banerjee, l k mandal, r bhar, h c pant, m khan, m k srivastava, effect of viscosity and surface tension on the growth of rayleightaylor instability and richtmyermeshkov instability induced two fluid interfacial nonlinear structure, indian j. phys. 86, 471 (2012). [22] r banerjee, l mandal, m khan, m r gupta, bubble and spike growth rate of rayleigh taylor and richtmeyer meshkov instability in finite layers, indian j. phys. 87, 929 (2013). [23] r banerjee, l mandal, m khan, m r gupta, spiky development at the interface in rayleigh– taylor instability: layzer approximation with second harmonic, j. morden phys. 4, 174 (2013). 060006-8 papers in physics, vol. 2, art. 020009 (2010) received: 25 november 2010, accepted: 6 january 2011 edited by: a. vindigni licence: creative commons attribution 3.0 doi: 10.4279/pip.020009 www.papersinphysics.org issn 1852-4249 commentary on “anisotropic finite-size scaling of an elastic string at the depinning threshold in a random-periodic medium” andrei a. fedorenko1∗ in their paper [1] s. bustinagorry and a.b. kolton study the finite-size scaling properties of an elastic string driven in a disordered medium at the depinning transition. the zero-temperature dynamics of the model is given by the equation of motion for the displacement field u(z, t): γ∂tu(z, t) = c∂ 2 z u(z, t) + fp(u, z) + f, (1) where γ is the friction, c the elasticity, and f the driving force. fp is the gaussian random force due to disorder with zero mean and variance fp(u, z)fp(u′, z′) = ∆(u − u′)δ(z − z′). (2) the model of an elastic object in a disordered medium can be used to describe diverse physical systems. they can be cast into two classes: (i) periodic systems, such as charge density waves (cdw) in solids and vortex lattice in disordered superconductors; (ii) interfaces such as domain walls in magnets. ∆(u) is a periodic function for the random periodic systems and a short range function for the interfaces. due to the competition between elasticity and disorder, these systems exhibit a rich glassy behavior and a non-trivial response to external perturbations. in particular, a driving force f exceeding a certain threshold value fc is required ∗e-mail: andrey.fedorenko@ens-lyon.fr 1 cnrs-laboratoire de physique de l’ecole normale supérieure de lyon, 46, allée d’italie, 69007 lyon, france. to set the elastic object into steady motion. to some extent, the depinning transition shares many features with critical phenomena where the velocity v plays the role of the order parameter. 
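as a rough illustration of this class of models (not the authors' algorithm), the sketch below integrates a discretized version of eq. (1) with a simple explicit euler step; the random-periodic pinning force, the parameter values and the system sizes are all assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# explicit-euler discretization of gamma du/dt = c d2u/dz2 + Fp(u, z) + f
L, M = 256, 256               # string length and transverse period (assumed)
gamma, c, f = 1.0, 1.0, 0.6   # friction, elasticity, driving force (assumed)
dt, n_steps, n_harm = 0.05, 20000, 3

# quenched disorder: at each site z, a force periodic in u with period M
phases = rng.uniform(0.0, 2.0*np.pi, size=(n_harm, L))
amps = rng.normal(0.0, 1.0, size=(n_harm, L))
q = 2.0*np.pi*np.arange(1, n_harm + 1)[:, None]/M

def pinning_force(u):
    return np.sum(amps*np.sin(q*u[None, :] + phases), axis=0)

u = np.zeros(L)
for _ in range(n_steps):
    lap = np.roll(u, 1) + np.roll(u, -1) - 2.0*u      # periodic boundary in z
    u += dt/gamma*(c*lap + pinning_force(u) + f)

lap = np.roll(u, 1) + np.roll(u, -1) - 2.0*u
v = np.mean(c*lap + pinning_force(u) + f)/gamma       # instantaneous mean velocity
w2 = np.mean((u - u.mean())**2)                       # squared width of the string
print(f"mean velocity ~ {v:.3f}, width^2 ~ {w2:.2f}")
```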
for instance, the correlation length defined through the velocity-velocity correlation function diverges when approaching the transition from above as ξ ∼ (f − fc)−ν . however, in comparison with the ordinary critical phenomena, the problem of disordered elastic systems is notably difficult due to the so-called dimensional reduction. the latter takes place also in random field spin systems and states that a d-dimensional disordered system at zero temperature is equivalent to all orders in perturbation theory to a pure system in d−2 dimensions at finite temperature. it turns out that the metastability renders the zero-temperature perturbation theory useless: it breaks down on scales larger than the socalled larkin length. the peculiarity of the problem is that for d < 4 there is an infinite set of relevant operators, which can be parameterized by the function ∆(u) = −r′′(u). the corresponding functional renormalization group (frg) equation for the renormalized disorder correlator has two fixed points: the random field fixed point (rffp) which describes the depinning transition for the interfaces and the random periodic fixed point (rpfp) which describes the random periodic systems. the running (renormalized) disorder correlator becomes a non-analytic function beyond the larkin scale. the appearance of a non-analyticity in the form of a cusp at the origin is related to metastability, and 020009-1 papers in physics, vol. 2, art. 020009 (2010) / a. a. fedorenko nicely accounts for the generation of a threshold force at the depinning transition: fc ∼ −∆∗′(0+). note, that the latter is zero in the bare (analytic) theory. thus, frg provides a way to analytically compute the threshold force which turns out to be nonuniversal (similar to the critical temperature in critical phenomena) and the critical exponents which depend only on the universality class. in particular, one can compute the roughness exponent which gives the scaling of the average width of the interface with its size: w2 ∼ l2ζ . the observables computed using frg in the thermodynamic limit are already averaged over different disorder realizations. the picture for finite systems is more involved. in numerical simulations, one usually considers a finite system of linear size l moving in a box of size m with periodic boundary conditions. then, depending on how one performs the finite-size scaling analysis, one can approach different scaling regimes. in the limit of l → ∞ at fixed m one approaches the scaling behavior of a random periodic system. in the opposite limit of m → ∞ at fixed l, the system falls in one of the universality classes of a particle moving in a random landscape described by some particle fixed point (pfp) [2]. the random field universality class describing the scaling properties of a pinned interface corresponds to the limit of both l, m → ∞. however, one has to specify the precise way of taking this limit in order to separate the basin of attraction of the rffp from those of the rpfp and the pfp. studying the crossover between these three universality classes is the main aim of the paper [1]. the key parameter is k = m/lζdep where ζdep is the roughness exponent of the interface (a string) at the depinning transition. it has been found in [1] that w2l−2ζdep is a universal function of k, g(k), which has a minimum at k∗ < 1. for k � k∗, one reveals that the periodicity is important and the system is described by the rpfp. 
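the scaling variables involved here can be made concrete with a few helper functions (an editorial sketch; the value ζdep ≈ 1.25 is a commonly quoted numerical estimate of the one-dimensional depinning roughness exponent and is used purely as an assumption):

```python
import numpy as np

ZETA_DEP = 1.25   # assumed 1d depinning roughness exponent

def width_squared(u):
    """mean squared width w^2 of a string configuration u(z)."""
    u = np.asarray(u, dtype=float)
    return np.mean((u - u.mean())**2)

def aspect_parameter(L, M, zeta=ZETA_DEP):
    """k = M / L^zeta, the parameter controlling which fixed point is approached."""
    return M/L**zeta

def rescaled_width(u, L, zeta=ZETA_DEP):
    """w^2 L^(-2 zeta), whose dependence on k defines the universal function g(k)."""
    return width_squared(u)/L**(2*zeta)

# keeping k fixed while L grows, as required to probe a single universality class
for L in (128, 256, 512, 1024):
    M = 0.5*L**ZETA_DEP
    print(L, round(aspect_parameter(L, M), 3))
```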
in this case, there are two relevant contributions to the displacement correlation function: the universal logarithmic growth with zero roughness exponent and the larkin type growth due to instability of rpfp which gives the roughness exponent ζl = (4−d)/2. for k ≈ k∗, one expects that the system approaches the rffp universality class in the limit l → ∞. the function g(k) for 0 < k < k∗ describes the crossover between the rprf and the rffp. the main result of the paper is that in order to verify the predictions of frg and numerically study the properties of the rffp universality class, one has to keep k = k∗. in the case of deviations from k = k∗, the results are modified by the crossover either to the rpfp or to the pfp. it has been found that g(k) increases slower than any power-law for k > k∗. in this regime, the system is expected to show crossover to the pfp [2]. the properties of the critical configuration are determined by the extreme value statistics. for very large k, one can split the system in k almost uncorrelated regions with their own subcritical configurations. the true critical configuration corresponds to the subcritical configuration with the maximal critical force. as a result, in the limit k → ∞, the distribution of the threshold force exhibits crossover from the gaussian distribution to the gumbel distribution. the average critical force increases as fc ∼ log k since the critical force can be regarded as the maximum among k different subcritical forces. the authors argue that the increase in the critical force might be correlated with the slow increase of roughness. they suggest that the larger critical force can be achieved either by accidental extended defects or by rare uncorrelated strong pinning forces. they argue that the second scenario is more plausible since the presence of accidental correlation should decrease roughness. however, as it was shown in [3], the presence of correlated disorder and extended defects also enhances roughness. therefore, one can expect the simultaneous enhancement of the roughness and the critical force. the similar crossover effects have been observed by the authors in the behavior of the structure factor and the interface width distribution. summarizing, the authors numerically observe that the average width and its probability distribution depend, in a universal way, on the parameter k that is encoded in the universal function g(k). the parameter k is decisive regarding the universality class which will be approached in the thermodynamic limit or probed by the finite size scaling analysis. knowing the function g(k), and especially the position of its minima k∗, is required for the correct analysis of the numerical simulations. analytical calculation of g(k) using, for example, frg remains, however, an open challenging problem. 020009-2 papers in physics, vol. 2, art. 020009 (2010) / a. a. fedorenko [1] s bustingorry, a b kolton, anisotropic finitesize scaling of an elastic string at the depinning threshold in a random-periodic medium, pap. phys. 2, 020008 (2010). [2] p le doussal, k j wiese, driven particle in a random landscape: disorder correlator, avalanche distribution and extreme value statistics of records, phys. rev. e 79, 051105 (2009). [3] a a fedorenko, p le doussal, k j wiese, statics and dynamics of elastic manifolds in media with long-range correlated disorder, phys. rev. e 74, 061109 (2006). 020009-3 papers in physics, vol. 2, art. 020006 (2010) received: 10 august 2010, accepted: 28 october 2010 edited by: i. 
ippolito reviewed by: a. coniglio, universitá di napoli “federico ii”, napoli, italy. licence: creative commons attribution 3.0 doi: 10.4279/pip.020006 www.papersinphysics.org issn 1852-4249 invited review: effect of temperature on a granular pile thibaut divoux,1∗ as a fragile construction, a granular pile is very sensitive to minute external perturbations. in particular, it is now well established that a granular assembly is sensitive to variations of temperature. such variations can produce localized rearrangements as well as global static avalanches inside a pile. in this review, we sum up the various observations that have been made concerning the effect of temperature on a granular assembly. in particular, we dwell on the way controlled variations of temperature have been employed to generate the compaction of a granular pile. after laying emphasis on the key features of this compaction process, we compare it to the classic vibration-induced compaction. finally, we also review other granular systems in a large sense, from microscopic (jammed multilamellar vesicles) to macroscopic scales (stone heave phenomenon linked to freezing and thawing of soils) for which periodic variations of temperature could play a key role in the dynamics at stake. i. introduction: a granular pile as a fragile construction a granular pile can be described as a bunch of hard and frictional grains for which the thermal ambient agitation is negligible [1]. indeed, the potential energy of a grain of density ρ assessed over a displacement equivalent to its diameter d satisfies ρgd4/kbt ' 1011 � 1. thus a granular pile is an athermal system, and one needs to inject energy inside the pile to trigger any reorganization of the packing. the accessible packings can then be seen as jammed states, i.e. minima of the potential energy, in some energy landscape (fig. 1) and the energy one injects makes it possible to overcome the energy barrier that separates two configurations from one another. various methods have been used, over the passed 20 years, to provide energy ∗e-mail: thibaut.divoux@ens-lyon.fr 1 université de lyon, laboratoire de physique, école normale supérieure de lyon, cnrs umr 5672, 46 allée d’italie, 69364 lyon cedex 07, france. to a granular pile, among which mechanical vibrations [2–4] and shear [5,6] are the most widespread. nevertheless, this description of a granular pile in terms of an athermal system hinders one of its major characteristics: namely its fragility [1, 7]. the sensitivity of a granular pile to minute external perturbations has first been pointed out in the case of frictionless hard spheres [8–11]. a heap of such spheres presenting a slight polydispersity has been shown to be isostatic and thus very sensitive to external perturbations. in the case of frictional spheres, jamming and isostaticity no longer go hand in hand and the way the packing has been build plays a crucial role on its stability [12]. nevertheless, the pile pictured as a contact network, can be decomposed in two subnetworks: a network gathering strong contacts (weak contacts resp.) involving grains which carry a force larger (lower resp.) than the average force in the packing [13]. the key contribution of both this “strong contact” network and the surface roughness of the grains to the fragility of the pile is very well illustrated by the scalar arching model (sam) [14]. this model, inspired from the q-model developed by liu, coppersmith 020006-1 papers in physics, vol. 2, art. 020006 (2010) / t. 
divoux figure 1: sketch of the energy landscape associated to the dynamics of a glass, below the glass transition. in the case of a granular pile, the thermal agitation is negligible and one has to inject energy to overcome the energy barrier separating the jammed states. reproduced with permission from [85]. copyright wiley-vch verlag gmbh & co. kgaa. et al. [15, 16] only takes into account the weight of the grains and the solid friction between the grains following coulomb’s law. the control parameter of the simulation is the friction coefficient between two grains, denoted rc. in this model, looking at a static pile, claudin and bouchaud demonstrated that a relative variation of rc as small as 10−7 triggers large scale reorganizations inside the pile, named static avalanches (fig. 2), emphasizing the fragility of a pile to minute perturbations, despite its athermal nature [14]. of course, in a laboratory experiment, controlling and varying the friction coefficient between the grains during an experiment is impossible. nonetheless, as suggested in [14], variations of temperature might perturb the pile at the scale of the surface roughness of the grains, having an effect equivalent to a change of the friction coefficient. indeed, one can assess that a granular heap of size l (typically a few centimeters) submitted to variations of temperature of amplitude ∆t experiences a dilation δl = κgl∆t , where κg stands for the thermal expansion coefficient of the grains. dilations corresponding to the surface roughness scale lead to ∆t ' 0.1◦c for standard glass beads (κg = 10−6 k−1). such an amplitude is easily accessible and for instance daily variations of temperature in the lab are already of roughly a few degrees. this is the topic of this brief review, where we focus on the effect of temperature on a granular assembly. the content of this review goes as follows: in section ii we sum up the first experimental observations of uncontrolled temperature variations that have pushed for further experiments under controlled variations (section iii). in section iv, we dwell on the use of cycles of temperature to induce the compaction of a granular pile in a delicate way, in particular we discuss the role of both the amplitude and the frequency of the imposed cycles. section v deals with enlarged “granular” systems and extend the scope of the results presented in previous parts. finally, section vi proposes some outlooks to this method of thermal cycling as a general method to probe granular systems. figure 2: force chains in a two-dimensional granular pile (200×200) following the scalar arching model. the white dots label the grains involved in a force chain in the initial force network, whereas the black dots label the grains involved in the force chains network after a relative change of 10−7 of the friction coefficient rc. one observes that the reorganization takes place in the whole pile, despite the perturbation taking place at the scale of the grain surface roughness. reprinted figure with permission from [14]. copyright (1997) by the american physical society. 020006-2 papers in physics, vol. 2, art. 020006 (2010) / t. divoux ii. from undesired variations of temperature ... the effect of temperature variations over a granular pile has first been reported as a hindrance to perform reproducible measurements. those experiments were not dedicated to probe the consequences of temperature variations on a granular assembly, and the role of temperature is simply assessed. 
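before turning to those early observations, the two order-of-magnitude estimates given in the introduction are easy to reproduce; the sketch below does so with assumed grain and heap parameters (the grain diameter, heap size and roughness scale are illustrative choices, not values taken from the review).

```python
import numpy as np

kB, T_amb, g = 1.38e-23, 300.0, 9.81     # J/K, K, m/s^2
rho, d = 2.5e3, 0.5e-3                   # kg/m^3 and m: glass grain, assumed size

# potential energy over one grain diameter vs thermal energy: the pile is athermal
print(f"rho g d^4 / (kB T) ~ {rho*g*d**4/(kB*T_amb):.1e}")    # ~1e11

# dilation of a ~10 cm heap (assumed size) for the Delta T ~ 0.1 K quoted in the text
kappa_g, L_heap, dT = 1e-6, 0.10, 0.1    # 1/K, m, K
print(f"delta L ~ {kappa_g*L_heap*dT*1e9:.0f} nm")            # ~10 nm, roughness scale
```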
nonetheless, several issues were raised in this seminal work, and are highlighted here. i. sound in sand the influence of temperature is first mentioned in the early 90’s in a work dealing with sound propagation in sand [17–19]. an acoustic wave is generated inside a granular pile and recorded a few centimeters away. c. liu and s. nagel observed that “a temperature change of only 0.04 k inside the pile, produced by the change of the ambient temperature, or by a local heater, could cause a factor of 3 reversible change in the measured vibration transmission” [17]. one can indeed assess the grain dilation δd associated to such variations of temperature ∆t to be δd = κg∆t d ' 2 nm. in agreements with the reversibility of the effect, such an amplitude is on the one hand negligible compared to the typical surface roughness of the beads used in their experiments (probably 100 nm for glass beads [32]) and, on the other hand less than the typical deformation of the beads inside the heap (roughly 10 nm assuming a hertz-like contact). however, at the time of these experiments, it was still unclear whether the temperature or the gradient of temperature were leading to such observations. experiments performed by clément and co-workers later confirmed the key role of the gradient, and brought to the fore the role of the container dilation on the reorganization process [20]. ii. apparent mass fluctuations in 1997, e. clément and co-workers reproduced jansen experiments [20, 21], which consists in measuring the apparent mass of a granular pile confined in a tube. a piston at the tube bottom, implemented on an electronic scale, makes it possible to measure the apparent mass of the pile. although the experiment is in good agreement with jansen’s figure 3: apparent mass and temperature variations. the system (see text) consists in a vertical cylinder (radius 2.0 cm) filled with glass beads (typical diameter 3 mm). a drift in the temperature of 0.4◦c leads to several reorganizations inside the pile and thus to several variations of the apparent mass of the system. reprinted from [20]. prediction [22], they noticed that their data presented fluctuations as high as 20 % in the saturated limit, where the measured mass becomes independent of the amount of beads poured inside the tube. part of this fluctuations were attributed to temperature fluctuations and the authors emphasized that “the origin is mixed since it can be due to the dilation of the boundaries and the resulting action on the piston or it can be due to the dilation of the grains, themselves inducing spontaneous rearrangements of the force network ” [20]. in particular, the contribution of the boundary dilation will be addressed in section iv. figure 3 illustrates that even a drift in the temperature of 0.4◦c is sufficient to trigger rearrangements which lead to measurable mass fluctuations. however, no systematic measurements were performed on this jansen configuration and it would still be of great interest to assess the effect of temperature variations on apparent mass of a pile [23]. iii. ... to controlled variations of temperature since then, temperature variations have been used to generate minute perturbations inside a granular medium. local heaters placed at different positions 020006-3 papers in physics, vol. 2, art. 020006 (2010) / t. divoux figure 4: sketch of the experimental setup used to measure the effective thermal conductivity λg. the granular material consists in glass beads (typical diameter 300 µm). 
the copper tube dimensions are the followings: inner diameter 1 cm; outer diameter 5 cm; length 15 cm. reprinted from [26]. inside a static pile have been shown to produce very different disturbances on sound propagation underlining the key role of force chains in the propagation process [18]. moreover, small thermal perturbations have been used to generate large nongaussian conductance fluctuations in a 2d packing of metallic beads. perturbations are induced by a 75 w light bulb standing a few centimeters away from the packing and experiments were performed with light turned on for packing prepared with light off, and vice versa. gathered in bursts, such conductance fluctuations were interpreted as the signature of individual bead creep rather than collective vault reorganizations [24, 25]. in particular, the probability density function p (∆t) where ∆t denotes the waiting time between two successive bursts of conductivity, follows a power law: p (∆t) ∼ ∆t−(1+αt) the exponent for stainless steal beads is found to be αt ' 0.6 independent of both the strength of the perturbation and of the external stress. the value of αt is only governed by the surface roughness of the beads which has been confirmed by afm measurements of the hurst exponent of the bead surface [24]. in agreement with the sam [14] discussed in section i, the origin of the fluctuations of conductivity originate in local micro-contact rearrangements. more recently [26], temperature variations have been applied by means of a nickel wire (diameter rw =100 µm) connected to a power supply, and which crosses a granular medium partially filling a copper tube (fig. 4). the tube is maintained at a constant temperature te (precision 0.1◦c). the imposed current i and the resulting voltage u are simultaneously measured in order to estimate the heating power p = u i and the wire temperature tw which is deduced from the resistance rw = u/i (fig. 4). this setup makes it possible to measure the effective thermal conductivity λg of the granular material as tw − te ∝ p/λg. one obtains λg = 0.162 w/m/k which is compatible with the value expected for a pile of glass beads (λglass = 1.4 w/m/k) surrounded by air (λair = 0.025 w/m/k) [27]. more interestingly, the authors also observe that for the same material λg depends on the preparation : one measures λg ' 0.162 w/m/k if the system is tapped prior to the measurement and λg ' 0.156 w/m/k if not. it is then particularly interesting to consider the figure 5: thermal conductivity λg vs. number of cycles n. the upper curve corresponds to a pile which has been gently tapped prior to the experiment. the lower curve corresponds to a loose pile. note that the conductivity is larger for a larger density. moreover, the conductivity of the loose material increases significantly when one imposes the thermal cycles (typical grain diameter 300 µm, ∆t = 40◦c). reprinted from [26]. . 020006-4 papers in physics, vol. 2, art. 020006 (2010) / t. divoux figure 6: change in packing fraction after multiple thermal cycles (cycle temperatures: red, ∆t = 107 ± 2◦c; blue, ∆t = 41 ± 1◦c) for glass spheres in a plastic cylinder. lines represent fits of the data by the sum of two exponentials introducing a short (resp. long) timescale associated to individual (resp. collective) rearrangements of the grains [34,35]. reprinted with permission from [28]. copyright (2006) by the nature publishing group. behavior of the sample when subjected to several temperature cycles. 
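the reported power-law statistics of the waiting times between conductance bursts are easy to emulate numerically; the sketch below (not the authors' analysis) draws pareto-distributed waiting times with the quoted exponent αt ≈ 0.6 and recovers it with a maximum-likelihood (hill) estimate; the lower cutoff and sample size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

alpha, dt_min, n = 0.6, 1.0, 100_000
# inverse-CDF sampling of P(dt) ~ dt^-(1+alpha) for dt > dt_min
dts = dt_min*(1.0 - rng.random(n))**(-1.0/alpha)

# maximum-likelihood (hill) estimate of the exponent
alpha_hat = n/np.sum(np.log(dts/dt_min))
print(f"input alpha = {alpha}, recovered alpha ~ {alpha_hat:.3f}")
```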
when the measurements are repeated several times, one observes that the thermal conductivity of the loose sample along with its packing fraction significantly increase with the number n of imposed cycles. by contrast, the conductivity of the tapped sample only slightly fluctuates around a constant value (fig. 5). such results clearly emphasize that one can induce compaction by thermal cycling. this issue is addressed in the following section. iv. compaction induced by thermal cycling the compaction of a granular pile has been studied in two different configurations: (i) both the grains and the container are submitted to cycles of temperature [28–30], and (ii) the grains alone experience the cycles [31, 32]. in both cases, the packing fraction increases under thermal cycling. we first review the role of the amplitude and frequency of the cycles for both cases. second, we gather some insights on the dynamics at the grain scale. i. the role of the cycling amplitude and frequency chen and co-workers [28, 30] examined the change in packing fraction for glass, polystyrene or high density polyethylene spheres (typical diameter 1 mm) contained in polymethylpentene plastic or borosilicate glass cylinders (diameter ranging from 14 to 102 mm) in response to thermal cycling. one thermal cycle is conducted by placing both the grains and the container into an oven until the thermal equilibrium is reached, before letting them relax at ambient temperature. the typical cycling period is about 10 hrs and has not been varied. they observed that the packing fraction increases, even after one cycle [28]. this effect is independent of the sample depth and width (14-102 mm), more efficient for higher cycling amplitudes (fig. 6) and, contrary to their first guess, also independent of the relative coefficients of thermal expansion of the grains and the container [30], as confirmed by numerical simulations [33]. this indicates that the compaction mechanism is local as suggested by the observations discussed in section iii and thus linked to the solid friction between the beads, and between the beads and the container. two other setups have been used to impose cycles of temperature: the first one consists of a vertical glass tube filled with spherical glass beads [26, 29]. the temperature cycles are imposed by means of a heating cable (40 w/m) directly taped on the outer surface of the tube wall. here again both the container and the grains are thus submitted to the cycles. the free surface of the material is imaged from the side with a video camera which makes it possible to measure accurately the column height h from the images and to perform a time resolved study of its dynamic. in such a setup one can control both the amplitude ∆t of the cycles, and the penetration length lp ≡ √ 2λ/(ρ cω) through the frequency ω/2π of the cycles (λ and c respectively denote the thermal conductivity and heat capacity of a typical glass-grains pile). prior to each experiment, the granular column is prepared in a low-density state thanks to a dry-nitrogen upward flow. the top of the column is then higher than the field imaged by the camera and one sets the amplitude of the cycles, ∆t , to the largest accessible value, ∆t = 27.1◦ c. the column flows under thermal cycling, and the preparation of the sample 020006-5 papers in physics, vol. 2, art. 020006 (2010) / t. divoux figure 7: height variation hn vs. number of cycles n. one observes first an exponential behavior at short time followed by a logarithmic creep at long time. 
the black curve corresponds to the test function htn ≡ h0 + he exp (−n/nc) + hl ln(n). inset oscillations of the column height associated with the temperature cycles : an and δn are respectively defined to be the amplitude of the increase and the drift of hn at the cycle n (h = 140 cm, 2π/ω = 600 s and ∆t = 10.8◦c.). reprinted figure with permission from [29]. copyright (2008) by the american physical society. ends when the top of the column enters the observation field. at this point, the granular column is “quenched”: the amplitude of the cycles is set to the chosen value ∆t lying between 0 and 27.1◦c, which defines the origin of time t = 0. the granular column is subsequently submitted to at least 1000 cycles. role of cycling amplitudehere, the cycling period (10 minutes) is chosen to ensure that the associated thermal penetration length lp ' 6 mm is about the tube radius so that all the grains experience the dilation process. figure 7 reports the variation of the column height defined as hn ≡ h(2πn/ω) − h(0), where n denotes the number of imposed cycles. for a large ∆t (typically more than 3◦c), the column systematically compacts (fig. 7) at each cycle [fig. 7, (inset)]. following barker and mehta [34, 35] like chen and coworkers [28, 30], one can try to fit the height decay by the sum of two exponentials: the result is not satisfying and the compaction dynamics is better accounted for by the following test function: figure 8: amplitude an (defined in fig. 7) versus the number of cycles n. the data are successfully accounted for by an = ∆t [a0 + b0 ln(n)] with a0 = 1.0 µm·k−1 and a0 = 0.13 µm·k−1 (initial column height: 140 cm, cycling period: 2π/ω = 600 s and, from top to bottom, ∆t =27.1, 16.2 and 10.8◦c). such an increase in the dilation amplitude with the cycles of temperature is a signature of the aging the granular pile experiences under thermal cycling [26]. reprinted figure with permission from [29]. copyright (2008) by the american physical society. htn ≡ h0 + he exp (−n/nc) + hl ln(n) (fig. 7) [32]. first, this response to the thermal quenching is strikingly similar to the one the system exhibits to vanishing step strain perturbations [36]. second, the long time logarithmic behavior of the column height leads to an inverse logarithmic evolution for the packing fraction φn which is strongly reminiscent of the way the packing fraction of the granular pile submitted to tapping evolves in the absence of any convection [2]. another remarkable feature of this time resolved study is that one can also observe that an, defined as the amplitude of the increase of hn at the n-th cycle (fig. 7), is proportional to the cycling amplitude ∆t and increases logarithmically with n (fig. 8). the more the column compacts, the higher is the dilation amplitude of the total height of the column at each cycle. this is a signature of the aging of the pile under thermal cycling, like the increase of the thermal conductivity (section iii). in the limit of small cycling amplitude ∆t (here below ∆tc = 3◦c), one observes that the column is not flowing regularly anymore, but evolves by successive collapses (typical amplitude of a tenth 020006-6 papers in physics, vol. 2, art. 020006 (2010) / t. divoux of grain diameter) separated by rest periods (randomly distributed) [fig. 10 (top)]. such a different compaction process from the one observed for high cycling amplitude has been linked to the surface roughness of the glass beads. 
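fitting the test function h_n = h0 + he exp(−n/nc) + hl ln(n) to a measured height series is straightforward with scipy; the sketch below uses synthetic data with invented parameters, since the measured series itself is not reproduced here.

```python
import numpy as np
from scipy.optimize import curve_fit

def h_test(n, h0, he, nc, hl):
    """test function h0 + he*exp(-n/nc) + hl*ln(n) for the column height."""
    return h0 + he*np.exp(-n/nc) + hl*np.log(n)

rng = np.random.default_rng(2)
n = np.arange(1, 1001, dtype=float)
# synthetic stand-in for a measured h_n series (parameters invented)
h_obs = h_test(n, 0.0, -0.8, 60.0, -0.15) + rng.normal(0.0, 0.02, n.size)

popt, _ = curve_fit(h_test, n, h_obs, p0=(0.0, -1.0, 50.0, -0.1))
print("fitted (h0, he, nc, hl):", np.round(popt, 3))
```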
indeed, for ∆t < ∆tc the dilation of a grain is smaller than the surface roughness, and thus, the beads behave as smooth particles whose dilation induces only localized rearrangements. thus one has to wait several cycles of temperature before observing a large scale collapse of the column level, as a cumulative effect of several local reorganizations. on the contrary, for ∆t > ∆tc the dilation of a grain is larger than the surface roughness, and thus the beads behave as rough particles. one cycle of temperature is more likely to generate a large scale reorganization [29]. role of cycling frequencythe influence of the frequency is discussed in [32], but still remains to be fully explored. on the one hand, for frequencies shorter than the one discussed above, the penetration length is larger than the tube radius. the compaction process is more efficient, and the dilation amplitude an is constant, independent of the age of the system. on the other hand, for larger frequencies, the penetration length is shorter than the tube radius. the column is observed to flow continuously while small amplitude settlings may occur (fig. 9). the interpretation proposed in [32] is the following: the penetration length is shorter than the tube radius, the grains in the center of the column are not submitted to the cycles of temperature and thus behave as a solid body that experiences some stick-slip motion due to the periodic dilations of the grains close from the walls of the container. indeed, the column dynamics over a few cycles [fig. 9, (inset)] displays a very similar behavior to a tilted monolayer of grains on an inclined plane (see fig. 3 in [37]). in this case, the flow results from a competition between a gravitational flow and the formation of arches [37]. however, a full study on the role of the frequency, and in particular on the influence of harmonics in the shape of the cycles (squares, tooth-like signals, etc.) on the compaction dynamics still remains to be done. the second setup that has been used to impose temperature cycles is very similar to the one previously discussed in figure 4. a thin wire crosses a granular column in its center, while this time the container is a thin glass tube [31]. for large enough figure 9: evolution of the column height hn versus the number n of cycles of temperature. inset: zoom on the evolution of hn over 24 cycles of temperature: the column settles by jumps separated by periods of continuous flowing (h = 140 cm, t = 2π/ω = 150 s et ∆t = 9.5◦c). reprinted from [32]. frequencies, the penetration length is shorter than the tube radius which makes it possible to cycle the grains close to the wire and let the container at rest. it is here relevant to compare the way the dilation of the container modifies the compaction process in the limit of low amplitude cycles. in the case where both the grains and the container are submitted to variations of temperature, the compaction process is linear in the number n of applied cycles [fig. 10 (top)], whereas, in the case where the grains alone are submitted to cycles of temperature and the container is fixed, the compaction process goes logarithmically with n [fig. 10 (bottom)]. both experiments were conducted under the same cycling frequency, and for comparable cycling amplitudes (resp. ∆t = 2.8 and 1◦c). those results clearly indicate that the dilation of the container plays a key role in the global compaction dynamics of the granular column. 
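the thermal penetration length separating these regimes can be estimated directly from lp = √(2λ/(ρcω)); in the sketch below, λ is taken as the effective conductivity measured in section iii, while the effective density and heat capacity of the bead pile are assumed values.

```python
import numpy as np

def penetration_length(lam, rho, c, period):
    """l_p = sqrt(2*lambda/(rho*c*omega)) with omega = 2*pi/period."""
    omega = 2.0*np.pi/period
    return np.sqrt(2.0*lam/(rho*c*omega))

lam = 0.16              # W/m/K, effective conductivity of the bead pile (section iii)
rho, c = 1.5e3, 8.0e2   # kg/m^3, J/kg/K: assumed effective values for the pile
for period in (150.0, 600.0):   # s, the two cycling periods quoted in the text
    lp = penetration_length(lam, rho, c, period)
    print(f"period {period:.0f} s -> l_p ~ {lp*1e3:.1f} mm")
```

with these numbers the 600 s cycles give lp of a few millimetres, comparable to the tube radius, while the shorter cycles give a markedly smaller lp, in line with the two regimes described above.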
however, up to now, the full contribution of the container dilation has not been characterized. the answer should be easily obtained with the set-up presented in figure 4. 020006-7 papers in physics, vol. 2, art. 020006 (2010) / t. divoux figure 10: height variation hn of the column versus the number of cycles of temperature. (top) curve obtained in the case where temperature cycles were applied to both the container and the grains. on average, hn decreases linearly with the number of cycles and the column settles by jumps separated by rest periods (h = 140 cm, 2π/ω = 600 s, and ∆t =2.8◦c). reprinted figure with permission from [29]. copyright (2008) by the american physical society. (bottom) curve obtained in the case where cycles of temperature are applied to the grains alone by means of a hot wire crossing the pile (see fig. 4). the column also settles by jumps, but the height variations goes as the logarithm of the number of cycles. reprinted from [31]. ii. dynamics at the grain scale under thermal cycling one of the key features of the motion induced by thermal cycling in a granular pile is that a grain stays in contact with its neighbors and that the global motion is really slow (hours and days typically). it makes it possible to use experimental techniques developed to study creep motion, like dynamic light scattering (dls) [38], successive snapshots [29] or 3d scanning [39] to follow the individual grain trajectories. recently, l. djaoui and j. crassous used a dls setup to follow the evolution of a granular pile under thermal cycling [40]. this method was successful at extracting both linear and non-affine displacement of the grains [41], as recently done under mechanical shear [42], thus confirming that controlled temperature variations are a delicate way to generate small displacements. slotterback and co-workers have been following the dynamics induced by thermal cycling at the grain scale [39]. in their experiment, the column of grains is immersed in an index-matching oil containing a laser dye which makes it possible to produce 3d images of the system at the end of each cycle by means of a laser sheet scanning method. particle displacements in this jammed fluid correlate strongly with rearrangements of the voronoi cells defining the local environment about the particles. this might be a proof for stringlike cooperative motion as already pointed out in quasi-2d air driven granular flows [43], and more generally in supercooled liquids [44, 45]. however, it is rather difficult to compare these results to those obtain for dry grains [28, 29]. first, because in the case of the immersed experiment, a weight is placed on top of the granular column to apply a controlled vertical force, whereas in the dry case the top column is free of any weight. second, the presence of the interstitial fluid lubricates the contacts between the grains, and certainly leads to a more homogeneous propagation of temperature front through the pile [46]. indeed, the thermal conductivity of oil is roughly 10 times higher than the thermal conductivity of air which reduces the role played by force chains in the heat conduction. those three effects speed up the compaction process compared to the dry case: the steady state packing fraction is obtained after only 10 cycles in the immersed system (fig. 11). iii. a comparison with the tapping experiments let us here recall that tapping experiments consist in imposing vertical vibrations to a container full of grains. 
in practical terms, a low frequency signal (usually 10 < f = ω/2π < 100 hz) [47] is used to generate a “tap” of amplitude a, two successive taps being separated by a duration ∆t long enough to be considered as independent (roughly ∆t ' 1 s). 020006-8 papers in physics, vol. 2, art. 020006 (2010) / t. divoux figure 11: packing fraction vs the number of cycles of temperature for various temperature differentials. the system consisting of glass beads immersed in an index matching oil inside a cylinder, is subjected to thermal cycling via a water bath. a weight is placed on top of the bead packing to apply a controlled vertical force. both this weight and the interstitial fluid might explain the rapid compaction compared to what is observed in fig. 7. reprinted figure with permission from [39]. copyright (2008) by the american physical society. the key control parameter is the reduced acceleration γ defined as γ ≡ aω2/g [48] and one can distinguish between three different regimes. for low values of the reduced acceleration (0 ≤ γ ≤ γ∗ ' 1.2), where γ∗ denotes the critical acceleration at which the grains lose contact with the bottom of the container, the compaction process is extremely slow [49]. this regime is very similar to that observed for thermal cycling: the geometry of the pile is essentially frozen and the force network can still evolve by slowly depleting the most fragile contacts [50, 51]. besides, under such small vibrations, the packing is aging: the behavior of the pile is first dominated by grain motion and then by the contact force variations at larger timescales [51]. however, in this range of acceleration, no phenomenological law was proposed to describe the evolution of the packing fraction and/or the column height that could be compared to the results obtained under thermal cycling. for intermediate values of the acceleration (γ∗ ' 1.2 ≤ γ ≤ γc ' 2), the compaction process is more efficient and takes place over timescales accessible in the lab [2,49]. at each tap, the packing loses contact with the bottom of the container, and crushes back which generates a shockwave responsible for the compaction of the pile. a convection roll might take place inside the container roughly, if the ratio of the container size to the typical bead diameter is large enough, a signature of which can be found on the free surface of the pile that presents a slope [52]. such convection rolls are not observed in a pile submitted to cycles of temperature as confirmed by the results observed under tapping for higher range of acceleration. indeed, for larger values of the acceleration (γ > γc ' 2), the compaction takes place homogeneously in the whole shaken sample [2, 53], but the dynamics strongly depends on the presence or the absence of convection rolls inside the pile [4,53,54]. without any convection roll, the packing fraction φn evolves in a similar way to what was found for thermal cycling in [29]: φ∞ − φn ∝ 1 ln(n) (1) where φ∞ is the steady state packing fraction obtained at long time. this phenomenological expression, first proposed in [2], has been justified theoretically using various approaches: analogy to a parking process [55,56], “tetris-like” models [57,58], excluded volume approach [59] and void diffusion models [60, 61]. a common feature of these models is the geometrical frustration: the denser the packing, the harder it becomes to insert grains from the top. 
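both quantities used in this comparison are simple to evaluate; the sketch below computes the reduced acceleration γ = aω²/g for an illustrative tap and the inverse-logarithmic law of eq. (1) with invented constants (a, f, φ∞ and the prefactor are assumptions, not fitted values).

```python
import numpy as np

# reduced acceleration of a tap: Gamma = a * omega^2 / g (illustrative a and f)
a, f, g = 1.0e-4, 50.0, 9.81           # m, Hz, m/s^2
Gamma = a*(2.0*np.pi*f)**2/g
print(f"Gamma ~ {Gamma:.2f}")          # ~1, i.e. in the low-acceleration regime

def phi_inverse_log(n, phi_inf, c):
    """eq. (1)-type compaction: phi_inf - phi_n proportional to 1/ln(n)."""
    return phi_inf - c/np.log(n)

n = np.array([10, 100, 1000, 10000], dtype=float)
print(np.round(phi_inverse_log(n, 0.64, 0.05), 4))   # slow approach to phi_inf
```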
such a frustration process seems to be also at stake in the compaction induced by thermal cycling, although it still remains to be properly put into equations. still, under tapping, if convection rolls develop inside the pile, the compaction dynamics follows a kohlrausch-williams-watts (kww) law:

$\phi_\infty - \phi_n \propto \exp\left[-(n/n_f)^{\beta}\right]$ (2)

where $n_f$ and β are two parameters which depend on the acceleration γ [49, 53, 62]. such a law pops up naturally if one considers the global dynamics at stake as a superposition of several processes, each with its own well-defined characteristic time. indeed, one can see the compaction process here as the sum of the reorganization of several groups of beads, each with different sizes and different relaxation times. one can also emphasize that the global dynamics is not controlled by the biggest (and thus slowest) groups of beads, but by the fastest individual beads: these are the grains that can quickly relax individually and jump over large distances (roughly their radius) during a single tap, and they control the compaction dynamics [63]. this result indicates once again that there are certainly no convection rolls inside a granular pile submitted to cycles of temperature, as neither the kww law nor rare jumps of individual beads are observed experimentally in this case [64].

figure 12: (top) scheme of the capillary containing the onion gel, the location of the field of view and the orientation of the parallel and perpendicular axes. (bottom) portion of a typical image of the sample taken by light microscopy between crossed polarizers; the arrow points to a large onion. reprinted figure with permission from [66]. copyright (2009) by the american physical society.

v. enlarged “granular” systems

in this section, we tackle two other systems, different from a simple dry granular assembly, for which temperature cycles might play a crucial role in the observed dynamics.

i. thermally driven aging in a polydisperse vesicle assembly

the closest case to what we have been presenting in this article consists in a dense packing of polydisperse multilamellar vesicles, or “onions”, submitted to (unavoidable) temperature fluctuations [65, 66]. such a system, a water-based mixture of surfactants and block-copolymer, is fluid below 8 °c but forms a phase of jammed vesicles at ambient temperature (here 23.4 °c) which is known to experience aging [67, 68]. the system is loaded in a glass capillary which is flame-sealed and placed under a microscope. images are taken every 15 s for 24 h (fig. 12). the temperature is controlled to within 0.09 °c. mazoyer and co-workers observed that the unavoidable temperature fluctuations induce local mechanical shears in the whole sample, due to local thermal expansion and contraction, which is a scenario very similar to the one proposed in [29] for dry grains. here, each image [fig. 12 (bottom)] is divided into small regions of interest (roi) and the authors look at the translational motion $\Delta r(t_w, \tau)$ of each roi for pairs of images taken at two different times ($t_w$ and $t_w + \tau$). both the global parallel displacement $\langle \Delta r_{//} \rangle(t_w)$ [fig. 13 (b)] and the relative parallel displacement $[\langle \Delta r_{//}^2 \rangle]^{1/2}(t_w)$ [fig. 13 (a)] present strong correlations with the temperature fluctuations ∆t [fig. 13 (c)] [65]. furthermore, looking at individual trajectories of rois, the authors identify two kinds of dynamical events: reversible and irreversible rearrangements.
the first class is due to the contraction-elongation of the sample because of temperature fluctuations, and correspond to a shear deformation along the long axis of the capillary. the second class of events that are irreversible occurs as the result of repeated shear cycles. the motion resulting from these ultraslow rearrangements is ballistic (the initial growth of ∆r// is proportional to τ ), with a velocity that decreases exponentially as the sample ages. such results could be easily checked in dry and immersed granular systems [29, 39] to see how far the analogy between a jammed vesicle assembly and a granular pile goes. also, it raises the generality of such dynamics governed by temperature fluctuations. in the case of dilute colloidal systems [69], this scenario does not hold, as no correlation could be observed between temperature fluctuations and rearrangements. indeed, one expects temperature fluctuations to play a key role in divided and athermal systems being rather concentrated or solid-like. it might therefore be relevant to find out other systems presenting temperature-induced strain fluctuations, in order to find how general the properties of these ultra020006-10 papers in physics, vol. 2, art. 020006 (2010) / t. divoux slow rearrangements are. figure 13: age dependence of (a) the square root relative parallel displacement, (b) the global parallel displacement and (c) the temperature variation ∆t over a lag time τ . reprinted figure with permission from [65]. copyright (2006) by the american physical society. ii. stone heave phenomena cyclic freezing and thawing of soils can cause stones and particles embedded in those soils to move and relocate mainly depending on the initial void ratio (defined as the fraction of sample filled by empty space). the stones move vertically upward (fig. 14) in dense soils (low void ratio) and vertically downward in loose soils (high void ratio). a full interpretation of this phenomenon is still lacking, but in the case of stone heave, the mechanism invoked lays on the propagation of the frost front in the soil: the freezing from the top leads to the growth of the ice beneath the stone, due to its higher thermal conductivity than that of the surrounding soil and in turn, causes the stone heave. a review of this topic can be found in [71]. a remarkable effect is that the void ratio also evolves toward a critical value (in the range of 0.29-0.34) under the successive freezing and thawing cycles [71]. thus, a dense (loose resp.) soil will become looser (denser resp.) under cycles of freezing and thawing. this is exactly what experiences a granular assembly submitted to a mechanical shear: its packing fraction will tend to a critical value [72]. here again, as for onions and dry granular assemblies, the temperature-induced shear controls the dynamics. such an analogy suggests that one could shed some new lights on the stone heave phenomenon by looking at the way a large intruder (simulating the stone) in a granular pile (simulating the soil) behaves under thermal cycling (which produces a shear [29]). such a work has been tackled by chen and co-workers in [30]. they observed that, under thermal cycles, the intruder (aluminum or brass) either does not move, or sinks inside the pile (polystyrene beads in a borosilicate glass container) if the intruder density overcomes a certain threshold. 
the pressure exerted by the intruder on the pile seems to be the relevant parameter, despite the ratio of the container size to the grains diameter which might play a crucial role on the force network inside the pile, has not been varied. also, no experiments have been performed varying the initial packing fraction or the initial position of the intruder in the granular packing to mimic the observations performed on stone heave [71, 73]. still, such a simple setup is certainly relevant to extract the key ingredients of the stone heave phenomena, and deserves further use. a comparison of the results with those obtained for vibrated grains (brazil nut effect [3, 74]) would also be interesting. figure 14: raised stone in a paved parking lot in lule̊a, sweden. reprinted from [70], copyright (2000), with permission from elsevier. vi. summary, open questions and outlooks i. summary temperature variations, even of low amplitude, induce the reorganization of a granular assembly and 020006-11 papers in physics, vol. 2, art. 020006 (2010) / t. divoux its slow compaction, and lead to the aging of the system, a signature of which can be found in the slowing down of the dynamics and in the evolution of the key properties of the system (increase of the effective thermal expansion and thermal conduction coefficients, etc). in this sense, temperature variations induce aging in a pile the same way moisture [75,76], constant applied stress [77] and chemical reaction [78] do. however, the mechanism at stake lays on a pinning-depinning transition at the grain scale generated by the shear-induced successive contractions and dilations cycles. temperature variations are thus a local and very delicate way to perturb and induce displacement in a granular assembly. ii. open questions and outlooks first, the role of several parameters still remains to be assessed: the mean temperature t around which the variations of amplitude ∆t are imposed might play a role on the reorganization process inside the pile. because of the presence of the container and the fact that the granular pile is a deformable medium, the mean temperature impacts on the initial stress distribution and on the space available for the grains. its role fully remains to be investigated. also, the parameters relevant in the case of vibrated grains are to be studied here. namely, the shape of the grains [79–82], their surface roughness [83, 84], their polydispersity [3], etc. but also, the size ratio of the grains and the container [53]. such a study might help to compare more accurately tapping and thermal cycling experiments. it could then be relevant to mix the two types of solicitation on a granular assembly to test the statistical framework proposed by edwards [4, 85]. indeed, recent experimental results lead to think that the steady state packing fraction reached by a shaken granular column (and function of γ) is a genuine thermodynamic state, within this theoretical framework [86, 87]. it might thus be relevant to study the fluctuations of packing fraction [88] around its equilibrium value, induced by cycles of temperature. also, moisture is another relevant parameter that would deserve an exhaustive experimental study. indeed, the existence of a liquid bridge between two adjacent grains is expected to increase the effective thermal conductivity [46] and thus to reinforce the role of the force chains inside the pile. second, the aging phenomenon observed under thermal cycling needs to be better characterized. 
in particular, how can we compare the aging effects observed in vibrated granular piles and described in [89] to the one described here ? another way to put it would be to ask if we can loosen a sample by means of cycles of temperature. third, more insights on the individual grain dynamics is needed in the dry case. do the results of slotterback and co-workers discussed in section iv for immersed grains still hold in the dry case ? the techniques already used for vibrated dry granular assembly like x-rays tomography [90] or γ-rays absorption [49] would certainly provide some answers to these questions. also, numerical simulations either simply based on successive dilation/contraction of the grains to mimic the effect of temperature, or taking into account the heat conduction through the granular media [33, 91, 94] would spare time and unravel the key parameters in the dynamics at the grain scale. acknowledgements i am indebted to my phd advisor, j.-c. géminard, for introducing me to this flourishing topic and for countless enlightening discussions. i also thank s. manneville and m.-a. fardin for carefully reading this manuscript as well as e. bertin and s. ciliberto for fruitful discussions. [1] e weeks, soft jammed materials, in: statistical physics of complex fluids, eds s maruyama, m tokuyama, pag. 2.-1, 87, tohoku university press, sendai (2007). [2] j b knight, c g fandrich, c n lau, h jaeger, s r nagel, density relaxation in a vibrated granular material, phys. rev. e 51, 3957 (1995). [3] a kudrolli, size separation in vibrated granular media, rep. prog. phys. 67, 209 (2004). [4] p richard, m nicodemi, r delannay, p ribière, d bideau, slow relaxation and compaction of granular systems, nature materials, 4, 121 (2005). 020006-12 papers in physics, vol. 2, art. 020006 (2010) / t. divoux [5] d howell, r behringer, c veje, fluctuations in granular media, chaos 9, 559 (1999). [6] m toiya, j stambaugh, w losert, transient and oscillatory granular shear flow, phys. rev. lett. 93, 088001 (2004). [7] m e cates, j p wittmer, j-p bouchaud, p claudin, jamming, force chains, and fragile matter, phys. rev. lett. 81, 1841 (1998). [8] s ouagenouni, j-n roux, compaction of well-coordinated lubricated granular pillings, europhys. lett. 32, 449 (1995). [9] s ouagenouni, j-n roux, force distribution in frictionless granular packings at rigidity threshold, europhys. lett. 39, 117 (1997). [10] c f moukarzel, isostatic phase transition and instability in stiff granular materials, phys. rev. lett. 81, 1634 (1998). [11] c f moukarzel, granular matter instability: a structural rigidity point of view, proceedings of rigidity theory and applications (1998). [12] m van hecke, jamming of soft particles: geometry, mechanics, scaling and isostaticity, j. phys.: condens. matter 22, 033101 (2010). [13] f radjai, de wolf, m jean, j j moreau, bimodal character of stress transmission in granular packings, phys. rev. lett. 80, 61 (1998). [14] p claudin, j-p bouchaud, static avalanches and giant stress fluctuations in silos, phys. rev. lett. 78, 231 (1997). [15] c liu, s nagel, d schecter, s coppersmith, s majumdar, o narayan, t witten, force fluctuations in bead packs, science 269, 513 (1995). [16] s coppersmith, c liu, s majumdar, o narayan, t witten, model for force fluctuations in bead packs, phys. rev. e 53, 4673 (1996). [17] c liu, s nagel, sound in sand, phys. rev. lett. 68, 2301 (1992). [18] c liu, spatial patterns of sound propagation in sand, phys. rev. b 50, 782 (1994). 
[19] c h liu, s r nagel, sound and vibration in granular materials, j. phys.: condens. matter 6, 433 (1994). [20] e clément, y séréro, j rajchenbach, j duran, fluctuating aspects of the pressure in a granular column, in: proceedings of the iiird intern. conf. on powders & grains, eds. r p behringer, j t jenkins, balkema, rotterdam (1997). [21] l vanel, e clément, pressure screening and flutuations at the bottom of a granular column, eur. phys. j. b 11, 525 (1999). [22] h a janssen, versuche über getreidedruck in silozellen, z. ver. dtsch. ing. 39, 1045 (1895). [23] p-g de gennes, thermal expansion effects in a silo, c. r. acad. sci. paris 327, 267 (1999). [24] d bonamy, l laurent, p claudin, j-p bouchaud, f daviaud, electrical conductance of a 2d packing of mettallic beads under thermal perturbation, europhys. lett. 51, 614 (2000). [25] d bonamy, l laurent, f daviaud, electrical conductance of thermally perturbated packing, in: powders and grains 2001, eds. y kishino, pag. 77, balkema, rotterdam (2001). [26] t divoux, i vassilief, h gayvallet, j-c géminard, ageing of a granular pile induced by thermal cycling, in: powders and grains 2009, aip conference proceedings 1145, 473 (2009). [27] j-c géminard, h gayvallet, thermal conductivity of a partially wet granular material, phys. rev. e 64, 041301 (2001). [28] k chen, j cole, c conger, j draskovic, m lohr, k klein, t scheidemantel, p schiffer, packing grains by thermal cycling, nature 442, 257 (2006). [29] t divoux, h gayvallet, j-c géminard, creep motion of a granular pile induced by thermal cycling, phys. rev. lett. 101, 148303 (2008). 020006-13 papers in physics, vol. 2, art. 020006 (2010) / t. divoux [30] k chen, a harris, j draskovic, p schiffer, granular fragility under thermal cycles, granular matter 11, 237 (2009). [31] j-c géminard, habilitation à diriger des recherches, université joseph fourier, grenoble i, p. 32 (2003). available at http://tel.archives-ouvertes.fr /tel-00294761/fr/ [32] t divoux, bruit et fluctuations dans les écoulements de fluides complexes, université de lyon – ens de lyon (2009). available at http://tel.archives-ouvertes.fr [33] w l vargas, j j mccarthy, thermal expansion effects and heat conduction in a granular materials, phys. rev. e 76, 041301 (2007). [34] a mehta, g barker, vibrated powders: a microscopic approach, phys. rev. lett. 67, 394 (1991). [35] g barker, a mehta, transient phenomena, self diffusion, and orientational effectsin vibrated powders, phys. rev. e. 47, 184 (1993). [36] j brujić, p wang, c song, d l johnson o sindt, h a makse, granular dynamics in compaction and stress relaxation, phys. rev. lett. 95, 128001 (2005). [37] t scheller, c huss, g lumay, n vandewalle, s dorbolo, precursors to avalanches in a granular monolayer, phys. rev. e 74, 031311 (2006). [38] j crassous, j f metayer, p richard, c laroche, experimental study of a creeping granular flow at very low velocity, j. stat. mech. p03009 (2008). [39] s slotterback, m toiya, l goff, j f douglas, w losert, correlation between particle motion and voronoi-cell-shape fluctuations during the compaction of granular matter, phys. rev. lett. 101, 258001 (2008). [40] l djaoui, j crassous, probing creep motion in a granular materials with light scattering, granular matter 7, 1985 (2005). [41] j crassous, m erpelding, a amon, diffusive waves in a dilating scattering medium, phys. rev. lett. 103, 013903 (2009). [42] b utter, r behringer, experimental measures of affine and nonaffine deformation in granular shear, phys. rev. lett. 
100, 208302 (2008). [43] a s keys, a r abate, s c glotzer, d j durian, measurement of growing dynamical length scales and prediction of the jamming transition in a granular material, nature phys. 3, 260 (2007). [44] c donati, j f douglas, w kob, s j plimpton, p h poole, s c glotzer, stringlike cooperative motion in a supercooled liquid, phys. rev. lett. 80, 2338 (1998). [45] j f douglas, j dudowicz, k f reed, does equilibrium polymerization describe the dynamic heterogeneity of glass-forming liquids ?, j. chem. phys. 125, 144907 (2006). [46] w l vargas, j j mccarthy, conductivity of granular media with stagnant interstitial fluids via thermal particle dynamics, int. j. heat mass transfer 45, 4847 (2002). [47] the highest frequency accessible is roughly given by the time τ ≡ √ g/d for a grain to fall freely under gravity over a size equal to its diameter d. for d = 1 mm, one gets: τ−1 ' 100 hz. [48] note that the compaction process seems to be controlled both by the acceleration γ and by its duration t = 2π/ω. thus, the relevant parameter for tapping experiments may not be γ ∼ a/t 2 , but rather γt ∼ a/t as claimed in [93]. however we discuss the results of tapping experiments in terms of γ as it is the paramater that has been used in the literature until now to describe the experimental observations. [49] p philippe, étude théorique et expérimentale de la densification de milieux granulaires, université de rennes i (2002). available at http://tel.archives-ouvertes.fr 020006-14 papers in physics, vol. 2, art. 020006 (2010) / t. divoux [50] a kabla, g debregeas, contact dynamics in a gently vibrated granular pile, phys. rev. lett. 92, 035501 (2004). [51] p umbanhowar, m van hecke, force dynamics in weakly vibrated granular packings, phys. rev. e 72, 030301 (2005). [52] p evesque, j rajchenbach, instability in a sand heap, phys. rev. lett. 62, 44 (1989). [53] p philippe, d bideau, compaction dynamics of a granular medium under vertical tapping, europhys. lett. 60, 677 (2002). [54] p ribiere, p philippe, p richard, r delannay, d bideau, slow compaction of granular systems, j. phys.: condens. matter 17, 2743 (2005). [55] p krapivsky, e ben-naim, collective properties of adsorption-desorption process, j. chem. phys. 100, 6778 (1993). [56] e ben-naim, j knight, e nowak, e jaeger, s nagel, slow relaxation in granular compaction, physica d 123, 380 (1998). [57] e caglioti, v loreto, h herrmann, m nicodemi, a “tetris-like” model for the compaction of dry granular media, phys. rev. lett. 79, 1575 (1997). [58] m nicodemi, a coniglio, h. herrmann, frustration and flow dynamics in granular packings, phys. rev. e 55, 3962 (1997). [59] t boutreux, p de gennes, compaction of granular mixtures: a free volume model, physica a 244, 59 (1997). [60] s linz, phenomenological modeling of the compaction dynamics of shaken granular systems, phys. rev. e 54, 2925 (1996). [61] s linz, a. döhle, minimal relaxation law for compaction of tapped granular matter, phys. rev. e 60, 5737 (1999). [62] note that nf follows an arrhenius-like law with γ: nf ∝ exp(−γ/γ0) [92] which hints to a stronger analogy between vibrated granular piles and glassy systems [4], the reduced acceleration being equivalent to the effective temperature. [63] p ribière, p richard, r delannay, d bideau, m toiya, w losert, effect of rare events on out-of-equilibrium relaxation phys. rev. lett. 95, 268001 (2005). 
[64] we have to mention that the results of the compaction dynamics induced by thermal cycling and obtained by numerical simulations were fitted in [33] by the fractional mittagleffler law which is rather similar to the kww law. however, the steady state packing fraction is obtained over a short number of cycles (roughly 10) and the use of the fractional mittag-leffler law seems to be farfetched as the change in packing fraction presented in fig. 10 in [33] could easily be fitted by a single (or by the sum of two) exponential(s), or the inverse logarithmic law proposed by knight et al. [2, 4]. [65] s mazoyer, l cipelletti, l ramos, origin of the slow dynamics and the aging of a soft glass, phys. rev. lett. 97, 238 (2006). [66] s mazoyer, l cipelletti, l ramos, direct space investigation of the ultraslow ballistic dynamics of a soft glass, phys. rev. e 79, 011501 (2009). [67] l ramos, l cipelletti, intrinsic aging and effective viscosity in the slow dynamics of a soft glass with tunable elasticity, phys. rev. lett. 94, 158301 (2005). [68] l ramos, l cipelletti, ultraslow dynamics and stress relaxation in the aging of a soft glassy system, phys. rev. lett. 87, 245503 (2001). [69] a duri, l cipelletti, length scale dependence of dynamical heterogeneity in a colloidal fractal gel, europhys. lett. 76, 912 (2006). [70] p viklander, d eigenbrod, stone movements and permeability changes in till caused by freezing and thawing, cold reg. sci. technol. 31, 151 (2000). [71] p viklander, laboratory study of stone heave in till exposed to freezing and thawing, cold reg. sci. technol. 27, 141 (1998). 020006-15 papers in physics, vol. 2, art. 020006 (2010) / t. divoux [72] e aharonov, d sparks, rigidity phase transition in granular packings, phys. rev. e 60, 6890 (1999). [73] e kolstrup, t thyrsted, stone heave field experiment in clayey silt, geomorphology 117, 90 (2010). [74] s ulrich, m schröter, h swinney, influence of friction on granular segregation, phys. rev. e 76, 042301 (2007). [75] l bocquet, e charlaix, s ciliberto, j crassous, moisture-induced ageing in granular media and the kinetics of capillary condensation, nature 396, 735 (1998). [76] l bocquet, e charlaix, f restagno, physics of humid granular media, cr physique 3, 207 (2002). [77] w losert, j-c géminard, s nasuno, j p gollub, mechanisms for slow strengthening in granular materials, phys. rev. e 61, 4060 (2000). [78] h gayvallet, j-c géminard, ageing of the avalanche angle in immersed granular matter, eur. phys. j. b 30, 369 (2002). [79] f villaruel, b lauderdale, d mueth, h jaeger, compaction of rods: relaxation and ordering in vibrated, anisotropic granular material, phys. rev. e 61, 6914 (2000). [80] g lumay, n vandewalle, compaction of anisotropic granular materials: experiments and simulations, phys. rev. e 70, 051314 (2004). [81] p ribière, p richard, d bideau, r delannay, experimental compaction of anisotropic granular media, eur. phys. j. e 16, 415 (2005). [82] g lumay, n vandewalle, experimental study of the compaction dynamics for twodimensional anisotropic granular materials, phys. rev. e 74, 021301 (2006). [83] f ludewig, s dorbolo, n vandewalle, effect of friction in a toy model of granular compaction, phys. rev. e 70, 051304 (2004). [84] n vandewalle, g lumay, o gerasimov, f ludewig, the influence of grain shape, friction and cohesion on granular compaction dynamics, eur. phys. j. e 22, 241 (2007). [85] h makse, j brujnić, s edwards, statistical mechanics of jammed matter, eds. 
h hinrichsen, d e wolf, wiley-vch, berlin (2004) [86] m schröter, d goldman, h swinney, stationary state volume fluctuations in a granular medium, phys. rev. e 71, 030301 (2005). [87] p ribiere, p richard, p philippe, d bideau, r delanay, on the existence of stationary state during granular compaction, eur. phys. j. e 22, 249 (2007). [88] e r nowak, j b knight, e ben-naim, h m jaeger, s r nagel, density fluctuations in vibrated granular materials, phys. rev. e 57, 1971 (1998). [89] c josserand, a v tkachenko, d m mueth, h m jaeger, memory effects in granular materials, phys. rev. lett. 85, 3632 (2000). [90] p richard, p philippe, f barbe, s bourlès, x thibault, d bideau, analysis by x-ray microtomography of a granular packing undergoing compaction, phys. rev. e 68, 020301 (2003). [91] w l vargas, j j mccarthy, heat conduction in granular materials, aiche journal 47, 1052 (2001). [92] p philippe , d bideau, granular medium under vertical tapping: change of compaction and convection dynamics around the liftoff threshold, phys. rev. lett. 91, 104302 (2003). [93] j dijksmann, m van hecke, the role of tap duration for the steady-state density of vibrated granular media, europhys. lett. 88, 44001 (2009). [94] w l vargas, j j mccarthy, stress effects on the conductivity of particulate beds, chem. eng. sci. 57, 3119 (2002). 020006-16 papers in physics, vol. 2, art. 020007 (2010) received: 10 november 2010, accepted: 12 november 2010 edited by: i. ippolito licence: creative commons attribution 3.0 doi: 10.4279/pip.020007 www.papersinphysics.org issn 1852-4249 commentary on “effect of temperature on a granular pile” antonio coniglio1∗, massimo pica ciamarra2 the author of [1] writes an interesting and critical review on the effect of temperature in a granular pile. first, it is shown the effect of temperature in several experiments. then, it is shown how the compaction experiments, originally done by shaking granular material, can equally be performed by thermal cycling. to a first approximation, one may expect that at the particle level the only effect of a temperature variation is a small particle volume change. if this is the case, in a thermal cycle the volume fraction of the system changes, and a thermal cycle could be seen as a compression cycle. this analogy suggests that the initial and final states of a cycle differ because of the disorder of the system. to understand this point, it is convenient to first consider the effect of a particle expansion cycle in a crystalline structure of spherical balls. in this structure, all inter–particle forces are equal in magnitude, and the net force acting on each particle is zero. on inflating the particles, all forces increase exactly by the same amount, and the net force acting on each particle never changes. accordingly, particles keep their position and no motion occurs. conversely, in the presence of disorder, the inter–particle forces are different and vary by different amounts on inflating the particles. particle swelling drives the ∗e-mail: coniglio@na.infn.it 1 dip.to di scienze fisiche, universitá di napoli “federico ii” and cnr–spin, naples, italy 2 cnr–spin, universitá di napoli “federico ii”, monte s. angelo via cinthia, 80126 napoli, italy. system out of mechanical equilibrium, and induces motion. using different words, one may say that ordered structures respond in an affine way to particle swelling, while disordered structures are characterized by a non–affine response. 
this picture, which is also valid for frictionless particles, becomes more involved in the presence of friction. in fact, one should consider that the microscopic origin of friction lies in the asperities of the surfaces; the shearing of the frictional contacts induced by the thermal cycle may cause them to break. in a series of thermal cycles, contacts repeatedly break, allowing the system to compact. one may also speculate that, apart from the temperature variation, which controls the relative volume change of the particles, an important control parameter in thermal cycles is the absolute value of the temperature. in fact, the height of the asperities is expected to decrease as particles become bigger. the role of friction in granular materials has been extensively investigated in the literature (see, for instance, the paper by song et al. [5] and a recent review [4]), and we have recently proposed a jamming phase diagram in a three-dimensional space, where the axes are the volume fraction, the shear stress and the friction coefficient [2]. in this line of research, the results of ref. [1] are of particular interest; in fact, relating friction to temperature may make it possible to tune the friction coefficient experimentally and to validate different proposed theoretical scenarios. we take the occasion to present a speculative picture regarding the role of friction in sheared granular systems, making an analogy between frictional sheared granular systems and thermal systems. in this picture, we speculate on an analogy between the ratio µ/σ and the ratio ε/t. here, µ and σ are the friction coefficient and the shear stress of a granular system, while ε measures the strength of the attractive force between thermal particles, and t is the temperature. we associate µ with ε since, the higher the µ, the stickier the contact, i.e., the greater the shear force the contact is able to sustain. we have investigated the limits of validity of this analogy by performing molecular dynamics simulations of sheared granular systems in three dimensions. particles are confined between two parallel rough plates in the x–y plane; the bottom plate is fixed, while the other may move, and the shear force is applied to it. periodic boundary conditions are applied along x and y. details of the numerical model and of the investigated system are given in refs. [2, 3]. we vary the volume fraction (changing the number of particles at constant volume), the shear stress σxy and the friction coefficient µ.

figure 1: pressure σzz as a function of the volume fraction φ, for $\sigma = 2\times10^{-3}$ and µ = 0.1, in a small (main panel) and in a much larger (inset) pressure range. circles correspond to measures taken when the system flows, and diamonds to measures taken in the jammed phase. full symbols correspond to measures taken in the steady state, while the open circles for φj1 < φ < φj2 correspond to measures taken in flowing metastable states which jam at long times.

figure 2: normal pressure of the confining plate σzz as a function of the volume fraction φ at $\sigma = 5\times10^{-2}$. different symbols correspond to φj1 (squares) and φj3 (circles), for different values of the friction coefficient, from µ = 0 (top) to µ = 0.8 (bottom). the shaded area can therefore be identified with the coexistence region.

figure 1 illustrates a typical pressure versus volume fraction curve, for a fixed value of the shear stress, where three transitions are highlighted. here, σzz is the normal force acting on the confining plate per unit surface.
for φ < φj1 , the system is in a steady flowing state. for φj1 < φ < φj2 , the system is found either in a metastable flowing state, or in a equilibrium disordered solid state able to sustain the applied stress. when in the metastable state, the system flows with a constant velocity for a long time, but it suddenly jams in an equilibrium solid-like state.1 for φj2 < φ < φj3 , the system quickly jams in response to the applied stress. for φ > φj3 , the system responds as a solid to the applied stress. the equilibrium value of the pressure σzz, marked by solid symbols in fig. 1, increases in the flowing state, it discontinuously jumps to a different value at φj1 , and grows for φ > φj3 . this scenario, and particularly the presence of a density range where the pressure is constant, suggests to interpret the φj1 –φj3 segment as a coexistent line. we have investigated the limit of validity of this scenario performing a number of simulations at different values of the friction coefficient µ. in the proposed analogy, low values of µ correspond to a high 1the terms ‘metastable’ and ‘equilibrium’ are used to indicate states with a finite/infinite lifetime, respectively. 020007-2 papers in physics, vol. 2, art. 020007 (2010) / a. coniglio et al. t/ε ratio. for each value of µ, we have estimated φj1 (µ) and φj3 (µ), which are the two extrema of the coexistence line at that value of µ. as show in fig. 2, these two lines allow to identify the (analogous to the) region in the σzz–φ plane. at µ = 0, φj1 (µ) = φj3 (µ), and the coexistence area ends in what should be the critical point, which here occurs at infinite temperature (as µ ∝ 1/t = 0). at finite friction, φj1 (µ) < φj3 (µ), and coexistence lines are found, as shown in the figure for few values of µ. the coexistence area has a lower bound, which is found in the limit of high friction. these results suggest that it is not unreasonable to associate the friction coefficient of sheared granular systems to the inverse temperature of thermal systems. a deeper investigation is required to define the limits of validity of this analogy. [1] t divoux, invited review: effect of temperature on a granular pile, pap. phys. 2, 020006 (2010). [2] m pica ciamarra, r pastore, m nicodemi, a coniglio, jamming phase diagram for frictional particles, arxiv:0912.3140v1 (2009). [3] d s grebenkov, m pica ciamarra, m nicodemi, a coniglio, flow, ordering, and jamming of sheared granular suspensions, phys. rev. lett. 100, 078001 (2008). [4] m van hecke, jamming of soft particles: geometry, mechanics, scaling and isostaticity, j. phys.: condens. matter 22, 033101 (2010). [5] c song, p wang, h a makse, a phase diagram for jammed matter, nature 453, 629 (2008). 020007-3 papers in physics, vol. 11, art. 110003 (2019) received: 29 october 2018, accepted: 30 april 2019 edited by: a. goñi, a. cantarero, j. s. reparaz licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.110003 www.papersinphysics.org issn 1852-4249 pressure-induced lifshitz transition in fese0.88s0.12 probed via 77se-nmr t. kuwayama,1 k. matsuura,2 y. mizukami,2 s. kasahara,3 y. matsuda,3 t. shibauchi,2 y. uwatoko,4 n. fujiwara1∗ recently, fese1−xsx systems have received much attention because of the unique pressure– temperature phase diagram. we performed 77se-nmr measurements on a single crystal of fese0.88s0.12 to investigate its microscopic properties. 
the shift of the 77se spectra exhibits an anomalous enhancement at 1.0 gpa, suggesting that a topological change in the fermi surfaces, the so-called lifshitz transition, occurs at this pressure. the magnetic fluctuation simultaneously changes its properties, which implies a change in the dominant nesting vector.

∗email: naoki@fujiwara.h.kyoto-u.ac.jp
1 graduate school of human and environmental studies, kyoto university, 606-8501 kyoto, japan.
2 graduate school of frontier sciences, university of tokyo, 277-8581 kashiwa, japan.
3 division of physics and astronomy, graduate school of science, kyoto 606-8502 kyoto, japan.
4 institute for solid state physics, university of tokyo, 277-8581 kashiwa, japan.

i. introduction

in contrast to most iron pnictides, fese undergoes nematic and superconducting (sc) transitions without any magnetism: in iron pnictides, such as the bafe2as2 family, a sc phase emerges near an antiferromagnetic (afm) phase, which accompanies a tetragonal-to-orthorhombic transition, the so-called nematic transition [1]. the electronic state of fese dramatically changes under pressure [2]. the nematic transition temperature ts is suppressed with increasing pressure and the afm order is induced instead. these phases overlap each other in the pressure range 1.2 gpa < p < 2.0 gpa. the sc transition temperature tc exhibits a double-dome structure and reaches ∼ 37 k at 6.0 gpa. such a complicated pressure–temperature (p–t) phase diagram makes it difficult to understand the origin of the high tc. recently, the detailed p–t phase diagram for s-substituted fese, fese1−xsx (0 < x < 0.17), has been obtained from resistivity measurements [3]. intriguingly, the nematic and afm phases are completely separated at intermediate s concentrations (0.04 < x < 0.12). for these compositions, the sc dome appears in a moderate pressure region. therefore, a bare sc phase is more easily attainable than in pure fese. to understand the pairing mechanism of fese systems, the 12%-s doped sample is preferred over the pure sample, because a high tc of over 25 k is attainable at low pressures (∼ 3 gpa), and it is free from the complicated overlapping of the nematic, sc, and afm states.

ii. experimental methods

we performed 77se-nmr measurements on a 12%-s doped single crystal, fese0.88s0.12, up to 3.0 gpa with a fixed field of 6.02 t applied parallel to the a axis. a single crystal with dimensions of about 1.0×1.0×0.5 mm3 was used for the measurements. we used a nicral pressure cell [4] and daphne oil as the pressure-transmitting medium. the pressure was determined by ruby fluorescence measurements [4]. we placed the crystal in the pressure cell so that the fese plane was parallel to the applied field.

iii. experimental results

i. determination of tc

figure 1 shows the t dependence of the ac susceptibility at several pressures, measured with the tank circuit of an nmr probe.

figure 1: the t dependence of the ac susceptibility at several pressures. (a) and (b) show the ac susceptibilities at zero field and 6.02 t, respectively. the dashed lines correspond to the linear fittings, and the intersection points represent the superconducting transition points, tcs.

to clarify the influence of the magnetic field on tc, we measured the susceptibilities not only at zero field, but also at 6.02 t. the resonant frequency of the circuit, f, is expressed as follows:

$f = \frac{1}{2\pi\sqrt{L(1+4\pi\chi)C}}$ (1)

where l, c, and χ are the coil inductance, the capacitance of the variable capacitor, and the ac susceptibility, respectively. when a sample undergoes a sc transition, f diverges due to the meissner effect (χ = −1/4π). we determined tc from the intersection point of the linear fittings (fig. 1). tc increases from ∼ 9 k at ambient pressure up to ∼ 27 k at 3.0 gpa. we found that tc at 1.0 gpa was anomalously suppressed at 6.02 t, and the system did not undergo the sc transition above 4.2 k. in contrast, the tcs at 2.0 and 3.0 gpa are only slightly decreased by the field, as shown in fig. 1.
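the intersection construction of fig. 1 is straightforward to reproduce numerically. the sketch below only illustrates the procedure: the frequency values and the fitting ranges are hypothetical, not the authors' data.

```python
import numpy as np

# hypothetical tank-circuit frequencies f(T) (MHz) around the superconducting kink
t_low = np.array([5.0, 7.0, 9.0, 11.0, 13.0, 15.0])
f_low = np.array([12.60, 12.51, 12.42, 12.33, 12.24, 12.15])
t_high = np.array([17.0, 19.0, 21.0, 23.0, 25.0])
f_high = np.array([12.10, 12.10, 12.11, 12.10, 12.11])

# straight-line fits to the Meissner (low-T) and normal-state (high-T) branches
a1, b1 = np.polyfit(t_low, f_low, 1)
a2, b2 = np.polyfit(t_high, f_high, 1)

# Tc is taken as the intersection point of the two fitted lines
tc = (b2 - b1) / (a1 - a2)
print("Tc ~ %.1f K" % tc)
```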
ii. 77se-nmr spectra and 77se shift

we measured 77se-nmr (i = 1/2, γ/2π = 8.118 mhz/t) spectra on fese0.88s0.12 with a fixed field of 6.02 t. figure 2(a) shows the t evolution of the spectra at ambient pressure. a single 77se signal in the tetragonal state (t > 60 k) becomes a double-peak structure below ts ∼ 60 k, which is in good agreement with the structural transition temperature observed by the resistivity measurements [3].

figure 2: (a) the t evolution of the 77se spectrum at ambient pressure. the black dashed line shows the peak frequencies. (b) the 77se shift at ambient pressure determined from a single gaussian fit. ka and kb reflect the high- and low-frequency peaks, respectively.

figures 2(b) and 3 show the t dependence of the 77se shift at ambient pressure and the shift at several pressures, respectively. the average of the peaks below ts is plotted for the data at ambient pressure in fig. 3.

figure 3: the 77se shift in the non-sc state at several pressures. the average of ka and kb is plotted below 60 k. the inset shows the 77se shift at 70 k, reflecting the pressure dependence of the dos.

the shift k is proportional to the density of states (dos). in general, the dos changes monotonically with increasing pressure due to a change in the bandwidth. in our sample, however, the dos is enhanced at 1.0 gpa, and then it decreases with increasing pressure. as discussed below, the origin of this anomalous p dependence of the dos could be interpreted as a topological change in the fermi surfaces, the so-called lifshitz transition.

iii. the relaxation rate divided by temperature, 1/t1t

figure 4 shows the relaxation rate divided by temperature, 1/t1t. we measured the relaxation time t1 with the inversion-recovery method for 77se. the relaxation rate provides a measure of the low-energy spin fluctuations. in general, an afm fluctuation is enhanced when a system comes near an afm phase. by contrast, the afm fluctuation of fese0.88s0.12 is strongly suppressed at 1.0 gpa and is only slightly enhanced above 2.0 gpa, although the afm phase is induced above 3.0 gpa.
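for a spin-1/2 nucleus such as 77se, t1 is commonly extracted by fitting a single-exponential recovery to the inversion-recovery data. the sketch below is a generic illustration with hypothetical delays, intensities and temperature, and an inversion factor alpha left as a free parameter; it is not the authors' analysis code.

```python
import numpy as np
from scipy.optimize import curve_fit

def recovery(t, m0, alpha, t1):
    # inversion recovery for a spin-1/2 nucleus: M(t) = M0*(1 - alpha*exp(-t/T1)),
    # with alpha close to 2 for an ideal inversion pulse
    return m0 * (1.0 - alpha * np.exp(-t / t1))

# hypothetical recovery delays (s) and spin-echo intensities (arb. units)
t = np.array([1e-3, 3e-3, 1e-2, 3e-2, 1e-1, 3e-1, 1.0, 3.0])
m = recovery(t, 1.0, 1.9, 0.12) + np.random.default_rng(1).normal(0.0, 0.01, t.size)

(m0, alpha, t1), _ = curve_fit(recovery, t, m, p0=(1.0, 2.0, 0.1))
temperature = 20.0  # K, hypothetical measurement temperature
print("T1 ~ %.0f ms, 1/(T1*T) ~ %.2f 1/(s K)" % (t1 * 1e3, 1.0 / (t1 * temperature)))
```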
iv. discussion

from the results mentioned above, we suggest that the lifshitz transition at around 1.0 gpa is crucial to understanding the anomalies of fese0.88s0.12. firstly, the dos suggested from the 77se shift shows that some kind of anomaly occurs at 1.0 gpa, as mentioned above (see the inset of fig. 3). according to a recent theoretical investigation of fese, a lifshitz transition may occur when the lattice constants are reduced [5]. s-substitution is isovalent, and s-substituted fese has smaller lattice constants than pure fese [6]. furthermore, applying pressure also compresses the lattice. in our sample, fese0.88s0.12, the lifshitz transition may therefore account for the anomaly in the dos.

figure 4: the t dependence of the relaxation rate divided by t, 1/t1t. the dashed lines are a guide to the eye. the inset shows the phase diagram of fese0.88s0.12 determined from the resistivity measurements [3].

assuming that the lifshitz transition occurs at around 1.0 gpa, the fermi surfaces are reconstructed, and the reconstruction of the fermi surfaces could induce a change in the dominant nesting vector. when the dominant nesting vector changes, it is possible that the afm fluctuation at 3.0 gpa becomes weaker than that at ambient pressure, even though the afm phase appears in the high-pressure region. to clarify this scenario, it is necessary to determine the spin configuration of the pressure-induced afm phase from measurements in the higher pressure region.

v. conclusions

we carried out 77se-nmr measurements on fese0.88s0.12, and the 77se shift suggests that the dos exhibits an anomalous enhancement at 1.0 gpa. the lifshitz transition, a change in the topology of the fermi surface, could account for this anomaly. the fermi surfaces are reconstructed due to the lifshitz transition, resulting in a change of the dominant nesting vector. this is the reason why the afm fluctuation at ambient pressure is stronger than that at 3.0 gpa, despite the afm order being induced above 3.0 gpa.

acknowledgements — the nmr work was supported by jsps kakenhi grant number jp18h01181 and a grant from the mitsubishi foundation. we thank h. kontani and p. toulemonde for discussions.

[1] r m fernandes, a v chubukov, j schmalian, what drives nematic order in iron-based superconductors?, nat. phys. 10, 97 (2014).
[2] j p sun, k matsuura, g z ye, y mizukami, m shimozawa, k matsubayashi, m yamashita, t watashige, s kasahara, y matsuda, j q yan, b c sales, y uwatoko, j g cheng, t shibauchi, dome-shaped magnetic order competing with high-temperature superconductivity at high pressures in fese, nat. commun. 7, 12146 (2016).
[3] k matsuura, y mizukami, y arai, y sugimura, n maejima, a machida, t watanuki, t fukuda, t yajima, z hiroi, k y yip, y c chan, q niu, s hosoi, k ishida, k mukasa, s kasahara, j g cheng, s k goh, y matsuda, y uwatoko, t shibauchi, maximizing tc by tuning nematicity and magnetism in fese1−xsx superconductors, nat. commun. 8, 1143 (2017).
[4] n fujiwara, t matsumoto, k nakazawa, a hisada, y uwatoko, fabrication and efficiency evaluation of a hybrid nicral pressure cell up to 4 gpa, rev. sci. instrum. 78, 073905 (2007).
[5] s l skornyakov, v i anisimov, d vollhardt, i leonov, correlation strength, lifshitz transition, and the emergence of a two-dimensional to three-dimensional crossover in fese under pressure, phys. rev. b 97, 115165 (2018).
[6] j n millican, d phelan, e l thomas, j b leão, e carpenter, pressure-induced effects on the structure of the fese superconductor, solid state commun. 149, 707 (2009).

papers in physics, vol. 4, art. 040005 (2012) received: 3 september 2012, accepted: 24 october 2012 edited by: m. c.
barbosa licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.040005 www.papersinphysics.org issn 1852-4249 a criterion to identify the equilibration time in lipid bilayer simulations rodolfo d. porasso,1∗ j.j. lópez cascales,2 with the aim of establishing a criterion for identifying when a lipid bilayer has reached steady state using the molecular dynamics simulation technique, lipid bilayers of different composition in their liquid crystalline phase were simulated in aqueous solution in presence of cacl2 as electrolyte, at different concentration levels. in this regard, we used two different lipid bilayer systems: one composed by 288 dppc (dipalmitoylphosphatidylcholine) and another constituted by 288 dpps (dipalmitoylphosphatidylserine). in this sense, for both type of lipid bilayers, we have studied the temporal evolution of some lipids properties, such as the surface area per lipid, the deuterium order parameter, the lipid hydration and the lipid-calcium coordination. from their analysis, it became evident how each property has a different time to achieve equilibrium. the following order was found, from faster property to slower property: coordination of ions ≈ deuterium order parameter > area per lipid ≈ hydration. consequently, when the hydration of lipids or the mean area per lipid are stable, we can ensure that the lipid membrane has reached the steady state. i. introduction over the last few decades, different computational techniques have emerged in different fields of science, some of them being extensively implemented and used by a great number of scientists around the globe. among others, the molecular dynamics (md) simulation is a very popular computational technique, which is widely used to obtain insight with atomic detail of steady and dynamic properties in the fields of biology, physics and chemistry. in this regard, a critical aspect that must be identified in all the md simulations is related to the required equilibration time to achieve a steady state. ∗e-mail: rporasso@unsl.edu.ar 1 instituto de matemática aplicada san luis (imasl) departamento de f́ısica, universidad nacional de san luis/conicet, d5700hhw, san luis, argentina. 2 universidad politécnica de cartagena, grupo de bioinformática y macromoléculas (biomac) aulario ii, campus de alfonso xiii, 30203 cartagena, murcia, spain. this point is crucial in order to avoid simulation artifacts that could lead to wrong conclusions. currently, with the increment of the computing power accessible to different investigation groups, much longer simulation trajectories are being carried out to obtain reliable information about the systems, with the purpose of approaching the time scale of the experimental phenomena. however, even when this fact is objectively desirable without further objections, nowadays, much longer equilibration times are arbitrarily being required by certain reviewers during the revision process. from our viewpoint, this should be thoroughly revised due to the following two main reasons: first, because it results in a limiting factor in the use of this technique by other research groups which cannot access to very expensive computing centers (assuming that authors provide enough evidence of the equilibration of the system). second, to avoid wasting expensive computing time in the study of certain properties which do not require such long equilibration times, once the steady state of the system has been properly 040005-1 papers in physics, vol. 4, art. 
identified. phospholipid bilayers are of high biological relevance, due to the fact that they play a crucial role in the control of the diffusion of small molecules, cell recognition, and signal transduction, among others. in our case, we have chosen the phosphatidylcholine (pc) bilayer because it has been very well studied by md simulations [1–7] and experimentally as well [8–14]. furthermore, the effects of different types of electrolytes on a pc bilayer have also been studied, experimentally [15–24] and by simulation [25–33]. as mentioned above, molecular dynamics (md) simulations have emerged during the last decades as a powerful tool to obtain insight with atomic detail into the structure and dynamics of lipid bilayers [34–36]. several md simulations of membranes under the influence of different salt concentrations have been carried out. one of the main obstacles related to these studies has been the time scale associated with the binding process of ions to the lipid bilayer. considering the literature, a vast dispersion of equilibration times associated with the binding of ions to the membrane has been reported, with values ranging from 5 to 100 ns suggested for monovalent and divalent cations [25, 27–29, 32, 37, 38]. in this regard, we carried out four independent simulations of a lipid bilayer formed by 288 dppc in aqueous solutions, for different concentrations of cacl2, to provide an overview of their equilibration times. among other properties, the surface area per lipid, the deuterium order parameters, the lipid hydration and the lipid-calcium coordination were studied. finally, in order to generalize our results, a bilayer formed by 288 dpps in its liquid crystalline phase, in presence of cacl2 at 0.25 n, was simulated as well.

ii. methodology

different molecular dynamics (md) simulations of a lipid bilayer formed by 288 dppc were carried out in aqueous solutions for different concentrations of cacl2, from 0 up to 0.50 n. furthermore, with the aim of generalizing our results, a bilayer of 288 dpps in presence of cacl2 at 0.25 n was simulated as well. note that the concentration of cacl2 in terms of normality is defined as

$\text{normality} = \frac{n_{\text{equivalent grams}}}{L_{\text{solution}}}$ (1)

where $n_{\text{equivalent grams}} = \frac{\text{gr(solute)}}{\text{equivalent weight}}$ and $\text{equivalent weight} = \frac{\text{molecular weight}}{n}$, n being the charge of the ions in the solution.

table 1: the simulated bilayer systems. note that the salt concentration is given in normal units. the numerals describe the number of molecules contained in the simulation box.
lipid   [cacl2] (n)   ca2+   cl−   water
dppc    0             0      0     10068
dppc    0.06          5      10    10053
dppc    0.13          12     24    10032
dppc    0.25          23     46    9999
dppc    0.50          46     92    9930
dpps    0.25          204    120   26932

figure 1: structure and atom numbers for dppc and dpps used in this work.

in table 1, the number of molecules that constitute each system, applying eq. (1), is summarized.
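the ion numbers of table 1 can be estimated directly from eq. (1). the sketch below is a rough consistency check rather than the authors' procedure: it assumes that the solution volume is well approximated by the volume of the water alone, taken at a density of 1 g/cm³.

```python
N_A = 6.022e23      # Avogadro's number (1/mol)
M_WATER = 18.015    # molar mass of water (g/mol)
RHO_WATER = 1.0     # assumed density of the solution (g/cm^3)

def cacl2_ions(normality, n_water, ion_charge=2):
    """Number of Ca2+ and Cl- ions giving the target CaCl2 normality in a box
    whose volume is approximated by that of n_water water molecules."""
    volume_litre = n_water * M_WATER / (RHO_WATER * N_A) * 1e-3  # cm^3 -> L
    equivalents = normality * volume_litre * N_A                 # equivalents per box
    n_ca = round(equivalents / ion_charge)
    return n_ca, 2 * n_ca

for conc in (0.06, 0.13, 0.25, 0.50):
    print(conc, cacl2_ions(conc, 10068))
# reproduces, to within one ion, the Ca2+/Cl- numbers of the DPPC rows of table 1
```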
to build up the original system, a single dppc lipid molecule, or dpps lipid (fig. 1), was placed with its molecular axis perpendicular to the membrane surface (xy plane). next, each dppc, or dpps, was randomly rotated and copied 144 times onto each leaflet of the bilayer. finally, the gaps existing in the computational box (above and below the phospholipid bilayer) were filled using an equilibrated box containing 216 water molecules of the extended simple point charge (spc/e) [39] water model. thus, the starting point of the first system of table 1 was formed by 288 dppc in the absence of cacl2. once this first system was generated, the whole system was subjected to a steepest descent minimization to remove any excess of strain associated with overlaps between neighboring atoms of the system. the dppc systems in presence of cacl2 were then generated as follows: to obtain [cacl2] = 0.06 n, 15 water molecules were randomly substituted by 5 ca2+ and 10 cl−. an analogous procedure was applied to the rest of the systems, where 36, 69 and 138 water molecules were substituted by 12, 23 and 46 ca2+ and 24, 46 and 92 cl−, to obtain [cacl2] concentrations of 0.13 n, 0.25 n and 0.50 n, respectively. finally, the dpps bilayer was generated following the same procedure described above for the dppc: starting from a single dpps molecule, and once the lipid bilayer in presence of water passed the minimization process, 324 water molecules were substituted by 204 ca2+ and 120 cl− (note that 144 of the 204 calcium ions were added to balance the negative charge associated with the dpps). the gromacs 3.3.3 package [40, 41] was used for the simulations, and the properties shown in this work were obtained using our own code. the force field proposed by egberts et al. [2] was used for the lipids, and a time step of 2 fs was used as the integration time in all the simulations. a cut-off of 1.0 nm was used for calculating the lennard-jones interactions. the electrostatic interaction was evaluated using the particle mesh ewald method [42, 43]: the real-space interaction was evaluated using a 0.9 nm cut-off, and the reciprocal-space interaction using a 0.12 nm grid with a fourth-order spline interpolation. a semi-isotropic pressure coupling was used, with a reference pressure of 1 atm, which allowed each axis of the computational box to fluctuate independently. for the dppc bilayer, each component of the system (i.e., lipids, ions and water) was coupled to an external temperature bath at 330 k, which is well above the transition temperature of 314 k [44, 45]. for the dpps bilayer, each component of the system was coupled to an external temperature bath at 350 k, which is above the transition temperature [46, 47]. all the md simulations were carried out using periodic boundary conditions. the total trajectory length of each simulated system was 80 ns, with the coordinates of the system recorded every 5 ps for their appropriate analysis. finally, in order to study the effect of temperature, only the case corresponding to 0.25 n cacl2 was investigated at two additional temperatures, 340 k and 350 k.

iii. results and discussion

i. effect of the cacl2 concentration

a. surface area per lipid

the surface area per lipid 〈a〉 is a property of lipid bilayers which has been accurately measured in experiments [48]. the mean area per lipid can be determined from the md simulation as

$\langle A\rangle = \frac{X \cdot Y}{N}$ (2)

where x and y represent the box sizes in the x and y directions (in the plane of the membrane) over the simulation, and n is the number of lipids contained in one leaflet, in our case n = 144.
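eq. (2), combined with a running average, is all that is needed to produce curves like those of fig. 2. a minimal sketch, assuming the lateral box edges have already been extracted from the trajectory into two arrays (the array names and numerical values below are hypothetical):

```python
import numpy as np

def running_area_per_lipid(box_x, box_y, n_leaflet=144, window=1000):
    # eq. (2): <A> = X*Y/N, smoothed here with a simple running mean
    area = box_x * box_y / n_leaflet
    kernel = np.ones(window) / window
    return np.convolve(area, kernel, mode="valid")

# hypothetical box edges (nm) saved every 5 ps over 80 ns (16000 frames)
rng = np.random.default_rng(2)
box_x = 9.80 + 0.05 * rng.standard_normal(16000)
box_y = 9.70 + 0.05 * rng.standard_normal(16000)

area = running_area_per_lipid(box_x, box_y)
# discard the first 10 ns (2000 frames) as equilibration, as done in the text
print("mean area per lipid ~ %.3f nm^2" % area[2000:].mean())
```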
focusing on the time evolution of the area per lipid, figure 2 depicts the running surface area per lipid for different concentrations of cacl2 and both types of lipid.

figure 2: running area per lipid at t = 330 k in presence of [cacl2] at (a) 0.06 n, (b) 0.13 n, (c) 0.25 n, (d) 0.50 n and (e) 0.25 n (in this case, t = 350 k). solid lines represent the mean area obtained from the last 70 ns of the simulated trajectories (see text for further explanation). the type of lipid is indicated in the legends.

in general, for the 5 bilayers formed by dppc or dpps, the area per lipid achieved a steady state after 10 ns of simulation, this equilibration time being almost independent of the concentration of cacl2 and of the type of lipid which composed the membrane. in the absence of salt, an average area per lipid of 〈a〉 = 0.663 ± 0.008 nm2 was calculated from the last 70 ns of the simulated trajectory, discarding the first 10 ns corresponding to the equilibration time. this value agrees with experimental data, where values in a range from 0.55 to 0.72 nm2 have been measured [10, 11, 48–51]. table 2 shows the mean surface area per lipid (again, after discarding the equilibration time of 10 ns) with the corresponding error bars.

table 2: area per lipid and lipid hydration number as a function of salt concentration (see text for further explanation). note that the salt concentration is given in normal units. error bars were calculated for each system separately from subtrajectories of 10 ns length. simulation temperature = 330 k.
lipid   [cacl2] (n)   〈a〉 (nm2)       hydration number
dppc    0             0.663 ± 0.008   1.758 ± 0.009
dppc    0.06          0.658 ± 0.008   1.740 ± 0.009
dppc    0.13          0.651 ± 0.007   1.719 ± 0.010
dppc    0.25          0.641 ± 0.009   1.680 ± 0.015
dppc    0.50          0.628 ± 0.010   1.610 ± 0.015
dpps    0.25          0.522 ± 0.007   2.552 ± 0.010

from the simulation results, a shrinking of the surface area per lipid with the increment of the ionic strength of the solution is observed. this shrinking is expected and attributed to the complexation of lipid molecules by calcium, as has been pointed out in previous studies [28, 29, 52].

b. deuterium order parameter

the deuterium order parameter, scd, is measured in 2h-nmr experiments. this parameter provides relevant information related to the disorder of the hydrocarbon region in the interior of the lipid bilayers by measuring the orientation of the hydrogen dipole of the methylene groups with respect to the axis perpendicular to the lipid bilayer. due to the fact that the hydrogens of the lipid methylene groups (ch2) have not been taken into account (in an explicit way) in our simulations, the reference vector for the (i+1)-th methylene group was defined as the unit vector normal to the vector joining the i-th and (i+2)-th ch2 groups and contained in the plane formed by the methylene groups i, i+1 and i+2. thus, the deuterium order parameter −scd on the i-th ch2 group can be estimated from molecular dynamics simulations as

$-S_{CD} = \frac{1}{2}\langle 3\cos^2(\theta) - 1\rangle$ (3)

where θ is the angle formed between the unit vector defined above and the z axis. the expression in brackets 〈. . .〉 denotes an average over all the lipids and time. hence, note that −scd can adopt any value between −0.5 (corresponding to a parallel orientation to the lipid/water interface) and 1 (oriented along the normal axis to the lipid bilayer).
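the construction described above translates directly into a few lines of vectorized code. the sketch below assumes a hypothetical array of united-atom tail-carbon coordinates of shape (frames, carbons, 3), with the bilayer normal along z, and simply returns the right-hand side of eq. (3) for each methylene group; it is an illustration, not the authors' code.

```python
import numpy as np

def order_parameter_eq3(tail_xyz):
    """Right-hand side of Eq. (3) for a united-atom chain.
    tail_xyz: array (n_frames, n_carbons, 3) of tail-carbon positions."""
    r = np.asarray(tail_xyz, dtype=float)
    a = r[:, 2:, :] - r[:, :-2, :]        # vector joining carbons i and i+2
    b = r[:, 1:-1, :] - r[:, :-2, :]      # vector from carbon i to carbon i+1
    u = a / np.linalg.norm(a, axis=-1, keepdims=True)
    # in-plane unit vector normal to u (the construction described in the text)
    w = b - np.sum(b * u, axis=-1, keepdims=True) * u
    w /= np.linalg.norm(w, axis=-1, keepdims=True)
    cos_theta = w[..., 2]                 # angle with the bilayer normal (z axis)
    return 0.5 * np.mean(3.0 * cos_theta**2 - 1.0, axis=0)  # one value per CH2

# usage (hypothetical): scd = order_parameter_eq3(tail_coordinates)
```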
hence, note that −scd can adopt any value between −0.5 (corresponding to an orientation parallel to the lipid/water interface) and 1 (oriented along the normal axis to the lipid bilayer). figure 3 shows the running −scd for different carbons of the dppc and dpps tails and different salt concentrations. only the carbons corresponding to the initial (hydrocarbons 2 and 6), middle (hydrocarbon 10) and final (hydrocarbons 13 and 15) methylene groups of the lipid tails are depicted in this figure. each point of the figure represents the average value of −scd over 5 ns of subtrajectory, and the lines represent the mean values calculated from the last 70 ns of the simulated trajectories. from this figure, it is observed that in all cases the required equilibration time is less than 10 ns of simulation time, independently of the salt concentration and the type of lipid. finally, it is noted that figure 3 exhibits an increase in the deuterium order parameters with the salt concentration, consistent with the shrinking of the area per lipid described above.
figure 3: running deuterium order parameter, −scd, in the presence of [cacl2] at (a) 0.06 n, (b) 0.13 n, (c) 0.25 n, (d) 0.50 n and (e) 0.25 n. dppc simulations were performed at 330 k and the dpps simulation temperature was 350 k. solid lines represent the mean values of −scd obtained from the last 70 ns of the simulated trajectories. the type of lipid is indicated in the legends. symbols: ◦ hydrocarbon 2; □ hydrocarbon 6; ◃ hydrocarbon 10; + hydrocarbon 13 and × hydrocarbon 15. note that the error bars are of the same size as the symbols.
c. lipid hydration
to analyze the lipid hydration, the radial distribution function g(r) of water around one of the oxygens of the phosphate group (atom number 10 in fig. 1 for dppc and dpps) was calculated. the radial distribution function g(r) is defined as

g(r) = n(r) / (4π r^2 ρ δr)    (4)

where n(r) is the number of atoms in a spherical shell at distance r and of thickness δr from a reference atom, and ρ is the number density, taken as the ratio of the number of atoms to the volume of the total computational box. from numerical integration of the first peak of the radial distribution function, the hydration numbers can be estimated for different atoms of dppc or dpps. figure 4 depicts the hydration number of the phosphate oxygen (atom 10 in fig. 1 for dppc and dpps) in the presence of cacl2, where each point represents the average over 5 ns of subtrajectory. these results show how this property reached a steady state after 10 ns of simulation for cases (a), (b) and (e). however, for cases (c) and (d), 5 ns of extra simulation trajectory were required to reach a steady state.
figure 4: hydration number of the phosphate oxygen (atom 10 in fig. 1) along the simulated trajectories in the presence of [cacl2] at (a) 0.06 n, (b) 0.13 n, (c) 0.25 n, (d) 0.50 n (for dppc, t = 330 k) and (e) 0.25 n (in this case, t = 350 k). solid lines represent the mean value of the hydration number calculated from the last 70 ns of the simulated trajectories. the type of lipid is indicated in the legends. note that the error bars are of the same size as the symbols.
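a minimal sketch of eq. (4) and of the first-peak integration used to obtain hydration numbers (the binning and the cutoff at the first minimum of g(r) are illustrative choices, not taken from the authors' code):

```python
import numpy as np

def radial_distribution(distances, box_volume, n_atoms, r_max=1.2, dr=0.002):
    """g(r) of eq. (4): n(r) / (4*pi*r^2 * rho * dr), averaged over frames.

    distances: (n_frames, n_atoms) array of reference-atom/water-oxygen
    distances (nm); box_volume in nm^3.
    """
    rho = n_atoms / box_volume                      # number density
    edges = np.arange(0.0, r_max + dr, dr)
    r = 0.5 * (edges[1:] + edges[:-1])
    counts = np.array([np.histogram(d, bins=edges)[0] for d in distances])
    n_r = counts.mean(axis=0)                       # average shell occupancy n(r)
    return r, n_r / (4.0 * np.pi * r**2 * rho * dr)

def hydration_number(r, g, rho, r_cut):
    """Coordination number: integrate rho*g(r)*4*pi*r^2 up to the first minimum r_cut."""
    mask = r <= r_cut
    return np.trapz(rho * g[mask] * 4.0 * np.pi * r[mask]**2, r[mask])
```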
table 2 shows the hydration numbers for the last 70 ns of the trajectories, corresponding to the four concentrations of cacl2 and both types of lipid, dppc and dpps. in this regard, from fig. 4, the significant lipid dehydration with increasing ionic strength of the solution is evident, in good accordance with previous results [52].
d. phospholipid-calcium coordination
some authors have reported that the lipid coordination by divalent cations varies widely. on the one hand, some authors [25] have reported that this is a very slow process, requiring about 85 ns of simulation time, while, on the other hand, other authors [26] have suggested that this process is much more rapid, taking less than 1 ns. in this sense, the coordination of dppc-ca2+ was studied by monitoring the oxygen-calcium coordination of the carbonyl oxygens (atoms 16 and 35 in fig. 1) and phosphate oxygens (atoms 9 and 10 of dppc in fig. 1), as a function of time. the left column in fig. 5 represents the oxygen coordination number, while the right one depicts the percentage of calcium ions involved in the coordination process with respect to the total number of calcium ions present in the aqueous solution.
figure 5: the left column represents the number of ca2+ coordinated to lipids in the presence of [cacl2] at (a) 0.06 n, (b) 0.13 n, (c) 0.25 n and (d) 0.50 n, for t = 330 k. the right column shows the number of calcium ions coordinated to lipids, expressed as a percentage, along the simulated trajectory.
figure 5 shows how the dppc coordination by calcium is a quick process, taking less than 5 ns of simulation time to achieve a steady state. the kinetics of this process appears to be related to the calcium/lipid ratio. after the first 5 ns of simulation time, the ca-lipid coordination presents some fluctuations along the rest of the simulated trajectory. in fig. 5 (a) and (b) (the cases of lower concentration), the percentage of coordination fluctuates between 60% and 100%. we consider that this broad fluctuation is related to the limited sample size of our simulations, which introduces a certain amount of noise in our results.
ii. effect of temperature
this section focuses on the role played by temperature in the equilibration process. in this regard, only the system corresponding to a concentration of 0.25 n in cacl2 was studied, for a range of temperatures from 330 k to 350 k (all of them above the transition temperature of 314 k [44, 45] for dppc). figure 6 shows the running area per lipid along the trajectory. in this case, the systems achieve a steady state after a trajectory length of roughly 10 ns; table 3 shows the mean area per lipid calculated from the last 70 ns of simulation time.
table 3: area per lipid and lipid hydration number as a function of temperature. error bars were calculated from subtrajectories of 10 ns length. dppc bilayer in the presence of [cacl2] = 0.25 n.
t (k)   〈a〉 (nm2)        hydration number
330     0.642 ± 0.009    1.680 ± 0.010
340     0.650 ± 0.007    1.683 ± 0.020
350     0.666 ± 0.008    1.689 ± 0.015
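the steady-state criterion used throughout (discarding the first ~10 ns of each observable) can be made operational in many ways; a minimal sketch of one possible criterion (my own illustration, not the authors' procedure): compare block averages of the running observable with the mean of the remaining trajectory.

```python
import numpy as np

def equilibration_time(series, times, block=5_000, tol_sigma=2.0):
    """Estimate the time after which a running observable is stationary.

    Splits the series into blocks (in the same time units as `times`, e.g. ps)
    and returns the start time of the first block whose mean agrees with the
    mean of the remaining trajectory within tol_sigma standard errors.
    """
    series, times = np.asarray(series), np.asarray(times)
    for t0 in np.arange(times[0], times[-1], block):
        blk = series[(times >= t0) & (times < t0 + block)]
        tail = series[times >= t0 + block]
        if blk.size == 0 or tail.size < blk.size:
            break
        sem = tail.std(ddof=1) / np.sqrt(tail.size)
        if abs(blk.mean() - tail.mean()) < tol_sigma * max(sem, 1e-12):
            return float(t0)
    return None  # no stationary region detected

# applied to a running observable such as the area per lipid or the hydration
# number, the returned time plays the role of the ~10 ns equilibration time
# discussed in the text.
```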
figure 7 shows the deuterium order parameter of the methylene groups along the lipid tails, calculated from eq. (3). on the one hand, figure 7 clearly shows that, for the three temperatures, the systems have reached the steady state before the first 10 ns of simulation. on the other hand, it shows an increase in the disorder of the lipid tails with temperature, which is closely related to the increase of the area per lipid pointed out above. figure 8 depicts the hydration numbers of dppc for the three temperatures studied, where the equilibrated state was achieved after 10 ns of simulation time. table 3 provides the hydration numbers calculated at equilibrium, showing that the lipid hydration remained essentially unchanged as the temperature was raised. concerning the lipid-calcium coordination, the left column of fig. 9 represents the lipid-calcium coordination number, and the right column represents the calcium that participates in the coordination, expressed as a percentage of the total number of calcium ions in solution. from the simulations, it becomes evident that calcium ions required less than 5 ns to achieve an equilibrated state for the three temperatures studied. in summary, for all the properties studied in this section, a slight decrease in the equilibration time with increasing temperature was observed.
figure 6: running area per lipid for [cacl2] = 0.25 n at different temperatures, (a) t = 330 k, (b) t = 340 k and (c) t = 350 k. solid lines represent the mean values obtained from the last 70 ns of simulation.
figure 7: deuterium order parameter, −scd, along the simulated trajectory for a concentration of [cacl2] = 0.25 n, at the following temperatures: (a) t = 330 k, (b) t = 340 k and (c) t = 350 k. solid lines represent the average order parameter for the last 70 ns of simulation. note that the error bars are of the same size as the symbols.
figure 8: hydration number of the phosphate oxygen (atom 10 in fig. 1) along the simulated trajectories for [cacl2] = 0.25 n at different temperatures: (a) t = 330 k, (b) t = 340 k and (c) t = 350 k. solid lines represent the average hydration number for the last 70 ns of simulation. note that the error bars are of the same size as the symbols.
iv. conclusions
the present work deals with the simulation time required to achieve a steady state for a lipid bilayer system in the presence of cacl2. in this regard, we studied two different systems, one with a dppc and another with a dpps bilayer, both in the presence of cacl2 at different concentrations. the salt-free case was also studied as a control. the analysis of the various lipid properties studied here indicates that some properties reach the steady state more quickly than others. in this sense, we found that the area per lipid and the hydration number are slower than the deuterium order parameter and the coordination of cations. consequently, to ensure that a system composed of a lipid bilayer has reached a steady state, the criterion we propose is to show that the area per lipid or the hydration number has reached equilibrium. from our results, two important aspects should be remarked:
1. the equilibration time is strongly dependent on the starting conformation of the system. a poor starting conformation will require a much longer equilibration time, even one order of magnitude longer than that required by a more refined starting conformation.
2. temperature is a critical parameter for reducing the equilibration time in our simulations, since higher temperatures speed up the kinetic processes, i.e., the sampling of the configurational space of the system.
figure 9: the left column represents the number of calcium ions involved in the lipid coordination along time for a concentration of [cacl2] = 0.25 n at different temperatures: (a) t = 330 k, (b) t = 340 k and (c) t = 350 k. the right column shows the same information expressed as a percentage of the total number of calcium ions in solution.
acknowledgements
the authors wish to thank the assistance of the computing center of the universidad politécnica de cartagena (sait), spain. rdp is a member of the ‘carrera del investigador’, conicet, argentina.
[1] o berger, o edholm, f jahnig, molecular dynamics simulations of a fluid bilayer of dipalmitoylphosphatidylcholine at full hydration, constant pressure, and constant temperature, biophys. j. 72, 2002 (1997). [2] e egberts, s j marrink, h j c berendsen, molecular dynamics simulation of a phospholipid membrane, eur. biophys. j. 22, 423 (1994). [3] u essmann, l perera, m l berkowitz, the origin of the hydration interaction of lipid bilayers from md simulation of dipalmitoylphosphatidylcholine membranes in gel and crystalline phases, langmuir 11, 4519 (1995). [4] s e feller, y zhang, r w pastor, r b brooks, constant pressure molecular dynamics simulations: the langevin piston method, j. chem. phys. 103, 4613 (1995). [5] w shinoda, t fukada, s okazaki, i okada, molecular dynamics simulation of the dipalmitoylphosphatidylcholine (dppc) lipid bilayer in the fluid phase using the nosé-parrinello-rahman npt ensemble, chem. phys. lett. 232, 308 (1995). [6] d p tieleman, h j c berendsen, molecular dynamics simulations of a fully hydrated dipalmitoylphosphatidylcholine bilayer with different macroscopic boundary conditions and parameters, j. chem. phys. 105, 4871 (1996). [7] k tu, d j tobias, m l klein, constant pressure and temperature molecular dynamics simulation of a fully hydrated liquid crystal phase dipalmitoylphosphatidylcholine bilayer, biophys. j. 69, 2558 (1995). [8] m f brown, theory of spin-lattice relaxation in lipid bilayers and biological membranes. dipolar relaxation, j. chem. phys. 80, 2808 (1984). [9] m f brown, theory of spin-lattice relaxation in lipid bilayers and biological membranes. 2h and 14n quadrupolar relaxation, j. phys. chem. 77, 1576 (1982). [10] j f nagle, r zhang, s tristram-nagle, w s sun, h i petrache, r m suter, x-ray structure determination of fully hydrated lα phase dipalmitoylphosphatidylcholine bilayers, biophys. j. 70, 1419 (1996). [11] r p rand, v a parsegian, hydration forces between phospholipid bilayers, biochim. biophys. acta 988, 351 (1989). [12] j seelig, deuterium magnetic resonance: theory and application to lipid membranes, q. rev. biophys. 10, 353 (1977). [13] j seelig, a seelig, lipid conformation in model membranes and biological systems, q. rev. biophys. 13, 19 (1980).
040005-8 papers in physics, vol. 4, art. 040005 (2012) / r d porasso et al. [14] w j sun, r m suter, m a knewtson, c r worthington, s tristram-nagle, r zhang, j f nagle, order and disorder in fully hydrated unoriented bilayers of gel phase dipalmitoylphosphatidylcholine, phys. rev. e. 49, 4665 (1994). [15] h akutsu, j seelig, interaction of metal ions with phosphatidylcholine bilayer membranes, biochemistry 20, 7366 (1981). [16] m g ganesan, d l schwinke, n weiner, effect of ca2+ on thermotropic properties of saturated phosphatidylcholine liposomes, biochim. biophys. acta 686, 245 (1982). [17] l herbette, c a napolitano, r v mcdaniel, direct determination of the calcium profile structure for dipalmitoyllecithin multilayers using neutron diffraction, biophys. j. 46, 677 (1984). [18] d huster, k arnold, k gawrisch, strength of ca2+ binding to retinal lipid membrane: consequences for lipid organization, biophys. j. 78, 3011 (2000). [19] y inoko, t yamaguchi, k furuya, t mitsui, effects of cations on dipalmitoyl phosphatidylcholine/cholesterol/water systems, biochim. biophys. acta 413, 24 (1975). [20] r lehrmann, j j seelig, adsorption of ca2+ and la3+ to bilayer membranes: measurement of the adsorption enthalpy and binding constant with titration calorimetry, biochim. biophys. acta 1189, 89 (1994). [21] l j lis, w t lis, v a parsegian, r p rand, adsorption of divalent cations to a variety of phosphatidylcholine bilayers, biochemistry 20, 1771 (1981). [22] l j lis, v a parsegian, r p rand, binding of divalent cations to dipalmitoylphosphatidylcholine bilayers and its effect on bilayer interaction, biochemistry 20, 1761 (1981). [23] t shibata, pulse nmr study of the interaction of calcium ion with dipalmitoylphosphatidylcholine lamellae, chem. phys. lipids. 53, 47 (1990). [24] s a tatulian, v i gordeliy, a e sokolova, a g syrykh, a neutron diffraction study of the influence of ions on phospholipid membrane interactions, biochim. biophys. acta 1070, 143 (1991). [25] r a böckmann, h grubmüller, multistep binding of divalent cations to phospholipid bilayers: a molecular dynamics study, angewandte chemie 43, 1021 (2004). [26] j faraudo, a travesset, phosphatidic acid domains in membranes: effect of divalent counterions, biophys. j. 92, 2806 (2007). [27] a a gurtovenko, asymmetry of lipid bilayers induced by monovalent salt: atomistic molecular-dynamics study, j. chem. phys. 122, 244902 (2005). [28] p mukhopadhyay, l monticelli, d p tieleman, molecular dynamics simulation of a palmitoyloleoyl phosphatidylserine bilayer with na+ counterions and nacl, biophys. j. 86, 1601 (2004). [29] s a pandit, d bostick, m l berkowitz, molecular dynamics simulation of a dipalmitoylphosphatidylcholine bilayer with nacl, biophys. j. 84, 3743 (2003). [30] u r pedersen, c laidy, p westh, g h peters, the effect of calcium on the properties of charged phospholipid bilayers, biochim. biophys. acta 1758, 573 (2006). [31] j n sachs, h nanda, h i petrache, t b woolf, changes in phosphatidylcholine headgroup tilt and water order induced by monovalent salts: molecular dynamics simulations, biophys. j. 86, 3772 (2004). [32] k shinoda, w shinoda, m mikami, molecular dynamics simulation of an archeal lipid bilayer whit sodium chloride, phys. chem. chem. phys. 9, 643 (2007). [33] n l yamada, h seto, t takeda, m nagao, y kawabata, k inoue, saxs, sans and nse studies on unbound state in dppc/water/cacl2 system, j. phys. soc. jpn. 74, 2853 (2005). 040005-9 papers in physics, vol. 4, art. 040005 (2012) / r d porasso et al. 
[34] d frenkel, b smit, understanding molecular simulations, academic press, new york (2002). [35] j j lópez cascales, j garćıa de la torre, s j marrink, h j c berendsen, molecular dynamics simulation of a charged biological membrane, j. chem. phys. 104, 2713 (1996). [36] w f van gunsteren, h j c berendsen, computer simulations of molecular dynamics: methodology, applications and perspectives in chemistry, angew. chem int. ed. engl. 29, 992 (1990). [37] r a böckmann, a hac, t heimburg, h grubmüller, effect of sodium chloride on a lipid bilayer, biophys. j. 85, 1647 (2003). [38] a a gurtovenko, i vattulainen, effect of nacl and kcl on phosphatidylcholine and phosphatidylethanolamine lipid membranes: insight from atomic-scale simulations for understanding salt-induced effects in the plasma membrane, j. phys. chem. b. 112, 1953 (2008). [39] h j c berendsen, j r grigera, t p straatsma, the missing term in effective pair potentials, j. phys. chem. 91, 6269 (1987). [40] h j c berendsen, d van der spoel, r van drunen, a message-passing parallel molecular dynamics implementation, comp. phys. comm. 91, 43 (1995). [41] e lindahl, b hess, d van der spoel, gromacs 3.0: a package for molecular simulation and trajectory analysis, j. mol. mod. 7, 306 (2001). [42] t darden, d york, l pedersen, particle mesh ewald: an n.log(n) method for ewald sums in large systems, j. chem. phys. 98, 10089 (1993). [43] u essmann, l perea, m l berkowitz, t darden, h lee, l g pedersen, a smooth particle mesh ewald method, j. chem. phys. 103, 8577 (1995). [44] l r de young, k a dill, solute partitioning into lipid bilayer-membranes, biochemistry 27, 5281 (1988). [45] a seelig, j seelig, the dynamic structure of fatty acyl chains in a phospholipid bilayer measured by deuterium magnetic resonance, biochemistry 13, 4839 (1974). [46] g cevc, a watts, d marsh, titration of the phase transition of phosphatidilserine bilayer membranes. effect of ph, surface electrostatics, ion binding and head-group hydration, biochemistry 20, 4955 (1981). [47] h hauser, f paltauf, g g shipley, structure and thermotropic behavior of phosphatidylserine bilayer membranes, biochemistry 21, 1061 (1982). [48] j f nagle, s tristam-nagle, structure of lipid bilayers, biochim. biophy. acta 1469, 159 (2000). [49] b a lewis, d m engelman, lipid bilayer thickness varies linearly with acyl chain length in fluid phosphatidylcholine vesicles, j. mol. biol. 166, 211 (1983). [50] r j pace, s i cham, molecular motions in lipid bilayer. i. statistical mechanical model of acyl chains motion, j. chem. phys. 76, 4217 (1982). [51] r l thurmond, s w dodd, m f brown, molecular areas of phospholipids as determined by 2h nmr spectroscopy, biphys. j. 59, 108 (1991). [52] r d porasso, j j lópez cascales, study of the effect of na+ and ca2+ ion concentration on the structure of an asymmetric dppc/dpps + dpps lipid bilayer by molecular dynamics simulation, coll. and surf. b. bioint. 73, 42 (2009). 040005-10 papers in physics, vol. 6, art. 060003 (2014) received: 11 march 2014, accepted: 1 august 2014 edited by: g. martinez mekler reviewed by: f. bagnoli, dipartimento di fisica ed astronomia, universita degli studi di firenze, italy licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060003 www.papersinphysics.org issn 1852-4249 critical phenomena in the spreading of opinion consensus and disagreement a. chacoma,1 d. h. 
zanette1, 2∗
∗e-mail: zanette@cab.cnea.gov.ar
1 instituto balseiro and centro atómico bariloche, 8400 san carlos de bariloche, río negro, argentina.
2 consejo nacional de investigaciones científicas y técnicas, argentina.
we consider a class of models of opinion formation where the dissemination of individual opinions occurs through the spreading of local consensus and disagreement. we study the emergence of full collective consensus or maximal disagreement in one- and two-dimensional arrays. in both cases, the probability of reaching full consensus exhibits well-defined scaling properties as a function of the system size. two-dimensional systems, in particular, possess nontrivial exponents and critical points. the dynamical rules of our models, which emphasize the interaction between small groups of agents, should be considered as complementary to the imitation mechanisms of traditional opinion dynamics.
i. introduction
the remarkable regularities observed in many human social phenomena —which, in spite of the disparate behavior of individual human beings, emerge as a consequence of their interactions— have long attracted the attention of physicists and applied mathematicians. collective manifestations of human behavior have been mathematically modeled in a variety of socioeconomic processes, such as opinion formation, decision making, resource allocation, cultural and linguistic evolution, among many others, often using the tools provided by statistical physics [1]. the stylized nature of these models emphasizes the identification of the generic mechanisms at work in human interactions, as well as the detection of broadly significant features in their macroscopic outcomes. they provide the key to a deep insight into the common elements that underlie those processes. models of opinion formation constitute a central paradigm in the mathematical description of social processes from the viewpoint of statistical physics. starting in the seventies and eighties [2–5], much work —which we cannot aim at inventorying here, but which has been comprehensively reviewed in recent literature [1]— has exploited the formal resemblance between opinion spreading and spin dynamics in order to apply well-developed statistical techniques to the analysis of such models. the key mechanism driving most agent-based models of opinion formation is imitation. for instance, in the voter model —to which we refer several times in the present paper— the basic interaction event consists in an agent copying the opinion of another agent chosen at random from a specified neighborhood. at any given time, the opinion of each agent adopts one of two values, typically denoted as ±1. the voter model can be exactly solved for populations of agents distributed over regular (hyper)cubic arrays in any dimension [6]. for infinitely large populations, it is characterized by the conservation of the average opinion. in one dimension, a finite population always reaches an absorbing state of full collective consensus, all agents sharing the same opinion. the probability of final consensus on either opinion coincides with the initial fraction of agents with that opinion, and the time needed to reach the absorbing state is of the order of the population size squared [1].
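as a point of reference for the models introduced below, a minimal simulation sketch of the one-dimensional voter model summarized above (my own illustration, not code from the paper); it checks that the probability of +1 consensus tracks n+(0) and that consensus times are of order n^2:

```python
import random

def voter_1d(n=100, n_plus0=0.5, rng=random.Random(0)):
    """One realization of the 1D voter model; returns (final opinion, steps).

    At each step a random agent copies the opinion of a random nearest
    neighbour (periodic boundaries), until full consensus is reached.
    """
    s = [1] * int(n_plus0 * n) + [-1] * (n - int(n_plus0 * n))
    rng.shuffle(s)
    t = 0
    while abs(sum(s)) != n:                 # not yet in full consensus
        i = rng.randrange(n)
        j = (i + rng.choice((-1, 1))) % n   # random nearest neighbour
        s[i] = s[j]
        t += 1
    return s[0], t

# the fraction of runs ending in +1 consensus approaches n_plus0,
# and t / n**2 stays of order one as n grows
runs = [voter_1d(n=50, n_plus0=0.3, rng=random.Random(k)) for k in range(200)]
print(sum(1 for op, _ in runs if op == 1) / len(runs))
```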
in this paper, we present an introductory analysis of a class of models where opinion dynamics is driven by the spreading of consensus and disagreement, rather than by the dissemination of individual opinions. the basic concept behind these models is that agreement of individual opinions in a localized portion of the population may promote the emergence of consensus in the neighborhood while, in contrast, local disagreement may inhibit the growth of, or even decrease, the degree of consensus in the surrounding region. in real social systems, the mechanism of consensus and disagreement spreading should be complementary to the direct transmission of opinions between individual agents. in our models, however, we disregard the latter to focus on the dynamical effects of the former. since the degree of consensus can only be defined for two or more agents, the spreading of consensus and disagreement engages groups of agents rather than individuals. such groups are, thus, the elementary entities involved in the social interactions [7–11]. we stress that several other social phenomena — related, notably, to decision making [10] and resource allocation [12]— are also based on group interactions that cannot be reduced to two-agent events. in the class of models analyzed here, each interaction event is conceived to occur between two groups: an active group g and a reference group g′. as a result of the interaction, the agents in g change their individual opinions in such a way that the level of consensus in g approaches that of g′. this generic mechanism extends dynamical rules where the opinion of each single agent changes in response to the collective state of a reference group [1, 8, 13, 14]. the size and internal structure of the interacting groups, as well as the precise way in which opinions are modified in the active group with respect to the reference group, defines each model in this class. for the sake of concreteness, we limit the analysis to systems where, as in the voter model, individual opinions can adopt two values (±1). in the next section, we analyze the case where both the active group and the reference group are formed by two agents, and the population is structured as a one-dimensional array. in this case, the system admits stationary absorbing states of full consensus and maximal disagreement, with simple scaling laws with the population size. in section iii., we study a two-dimensional version of the same kind of model with larger groups, where nontrivial critical phenomena —not present in the one-dimensional case— emerge. results and perspectives are summarized in the final section. ii. two-agent groups on onedimensional arrays we begin by considering the simple situation where each of the two groups involved in each interaction event is formed by just two agents. the situation within each group, thus, is one of either full consensus (when the two agents bear the same opinion, either +1 or −1) or full disagreement (when their opinions are different). we take a population where agents are distributed on a one-dimensional array, and consecutively labeled from 1 to n. periodic boundary conditions are applied at the ends. at each time step, we choose four contiguous agents, say, i−1 to i+2. the central pair i, i+1 acts as the reference group g′. if they are in disagreement, the agents i−1 and i+2 respectively adopt the opinions opposite to those of i and i+ 1 with probability pd, while with the complementary probability 1 − pd nothing happens. 
if, on the other hand, i and i + 1 agree with each other, i − 1 and i + 2 copy the common opinion in g′ with probability pc, while with probability 1 − pc nothing happens. in this way, both consensus and disagreement spread from g′ outwards, to the left and right. the probabilities pc and pd control the relative frequency with which consensus and disagreement are effectively transmitted. the left panel of fig. 1 illustrates the states of the four consecutive agents in the two possible outcomes of the interaction (up to opinion inversions). it is not difficult to realize that, for pd = pc = 1, our one-dimensional array is equivalent to two intercalated subpopulations —respectively occupying even and odd sites— each of them evolving according to the voter model.
figure 1: left: the two possible outcomes of the interaction, up to opinion inversions, for four consecutive agents along the one-dimensional array. the active and the reference groups, g and g′, are respectively formed by the outermost and innermost agents. right: time evolution of a 200-agent array with n+(0) = 0.5 and pd = pc = 1. black and white dots correspond, respectively, to opinions +1 and −1. at time t = 1534, an absorbing state of maximal disagreement is reached.
the dynamical rules are reduced in this case to binary interactions between agents. in fact, whatever the opinions in group g′ at each interaction event, agents i − 1 and i + 2 respectively copy the opinions of i + 1 and i. now, since the voter model always leads a finite population to an absorbing state of full consensus, the final state of our system can be one of full consensus on either opinion, or a state of maximal disagreement where opposite opinions alternate over the sites of the one-dimensional array. in the latter, the two neighbors of each agent with opinion +1 have opinion −1, and vice versa. the right panel of fig. 1 shows the evolution of a 200-agent array for n+(0) = 0.5 and pd = pc = 1, black and white dots respectively corresponding to opinions +1 and −1. at any given time, the population is divided into well-defined domains of either consensus in one of the opinions or disagreement. note that the domain boundaries show the typical diffusive motion found in stochastic coarsening processes [1, 15]. taking into account that, in the voter model, the probability of ending with full consensus on opinion +1 is given by the initial fraction of agents with that opinion, n+(0), and assuming that the initial distribution of opinions is homogeneous over the array, the probability that our system ends in a state of full consensus on either opinion is pcons = n+(0)^2 + n−(0)^2 = 1 − 2n+(0) + 2n+(0)^2. note that this coincides with the probability that, in the initial state, any two contiguous agents are in consensus.
figure 2: numerical results for consensus and disagreement spreading on a one-dimensional array with pd = pc = 1, obtained from 10^3 realizations for each parameter set (see text for details). upper panel: probability of reaching full consensus, pcons, as a function of the initial fraction of agents with opinion +1, n+(0), for four values of the population size n. lower panel: total time t needed to reach the final absorbing state, normalized by the squared population size n^2. since both pcons and n^-2 t are symmetric with respect to n+(0) = 1/2, only the lower half of the horizontal axis is shown.
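a minimal sketch of a single update event of the one-dimensional rule defined above (an illustrative implementation, not the authors' code; helper names are mine):

```python
import random

def step_1d(s, pc, pd, rng):
    """One interaction event of the 1D consensus/disagreement spreading rule.

    s: list of opinions (+1/-1) with periodic boundaries. The reference group
    g' is the pair (i, i+1); the active agents are i-1 and i+2.
    """
    n = len(s)
    i = rng.randrange(n)
    a, b = s[i], s[(i + 1) % n]
    if a == b and rng.random() < pc:          # consensus in g': spread it outwards
        s[(i - 1) % n] = a
        s[(i + 2) % n] = b
    elif a != b and rng.random() < pd:        # disagreement in g': spread it outwards
        s[(i - 1) % n] = -a
        s[(i + 2) % n] = -b
    return s

def absorbing(s):
    """True for full consensus or maximal (staggered) disagreement."""
    n = len(s)
    return all(x == s[0] for x in s) or all(s[k] != s[(k + 1) % n] for k in range(n))

# iterate step_1d on a random initial array until absorbing(s) becomes True
```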
moreover, we know that the time needed to reach an absorbing state in the one-dimensional voter model is proportional to n^2, a result that should also hold in our case. the upper panel of fig. 2 shows numerical results for the probability of final full consensus pcons, determined as the fraction of realizations that ended in full consensus out of 10^3 runs, as a function of n+(0) and for several population sizes n. the curve is the analytic prediction given above. the result is analogous to the probability of final consensus found in sznajd-type models [13]. the lower panel shows the total time t needed to reach the final absorbing state (of either consensus or disagreement), averaged over 10^3 realizations and normalized by n^2. as expected, both pcons and n^-2 t are independent of the population size. when pd ≠ pc, the two intercalated subpopulations cannot be considered independent of each other any more. if pd < pc, for instance, an opinion prevailing in one of the subpopulations will invade the other subpopulation faster than the opposite opinion, thus favoring the establishment of collective consensus. to analyze this asymmetric situation, we first fix pc = 1 and let pd vary in (0, 1), so that the spreading of consensus is more probable than that of disagreement. the main plot in fig. 3 shows numerical results for pcons, measured as explained above, as a function of pd and for four values of n. in all the realizations, n+(0) = 0.5, and the two opinions are homogeneously distributed over the population. as pd decreases below 1, the probability of reaching full consensus grows rapidly, approaching pcons = 1. as n grows, moreover, the change in pcons is more abrupt. fitting a sigmoidal function to the data of pcons vs. pd near pd = 1 makes it possible to assign a width to the range where pcons changes between 1 and 0.5. the insert of fig. 3 shows this width as a function of the system size n in a log-log plot. the slope of the linear fitting is −1.00 ± 0.02.
figure 3: probability of reaching full consensus, pcons, as a function of the probability pd, with pc = 1 and for four values of the system size n. results were obtained by averaging over 10^3 realizations for each parameter set. insert: width of the variation range of pcons as a function of n. the straight line has slope −1.
therefore, the width is inversely proportional to n. the facts that pcons = 0.5 for pd = 1 and for all n, and that the width of the range where pcons changes decreases as n^-1, make it possible to conjecture the existence of a function φ(u), with φ(0) = 0.5 and φ(u) → 1 for large u, such that pcons = φ[n(1 − pd)]. to test this hypothesis, we have plotted our numerical data for pcons against n(1 − pd) in fig. 4. the results are those in the upper half of the plot (“varying pd”). the collapse of the data for different n on the same curve confirms the conjecture.
figure 4: probability of reaching full consensus, pcons, as a function of n(1 − pd) when varying pd with pc = 1, and as a function of n(1 − pc) when varying pc with pd = 1.
analogous results were obtained when fixing pd = 1 and varying pc. now, pcons drops to 0 in a narrow interval for pc just below 1, indicating the prevalence of disagreement. again, the width of the interval is proportional to n^-1. the results in the lower half of fig. 4 (“varying pc”) illustrate the collapse of the corresponding values of pcons when plotted against n(1 − pc). in our numerical realizations with pd ≠ pc, we have also recorded the average time t needed to reach the final absorbing state. figure 5 shows results for n^-2 t in the case where pc = 1 and pd changes (cf. lower panel of fig. 2). in contrast with the case pd = pc = 1, rescaling the time t with n^2 leaves a remnant discrepancy between results for different population sizes n. specifically, for pd < 1, t grows faster than n^2. moreover, t is nonmonotonic as a function of pd, exhibiting a minimum which shifts towards pd = 1 as n grows. the same dependence on n and pc is observed when we fix pd = 1 and let pc vary.
figure 5: total time t needed to reach the final absorbing state, normalized by the squared population size n^2, as a function of the probability pd (pc = 1). bézier curves have been plotted as a guide to the eye.
summarizing our results for a one-dimensional population with two-agent groups, we can say that the possibility that both consensus and disagreement spread over the system makes it possible to find absorbing collective states of either full consensus, with all the agents having the same opinion, or maximal disagreement, where opposite opinions alternate between consecutive neighbor agents. for large populations, the relative prevalence of collective consensus and disagreement is controlled by how the probabilities pd and pc compare with each other. our results suggest that, in the limit n → ∞, the condition pc > pd univocally leads to full consensus, and vice versa. for smaller sizes, however, the system can approach full consensus even when pd > pc, and vice versa —presumably due to finite-size fluctuations.
iii. larger groups on two-dimensional arrays
a two-dimensional version of the above model, where agents occupy the n = l × l sites of a regular square lattice with periodic boundary conditions, can be defined as follows. the reference group g′ at each interaction event is a randomly chosen 2×2-agent block. the corresponding active group g is formed by the eight nearest neighbors of the agents in g′ which are not in turn members of the reference group. the active group, thus, surrounds g′. of the sixteen possible opinion configurations of the reference group, two correspond to full consensus —with the four agents sharing the same opinion— and six correspond to maximal disagreement —with two agents holding each opinion. the remaining eight configurations correspond to partial consensus, with only one agent disagreeing with the other three. the dynamical rules are the following: (1) if g′ is in full consensus, all the agents in g copy the common opinion in g′; (2) if g′ is in maximal disagreement, each agent in g adopts the opinion opposite to that of its nearest neighbor in g′; (3) otherwise, nothing happens. hence, both consensus and disagreement spread outwards from the reference group. the probabilities pd and pc for the spreading of disagreement and consensus are introduced exactly as above.
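a minimal sketch of one interaction event of this two-dimensional rule (an illustrative implementation with my own data layout, not the authors' code):

```python
import random

def step_2d(s, l, pc, pd, rng):
    """One interaction event of the 2D rule on an l x l periodic lattice.

    s: dict mapping (row, col) -> opinion (+1/-1). The reference group g' is a
    random 2x2 block; the active group g is the eight sites surrounding it.
    """
    r, c = rng.randrange(l), rng.randrange(l)
    block = [((r + dr) % l, (c + dc) % l) for dr in (0, 1) for dc in (0, 1)]
    total = sum(s[p] for p in block)
    if abs(total) == 4 and rng.random() < pc:      # full consensus in g'
        common = s[block[0]]
        for (br, bc) in block:
            for nr, nc in ((br - 1, bc), (br + 1, bc), (br, bc - 1), (br, bc + 1)):
                q = (nr % l, nc % l)
                if q not in block:                 # active agents copy the common opinion
                    s[q] = common
    elif total == 0 and rng.random() < pd:         # maximal disagreement in g'
        for (br, bc) in block:
            for nr, nc in ((br - 1, bc), (br + 1, bc), (br, bc - 1), (br, bc + 1)):
                q = (nr % l, nc % l)
                if q not in block:                 # opposite to the nearest member of g'
                    s[q] = -s[(br, bc)]
    # partial consensus (|total| == 2): nothing happens
    return s

# e.g. s = {(i, j): rng.choice((-1, 1)) for i in range(l) for j in range(l)}
```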
the left part of fig. 6 shows, up to rotations and opinion inversions, the three possible outcomes of a single interaction event. the states of full collective consensus —with all the agents in the population having the same opinion— and of maximal collective disagreement —with the two opinions alternating site by site along each direction over the lattice— are absorbing states, in correspondence with the one-dimensional case. however, for pd = pc = 1, the system can no longer be reduced to a collection of sublattices governed by the voter model. the definition of g and g′ now establishes correlations between the opinion changes in the active group at each interaction event. moreover, some opinion configurations in the reference group induce evolution in the active group, while others do not. figure 6 shows, to its right, four snapshots of a 120×120-agent population, along a realization starting with n+(0) = 0.35 and pd = pc = 1. note the formation of consensus clusters at rather early stages, and the final prevalence of disagreement. the line boundaries between disagreement regions are also worth noticing.
figure 6: left: the three possible outcomes of the interaction, up to ±90◦ rotations and opinion inversions, on the two-dimensional lattice. the active and the reference groups, g and g′, are respectively formed by the outermost and innermost agents. right: four snapshots of a population with l = 120, for n+(0) = 0.35 and pd = pc = 1, including the initial condition and two intermediate states. at time t = 3207, an absorbing state of maximal disagreement has been reached. black and white dots correspond, respectively, to opinions +1 and −1.
following the same lines as for the one-dimensional array, we first study the probability pcons of reaching full collective consensus as a function of the initial fraction of agents with opinion +1, n+(0), in the case pd = pc = 1. opinions are homogeneously distributed all over the population. for very small n+(0), as expected, we find pcons ≈ 1. however, in sharp contrast with the one-dimensional case (see fig. 3), pcons remains close to its maximal value until n+(0) ≈ 0.35, where it drops abruptly to pcons ≈ 0. the width of the transition zone decreases as a nontrivial power of the system size, ∼ l^-(0.83±0.04), as illustrated in the insert of fig. 7. our best estimate for the critical value of n+(0) at which pcons drops is n+^crit = 0.353 ± 0.001. the main plot in the figure shows the collapse of numerical measurements of pcons as a function of n+(0) for different sizes l, averaged over 100 realizations, when plotted against the rescaled shifted variable l^0.83 [n+(0) − 0.353]. these results suggest that, for very large populations, the probability of reaching full consensus jumps discontinuously from pcons = 1 to 0 at n+(0) = n+^crit.
figure 7: numerical results for the probability of reaching full consensus, pcons, on a two-dimensional lattice with pd = pc = 1, obtained from 100 realizations for each parameter set. collapse for several system sizes l is obtained by plotting pcons against l^0.83 [n+(0) − 0.353]. insert: scaling of the width of the transition zone of pcons, determined from fitting a sigmoidal function, as a function of the size l. the straight line has slope −0.83.
compare this with the smooth, size-independent behavior of the one-dimensional case. note also that n+^crit is close to, but does not coincide with, n+(0) = 1/3; at this latter value, in the initial condition with homogeneously distributed opinions, the probability of finding a 2×2-agent block in full consensus becomes lower than that of maximal disagreement as n+(0) grows. in the above simulations, we have also measured the average total time t needed to reach the final absorbing state. results are shown in fig. 8. again in contrast with the one-dimensional case, t exhibits a remarkable change in its scaling with the system size as n+(0) overcomes the critical value n+^crit.
figure 8: total time t needed to reach the final absorbing state in a two-dimensional lattice, as a function of n+(0), for different sizes l.
going now to the dependence of pcons on the probability of disagreement spreading pd —with pc = 1 and n+(0) = 0.5— it qualitatively mirrors that of the one-dimensional case, shown in fig. 3. namely, as pd decreases from 1, pcons grows from 0 to 1 in an interval whose width decreases with the population size. in the two-dimensional system, however, the transition takes place at a critical probability pd^crit that can be clearly discerned from pd = 1. our estimate is pd^crit = 0.984 ± 0.002. moreover, the scaling of the transition width with the population size exhibits a nontrivial exponent, decreasing as l^-(0.93±0.05). the collapse of the rescaled numerical results for various sizes, obtained from averages over 100 realizations, is shown in fig. 9, where we plot pcons as a function of l^0.93 (0.984 − pd) (cf. fig. 4). the insert displays the power-law dependence of the width on the size l. analogous results are obtained if the probability of consensus spreading pc is varied, with pd = 1.
figure 9: collapse of numerical results for the probability of reaching full consensus, pcons, on a two-dimensional lattice with pc = 1 and n+(0) = 0.5, for several system sizes l when plotted against l^0.93 (0.984 − pd). insert: scaling of the width of the transition zone of pcons as a function of the size l. the straight line has slope −0.93.
finally, we have found that the transition in pcons as a function of the disagreement probability pd shows a dependence on the initial fraction of agents with opinion +1. to characterize this effect in a way that highlights the relative prevalence of disagreement and consensus, we have measured the value of pd at which the probability of getting full collective consensus reaches pcons = 0.5, as a function of n+(0). the parameter plane (n+(0), pd), thus, becomes divided into regions where a final state of full consensus is more probable than that of maximal disagreement, and vice versa. results for a 120 × 120-agent population are presented in fig. 10. in summary, while the spreading of consensus and disagreement on a two-dimensional lattice bears superficial qualitative similarity with the one-dimensional case, the probability that the population reaches full collective consensus in two dimensions
exhibits a quite different dependence on the system size, on the initial conditions, and on the spreading probabilities. in particular, our results reveal the existence of critical phenomena involving scaling laws with nontrivial exponents.
figure 10: zones of relative prevalence of full consensus and maximal disagreement in a two-dimensional lattice with l = 120, plotted on the parameter plane (n+(0), pd). symbols stand for numerical results, and the curve serves as a guide to the eye.
iv. conclusion
in this paper, we have considered the emergence of collective opinion in a population of interacting agents where, instead of imitation between individual agents, opinions are transmitted through the spreading of local consensus and disagreement toward their neighborhoods. the basic interacting units in this mechanism are not individual agents but rather small groups of agents, which mutually compare their internal degrees of consensus and modify their opinions accordingly. in this sense, it extends the basic mechanism underlying such models as the majority-rule and sznajd-like dynamics [1, 8, 13], where the opinion of each individual agent changes in response to the collective state of a reference group. it is expected that, in real social systems, the dissemination of individual opinions through agent-to-agent imitation on one side, and the spreading of consensus and disagreement by group interaction on the other, are complementary mechanisms simultaneously shaping the overall opinion distribution. here, in order to gain insight into the specific effects of the second class, we have focused on models solely driven by the spreading of consensus and disagreement. the combined effect of the two mechanisms is a problem open to future work. our numerical simulations concentrated on two-opinion models evolving on one- and two-dimensional arrays [14]. in both cases, absorbing states with all the population bearing the same opinion (full consensus) and with half of the population holding each opinion (maximal disagreement) are possible final states for the system. maximal disagreement states are characterized by alternating opinions between neighbor sites along the arrays. a relevant quantity to characterize the behavior is the probability of reaching full consensus, as a function of the initial condition —i.e., the initial fraction of the population with each opinion— and of the relative probabilities of consensus and disagreement spreading. the total time needed to reach the final absorbing state, averaged over realizations, has also been measured as a characterization of the dynamics. we have found that, in several cases, these quantities display critical phenomena when the control parameters are changed, with power-law scaling as a function of the system size, pointing to the presence of discontinuities in the limit of infinitely large populations. it is interesting to remark that the scaling laws are rather simple for one-dimensional arrays, but involve nontrivial exponents and critical points in the case of two-dimensional systems. within the same one- and two-dimensional models analyzed here, an aspect that deserves further exploration is the dynamics and mutual interaction of the opinion domains that develop from the first stages of the evolution (figs. 1 and 6).
however, the most interesting extension of the present analysis should progress along the direction of considering more complex social structures. the interplay between the dynamical rules of consensus and disagreement spreading and the topology of the interaction pattern underlying the population might bring about the emergence of new kinds of collective self-organization phenomena. acknowledgements we acknowledge enlightening discussions with eduardo jagla. financial support from anpcyt (pict2011-545) and sectyp uncuyo (project 06/c403), argentina, is gratefully acknowledged. [1] c castellano, s fortunato, v loreto, statistical physics of social dynamics, rev. mod. phys. 81, 591 (2009). [2] w weidlich, the statistical description of polarization phenomena in society, br. j. math. stat. psychol. 24, 251 (1971). [3] r holley, t liggett, ergodic theorems for weakly interacting infinite systems and the voter model, ann. probab. 3, 643 (1975). [4] s galam, y gefen, y shapir, a mean behavior model for the process of strike, j. math. sociol. 9, 1 (1982). 060003-8 papers in physics, vol. 6, art. 060003 (2014) / a. chacoma et al. [5] s galam, majority rule, hierarchical structures and democratic totalitarism: a statistical approach, j. math. psychol. 30, 426 (1986). [6] s redner, a guide to first-passage processes, cambridge university press, cambridge (2001). [7] k starkey, ch barnatt, s tempest, beyond networks and hierarchies: latent organization in the uk television industry, org. sci. 11, 299 (2000). [8] p krapivsky, s redner, dynamics of majority rule in two-state interacting spin systems, phys. rev. lett. 90, 238701 (2003). [9] j johnson, multidimensional events in multilevel systems, in: the dynamics of complex urban systems, eds. s albeverio et al., pag. 311, physica-verlag, heidelberg (2008). [10] d h zanette, beyond networks: opinion formation in triplet-based populations, phil. trans. r. soc. a 367, 3311 (2009). [11] d h zanette, a note on the consensus time of mean-field majority-rule dynamics, pap. phys. 1, 010002 (2009). [12] d g hernández, d h zanette, evolutionary dynamics of resource allocation in the colonel blotto game, j. stat. phys. 151, 623 (2013). [13] k sznajd-weron, j sznajd, opinion evolution in closed community, int. j. mod. phys. c 11, 1157 (2000). [14] d stauffer, a o sousa, s m de oliveira, generalization to square lattice of sznajd sociophysics model, int. j. mod. phys. c 11, 1239 (2000). [15] a bray, theory of phase-ordering kinetics, adv. phys. 43, 357 (1994). 060003-9 papers in physics, vol. 6, art. 060014 (2014) received: 21 november 2014, accepted: 5 december 2014 edited by: l. a. pugnaloni reviewed by: k. to, institute of physics, academia sinica, taipei, taiwan. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060014 www.papersinphysics.org issn 1852-4249 invited review: clogging of granular materials in bottlenecks iker zuriguel1∗ during the past decades, notable improvements have been achieved in the understanding of static and dynamic properties of granular materials, giving rise to appealing new concepts like jamming, force chains, non-local rheology or the inertial number. the ‘saltcellar’ can be seen as a canonical example of the characteristic features displayed by granular materials: an apparently smooth flow is interrupted by the formation of a mesoscopic structure (arch) above the outlet that causes a quick dissipation of all the kinetic energy within the system. 
in this manuscript, i will give an overview of this field, paying special attention to the features of the statistical distributions appearing in the clogging and unclogging processes. these distributions are essential to understand the problem and allow the subsequent study of topics such as the influence of particle shape, the structure of the clogging arches and the possible existence of a critical outlet size above which the outpouring will never stop. i shall finally offer some hints about general ideas that can be explored in the next few years.
∗e-mail: iker@unav.es
1 departamento de física, facultad de ciencias, universidad de navarra, 31080 pamplona, spain.
i. clogging in bottlenecks, a multiscale and multidisciplinary problem.
when a system of discrete bodies passes through a constriction, the interactions among the particles might lead to the development of clogging structures that, eventually, completely arrest the flow. this phenomenon is observed in a wide range of systems, with relevant consequences. clogging of granular materials in silos may force a production line to be stopped. much in the same way, clogging of suspended hydrated particles is a major issue concerning oil and gas transport through pipelines [1]. at a smaller spatial scale, clogging leads to intermittent flow when a dense suspension of colloidal particles passes through a constriction in a microchannel [2, 3]. a straightforward application of the understanding that could be gained concerning clogging in suspensions is found in ecological engineering. nowadays, an alternative that is becoming widely used for removing pollutants from wastewater is subsurface flow treatment. the most important drawback of this technique is its unpredictable lifetime, mostly limited by clogs that obstruct the pores [4]. at an even smaller scale, intermittent flows are observed when electrons on a liquid helium surface pass through nanoconstrictions [5]. finally, clogging can also develop when crowds in panic are evacuated through emergency exits that cannot absorb the amount of people approaching the doors [6–8]. all these examples of clogging take place in a broad range of systems where widely different forces are at play: those concerning the interactions among particles as well as those related to the interaction between the particles and the surrounding media. for the case of non-cohesive inert grains, gravity and contact forces are the only relevant ones. for particle suspensions, however, the hydrodynamics of the flowing fluid as well as capillary effects must be taken into account. the dynamics of crowds through bottlenecks is even more difficult to approach theoretically, yet a social force model has been shown to adequately reproduce the observed behavior in some circumstances. in the last decade, the number of works published about clogging has experienced a sudden increase, which constitutes a gauge of the interest and relevance of this phenomenon. despite this, the physical mechanisms behind clogging are still not well understood. several issues contribute to this, but probably the most important one concerns the local character of clogging when compared, for example, with the global nature of jamming [9], a scenario that has attracted much more attention over the last years. this ‘local character of clogging’ seems to complicate the definition of global extensive variables within the system which could be used to characterize the phenomenology.
in this manuscript i shall give a brief summary of the major advances achieved in the understanding of clogging of non-cohesive inert grains at a bottleneck. i will start by presenting results on the static silo, then i will move to the case of a vibrated silo, and finally i will present some conclusions and mention several questions that —from my point of view— should be addressed in the forthcoming years.
ii. clogging in hoppers or silos
the development of clogs that obstruct the flow in the discharge of bins or silos by gravity is a problem that has always worried the engineering community [10–13]. the goal in all these works was to find a ratio of the outlet to particle size that could guarantee the absence of clogging. for non-cohesive materials, it was known that this value is around 5 although, depending on the particle properties, it could increase up to 10. nevertheless, little was understood about the mechanism that triggers a clog and the physical variables that control its development. indeed, it was not until the beginning of this century that the scientific community started to carefully investigate this problem [14].
figure 1: histogram nr(s) for the number of grains s that flow between two successive clogs. data correspond to a circular orifice of 6 mm diameter and glass beads with a diameter of 2 mm. more than 4000 events were recorded. in the inset, a semilogarithmic plot with the solid line indicating an exponential fit.
i. avalanche size distribution
one of the first questions that was tackled concerns the statistics of the avalanche sizes (the usual measured magnitude is the number of grains that flow out of the silo from the breakage of a clogging arch until the development of a new one). the key feature of avalanches, which is now well accepted, is that the distribution of their sizes follows an exponential decay (see fig. 1), a result reported for the first time in [15]. this trend was explained in [16] by assuming that the probability of clogging is constant during the whole avalanche, a behavior observed for all the outlet sizes explored. afterward, a many-particle-inspired theory was proposed based on a continuity equation in polar coordinates [17]. although friction, force networks or inelastic collapse of the particles were not taken into account, the authors reproduced the exponential decay of the avalanche (or burst) sizes. the intermittent flows reported were explained in terms of a random alternation between particle propagation and gap propagation. very recently, a probabilistic model —in which the arches were modeled by a one-dimensional stochastic cellular automaton— also sheds light on the origin of the exponential character of the avalanche size distribution [18]. the stochastic nature of the clogging process was also suggested in [19], where it was shown that the particles that end up forming the clog were totally uncorrelated at the beginning of the avalanche. the exponential decay of the avalanche size distribution has been reported in several arrangements: 2d and 3d silos [16, 20, 21], 2d hoppers [22, 23], 2d and 3d tilted hoppers and silos [24, 25], silos with obstacles [26, 27], 2d silos where the particles were driven by different gravity forces [28], and fluid-driven particles in 2d and 3d [29, 30]. however, there are some examples where this exponential tail breaks down.
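before turning to those exceptions, the constant-clogging-probability argument of [16] can be illustrated with a minimal sketch (the probability value and sample size below are arbitrary choices of mine, not data from the cited works): if each escaping grain independently triggers a clog with probability p, avalanche sizes are geometrically distributed, i.e., exponentially decaying with mean of order 1/p.

```python
import numpy as np

rng = np.random.default_rng(0)
p_clog = 0.01          # assumed constant clogging probability per escaping grain

# number of grains s between consecutive clogs: P(s) ~ (1 - p)^s,
# an exponential decay with characteristic avalanche size <s> ~ 1/p
avalanches = rng.geometric(p_clog, size=4000)

counts, edges = np.histogram(avalanches, bins=40)
centers = 0.5 * (edges[1:] + edges[:-1])
mask = counts > 0
slope = np.polyfit(centers[mask], np.log(counts[mask]), 1)[0]
print(f"mean avalanche size: {avalanches.mean():.1f}  (expected ~{1 / p_clog:.0f})")
print(f"fitted exponential decay rate: {-slope:.4f}  (expected ~{p_clog:.4f})")
```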
those situations are typically related to a breaking of the symmetry of the problem as in the following cases: 1) usage of particles with shapes that are not spherical [31]; 2) emplacement of multiple orifices [32]; 3) implementation of slots in 3d silos instead of the normal circular orifices [33]. in the latter case, a power law distribution was observed as it will be explained in section iv. incidentally, power law distributions were also numerically obtained when considering internal avalanches, defined as the number of grains that move inside the silo between consecutive clogs. this result was compatible with the idea of self organized criticality, in analogy with avalanches developed at the surface of a pile [34]. this work was, indeed, one of the first approaches to the avalanche statistics in the silo problem. ii. does a critical outlet size exist? considering the exponential character of the avalanche size distribution, its first moment (the average avalanche size 〈s〉) can be easily calculated and used to study the dependence of clogging on the size ratio between the outlet and the particles. for spherical beads in a 3d silo, a divergence of the avalanche size was reported for an outlet diameter about 5 times the bead diameter (see fig. 2) [31]. this divergence was shown to be robust, as it holds for particles with widely different properties. among these, the shape of the particle was reported to be the most influential on the critical outlet size value. nevertheless, in a subsequent work, the existence of such a critical outlet size was challenged by k. to [22]. in a two dimensional silo, it was shown that several empirical fits agree reasonably well with the experimental data: some were compatible with the existence of a critical outlet, but others were not (see fig. 3). following this idea, janda et al. [20] demonstrated that one of the non-divergent expressions proposed in [22] could be analytically deduced using both, the probability that a given number of particles meet above the outlet —as suggested by roussel et al. [35]— and the probability of finding arches of a given size within a granular deposit — as found in [36, 37]. unfortunately, the reasoning used in two dimensions was not applicable to three dimensional silos, where the transition seems to actually exist. interestingly, this clogging transition has been also identified for inclined silos and orifices [24, 25], as well as in the discharge of granular piles through an orifice below its apex [38]. very recently, the mean avalanche size has been put on relation with the fraction of clogging configurations that are sampled by the orifice, suggesting that 〈s〉 should increase exponentially with the hole width raised to the system dimensionality [39]. according to these results, clogging is akin to the jamming and glass transitions in the sense that there is not any sharp discontinuity in the behavior, but a dramatic increase of the relaxation times as the orifice size is enlarged. iii. clogging arches complementary to the analysis of the avalanche sizes, some authors have paid special attention to the arches that clog the orifices. clogging arches are structures of several mutually stabilizing particles that have to span, at least, the size of the constriction. in his seminal work, to et al. introduced a simple model to explain the clogging probability based in the geometry of the clogging arches (see fig. 4) [14]. 
they proposed that the position of the particles in a clogging arch is the result of a random walk model with some restrictions: 1) the horizontal span of the arch should be larger than the orifice; 2) the arch has to be convex everywhere; 3) the particles forming the arch should be in contact with each other. this model nicely reproduced the clogging probability for hopper angles below 75◦. in a subsequent work [40], the same authors approximated the arch shape by a circular arc centered at the apex of the hopper cone. from this, they calculated detailed properties of the clogging arches, such as the number of disks forming them, finding good agreement with the experimental results. figure 2: mean avalanche size 〈s〉 vs. r, the ratio between the outlet and particle diameters. the solid line is a fit with the equation 〈s〉 = a (rc − r)^(−γ), with rc = 4.94 ± 0.03, γ = 6.9 ± 0.2, and a = 9900 ± 100. inset: mean avalanche size 〈s〉 vs. 1/(rc − r); note the logarithmic scale. figure reprinted with permission from ref. [31]. copyright (2005) by the american physical society [65]. some of the ideas proposed by to et al. were corroborated in ref. [41], where it was reported that the aspect ratio of the arches (the height divided by half the span) tends to one, a result that is compatible with a semicircular shape. in addition, it was shown that the convexity condition assumed by to is not necessarily fulfilled by all the particles. indeed, 17% of the particles had an angle with their two neighbors above 180◦. hence, arches were locally concave at those particles, a situation which was named a 'defect'. despite this seeming mismatch with the restricted random walk model, a strong inverse correlation between the angle associated with a particle and those of its neighbors was also shown. apparently, this inverse correlation compensates for the appearance of defects and preserves the validity of the restricted random walk model. apart from the works mentioned above, where only the geometry of the arches was evaluated, there have been preliminary attempts to consider the forces involved within the particles forming the arches. figure 3: variation of the decay rate α with the hopper or silo exit size d. α is obtained from the fittings of the avalanche sizes with f(s) = e^(−α(s−s0)). the solid line is the fitted curve α = a e^(−b d²) with a = 0.846 and b = 0.275. the inset shows the same data and the fitted curve with d² plotted on the x axis. figure reprinted with permission from ref. [22]. copyright (2005) by the american physical society [65]. in [42], force analysis was used to calculate the jamming probability of mixed-size disks that move downwards under gravity in a two-dimensional hopper. the authors focused on the simplest case of arches formed by three disks which, for the outlet size employed, were the most common. finally, hidalgo et al. [43] performed numerical simulations and found that, in clogging arches, the tangential forces at the 'defects' were very high, while the normal forces were abnormally low. the outcome concerning the tangential forces is somewhat expected, as friction is necessary to stabilize defects. on the contrary, the result concerning the normal forces is rather counterintuitive, but it is in accord with a previous prediction based on experimental work on the stability of arches in a vibrated silo [44].
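to make the restricted-random-walk picture of this section more concrete, the sketch below (python; a deliberately crude caricature, not the quantitative model of refs. [14, 40]) generates convex chains of equal, mutually touching disks and counts how often their horizontal span exceeds a given outlet, which is the basic geometric ingredient behind the clogging probability.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_convex_arch(n_disks):
    """Toy arch of n_disks equal disks (diameter 1): successive disks are in
    contact (unit steps) and the step angles are forced to decrease, so the
    chain is convex. Only a caricature of the model of ref. [14]."""
    # sample n-1 step angles in (-90, 90) degrees and sort them in
    # decreasing order so that the chain bends monotonically (convexity)
    angles = np.sort(rng.uniform(-np.pi / 2, np.pi / 2, n_disks - 1))[::-1]
    x = np.concatenate(([0.0], np.cumsum(np.cos(angles))))
    return x  # horizontal positions of the disk centers

def spanning_fraction(outlet_d, n_disks, n_trials=20_000):
    """Fraction of generated arches whose horizontal span exceeds the outlet:
    a rough proxy for the probability that a random arch of n_disks grains
    can block an orifice of that width (in particle diameters)."""
    spans = np.array([random_convex_arch(n_disks).ptp() + 1.0   # + one diameter
                      for _ in range(n_trials)])
    return float(np.mean(spans > outlet_d))

for d in (2.0, 3.0, 4.0):
    print(f"outlet of {d:.0f} diameters, 5-disk arches:", spanning_fraction(d, 5))
```

the actual model treats the angle restrictions analytically and weighs arches of all sizes, but the sketch already shows the essential point: geometric constraints alone make the fraction of spanning configurations fall off very quickly with the outlet size.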
iv. orifice geometry. as stated above, clogging is a local phenomenon in the sense that it always takes place at the constriction. accordingly, it seems rather obvious that the properties of the confining geometry would importantly affect the clogging process. figure 4: (a) image of a typical clogging arch. (b) configuration of the arch, where ri illustrates the i-th step of the restricted random walk model proposed in [14]. figure reprinted with permission from ref. [14]. copyright (2001) by the american physical society [65]. evidence of this can be found in [14], where it is shown that the clogging probability in a hopper is notably reduced as the hopper angle increases from 60◦ to 75◦. on the contrary, hopper angles below 60◦ give rise to similar clogging probabilities. the reason seems to be that, for sufficiently flat hoppers, the grains develop a spontaneous internal angle of repose which acts as an internal hopper. the effect of this internal angle of repose is also relevant in the works of durian's group, who studied inclined orifices and silos [24, 25]. this practice is relatively common in industrial hoppers and, hence, knowing how clogging is affected becomes significant. the authors showed that increasing the tilting angle of the orifice or silo augments the propensity to clog, in line with the reduction of the projection of the aperture area along the average flow direction. in addition, a clogging phase diagram is proposed combining tilting angle and outlet size. for circular apertures, the same diagram is found for four grain types (including prolate and oblate ones). for slots, however, the shape of the phase diagram for the case of lentils and rice seems to be different than for more isotropic grains, an effect attributed to an alignment between the grains and the slit axes. the use of slots instead of circular orifices had already been shown to be beneficial to prevent clogging [45]. in this work, some conservative guidelines are given to select the minimum outlet size that ensures no flow interruption. while for horizontal and vertical slots the ratios of slot width to particle size are 3.3 and 4.6, respectively, for horizontal circular outlets the ratio of orifice diameter to particle size is 6.4. this number may seem considerably larger than the ones reported in [31], but it should be taken into account that particles of different properties (including anisotropic ones) were employed. even more importantly, in a subsequent work, it was reported that, as the length of the slot increases, the avalanche size distribution departs from the exponential behavior, displaying a power-law decay [33]. this behavior is explained in terms of a model where the slot is represented by a series of statistically independent cells whose length is related to a hypothetical distance along which the particles' movement is correlated. interestingly, the model matches the experimental outcomes for a correlation distance of around 10 particle diameters. nevertheless, this result needs to be confirmed since, in other experiments using slots, the avalanche size distribution has been found to be exponential for different types of grains [25]. a configuration which is closely related to the slot geometry is the placement of several aligned orifices. very recently, it has been reported that clogging can be significantly reduced by having more than one exit orifice.
in this situation, when one of the orifices jams, the flow through the adjacent unjammed orifice might cause perturbations in the clogging arch, destroying it and leading to a sequence of jamming and unjamming events [46]. 060014-5 papers in physics, vol. 6, art. 060014 (2014) / i. zuriguel the necessary condition to observe this behavior is, of course, that orifices are close enough to each other. in the same line, mondal and sharma [32] have shown that adjacent outlets start affecting each other when the distance is approximately three times the diameter of the particles. these authors point toward the importance of stable particles (adjacent to the arches) resting on the base of the silo. remarkably, the role of these particles was in fact overseen in previous works that analyzed the properties of clogging arches [14, 40–44]. a smart alternative to alter the clogging process consists of placing an obstacle just above the outlet. in [26], it was reported that the clogging probability may be reduced up to 100 times if the obstacle position is properly selected. this dramatic effect was attributed to a reduction in the pressure (or particle confinement) in the orifice neighborhood, which apparently favors arch destabilization. the explanation given is that particles colliding above the orifice —which eventually could form a clogging arch— are not easily stabilized if there is not a certain confinement that facilitates energy dissipation. this idea was supported by the observation of a sudden increase on the number of particles ejected upwards in the outlet proximities when the obstacle was placed. in the same work, simulations of a silo filled with a few layers of grains revealed the same kind of clogging reduction as the layer of grains above the orifice was reduced, then confirming the important role of pressure in the process. in a subsequent work [27], it was shown that the effect of the obstacle is enhanced as the outlet size enlarges. it is noteworthy that, in all the cases, the clogging reduction is achieved with just a tiny alteration of the flow rate (up to 10% in the worst situation). in these works, an issue that remains unclear is the role of the packing fraction above the orifice. clearly, the placement of the obstacle affects this variable which should be, indeed, related to pressure. nevertheless, robust measurements of volume fraction are extremely difficult near the outlet due to the existence of strong gradients. as far as i know, there has been only one attempt to unveil the role of volume fraction on clogging in dry granular media [47]. in this work, a pseudodynamic model was implemented to prepare samples with different initial configurations by means of a tapping procedure. although packing fraction affects clogging, their main conclusion is that this is not a good macroscopic parameter to predict the size of the avalanches that would flow through a given aperture, suggesting that further information about the packing properties is necessary. a nice alternative to study the effect of packing fraction on the ability of a system to develop clogs is the use of solid particles suspended in a fluid. this is precisely what it was done in [48] where it was proved that the probability of bridge formation increased with the volume fraction. note that this system has the advantage of allowing a better control of the volume fraction than just varying the initial configuration as done in [47]. v. 
effect of polydispersity and particle shape a recursive topic that arises in the granular community is the roles that size polydispersity and particle shape play on the behavior of such materials. for the case of clogging in silos, in [31] it was reported that polydisperse samples displayed the same exponential decay of the avalanche size than monodisperse ones. furthermore, it was revealed that polydispersity had a negligible effect in the critical outlet size above which clogging would not occur. in [49], clogging of bidisperse samples was also shown to be similar to the monodisperse case as long as segregation is prevented. in addition, the authors propose that the parameter that should be considered to characterize the mixture is the particles volume-average diameter. contrary to polydispersity, particle shape seems to play a major role in clogging development as evidenced using prolate (rice) and oblate (lentils) particles [31]. the critical outlet size increases (i.e., clogging is more likely) when anisotropic particles are employed, a result coherent with that obtained in fluid driven suspensions of mica flakes when compared with glass beads [48]. an issue that is still open concerning anisotropic particles in the discharge of a silo is the characteristic particle length that should be chosen to compare with that of the orifice. in addition, there is a lack of experiments or simulations about the effect of using faceted particles in the clogging probability. some words have been written suggesting that faceted particles dramatically increase the clogging ability due to their tendency to align [50, 51], yet there are not systematic results on this interesting topic. 060014-6 papers in physics, vol. 6, art. 060014 (2014) / i. zuriguel vi. dynamic signatures of clogging provided that clogging in bottlenecks is a consequence of the sudden formation of a stable arch at the very narrowing, it is a big challenge to find dynamical descriptors in the flowing state that can be used to predict an eventual arrest of the flow. the first important result about this challenge was reported by longhi et al. [52] who studied the impulses recorded by a force transducer at the hopper boundary near the orifice. although the distribution of impulses does not reveal any static signature of jamming, the distribution of the time intervals between collisions (τ) produces interesting distinctive features as the outlet size is reduced and approaches the clogging region (see fig. 5). in fact, this distribution tends to a power-law p(τ) ∼ τ−3/2 implying that the mean time interval tends to diverge as the outlet is reduced. this is so even when the average time computed from a finite (albeit large) data set shows a relatively negligible dependence on the outlet size. in [53], the flow rate properties were carefully examined in a two-dimensional silo for outlet sizes both, above and below the supposed critical outlet size. even though the average flow rate behaved smoothly and did not display any characteristic property near the critical size, it was observed that the flow rate fluctuations are non symmetric for small apertures. for large orifices, the measurements of the instantaneous flow rate (q) display gaussian-like distribution of the fluctuations around the average. nevertheless, as the outlet size is reduced, temporal interruptions of the flow are evidenced by the development of a peak at q = 0 in addition to the one that corresponds to the flowing regime. 
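the practical meaning of a p(τ) ∼ τ^(−3/2) tail quoted above (a mean collision interval that formally diverges even though any finite data set returns a finite number) can be seen with a few lines of python; the distribution, cutoff and sample sizes below are arbitrary choices made only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def powerlaw_intervals(n, tau_min=1.0, exponent=1.5):
    """Draw n waiting times from p(tau) ~ tau**(-exponent) for tau >= tau_min
    (a Pareto-type law), via inverse-transform sampling."""
    u = rng.random(n)
    return tau_min * u ** (-1.0 / (exponent - 1.0))

taus = powerlaw_intervals(10**6)

# the running mean keeps drifting upwards instead of settling on a value
for n in (10**3, 10**4, 10**5, 10**6):
    print(f"mean of the first {n:>7d} intervals: {taus[:n].mean():10.1f}")
```

for exponents larger than 2 the same loop settles quickly on a finite value; this difference between converging and non-converging means is exactly the distinction exploited below for the unclogging times in vibrated silos.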
in this direction, a step further was taken by tewari et al. [54], who implemented event-driven simulations to analyze the velocity fluctuations of grains flowing through a hopper. the analysis in this work was not restricted to the region of the orifice, as all the grains of the silo were studied. interestingly, although the kinetic temperatures are always higher at the boundaries of the silo, the correlation times display a tendency that reverses as the outlet size is reduced: whereas for high flow rates (far above the critical outlet size) the flow at the center has longer autocorrelation times than at the boundary, the opposite holds for low flow rates, as fluctuations relax more slowly at the boundaries. figure 5: (a) probability distributions, p(τ), of the time intervals τ between collisions, on a log-log scale, for an outlet length 3.3 times the diameter of the particles. the solid line corresponds to a power law p(τ) ∼ τ^(−3/2). the average time interval between impulses is marked in the figure. (b) p(τ) on a log-log scale for different opening sizes ranging from 3 to 16 times the diameter of the particles (the curves on top correspond to the smaller outlet sizes). the curves are displaced vertically for clarity. the solid line is the power law p(τ) ∼ τ^(−3/2). figure reprinted with permission from ref. [52]. copyright (2002) by the american physical society [65]. in this work, it is also suggested that clogging is preceded by the appearance of vortices that nucleate at the corners of the hopper and extend inwards. iii. vibrated silos: clogging and unclogging up to now, i have described investigations related to the clogging process presuming that, once a clogging bridge is formed, all the kinetic energy is dissipated and the structure is forever stable. nevertheless, an alternative approach can be implemented, which consists of applying an external input of energy and studying its effect on clogging. this strategy gives rise to a dramatic change in the observed dynamics when the orifice is small, i.e., in the region where clogging is frequent. unlike the case of a static silo, the flow in the vibrated silo is characterized by the alternation of jamming and unjamming events. indeed, apart from the flow rate fluctuations found in a static silo [53], in the vibrated case long flow interruptions were present. these were attributed to arches that formed and were initially stable, but were destabilized as a consequence of the vibrations (see fig. 6). this behavior suggests that the intermittent flow in vibrated silos can be split into two different, independent processes: clogging and unclogging. following this line of reasoning, mankoc et al. [55] reported that the probability that a system clogs does not depend on the vibration, which only introduces a non-zero probability of unclogging once an arch has blocked the orifice. this probability of unclogging was measured in three different ways which led to consistent results, whose most conspicuous feature was an increase of the unclogging probability with the outlet size. janda et al. [56] devised a similar experiment in which the hopper wall of an eccentrically discharged silo was piezoelectric, allowing a local perturbation of the clogging arch. in this sense, this work is conceptually different from that of mankoc et al., where the whole silo was vibrated. the most interesting result revealed by janda et al.
was that the distribution of times that the system takes to get unclogged exhibits a power-law decay. at low vibration accelerations, anomalous statistics of the unclogging times were evidenced, as the exponent α of the power law was below 2 and the first moment could not be calculated (see fig. 7). this property is, indeed, strongly reminiscent of the anomalous dynamics usually observed for creeping flows of glassy materials. in a recent work, this behavior has been shown to be universal in other systems of macroscopic particles flowing through a bottleneck, like sheep, a model of pedestrians, and colloids [57]. furthermore, for the case of inert grains, several variables have been shown to shift the value of the exponent from α > 2 to α ≤ 2, i.e., from an unclogged situation (where averages can be defined) to a clogged scenario (where the average flow rate would tend to zero as the measuring time increases). these variables are: the intensity of vibration, the outlet size, the height of the layer of grains above the outlet, and the inclination of the 2d silo with respect to the vertical, which modifies the component of gravity acting on the grains. figure 6: signal from a photosensor at the exit of a silo: a value of 1 indicates that a particle is blocking the beam, zero means that the beam is unobstructed. (a) static silo. (b) vibrated silo. (c) a zoom of the signal shown in (b) during the first three seconds, the same time stretch as in (a). all the data were obtained using an orifice of diameter 3.05 times the bead diameter. figure reprinted with permission from ref. [55]. copyright (2009) by the american physical society [65]. increasing the intensity of vibration and enlarging the outlet size favor the development of unclogged situations, while increasing the layer of grains or the silo verticality facilitates the transition to clogging. a similar idea was anticipated by valdés and santamarina [58], who suggested that the acceleration required for unclogging increases with increasing skeletal forces in the particles forming the bridge. furthermore, they related this prediction to the higher stability exhibited by the arches formed in a suspension when subjected to high fluid velocities [59]. figure 7: histogram, in logarithmic scale, of the time lapses during which the orifice remains blocked in a vibrated silo, for different vibration accelerations as indicated in the legend. data correspond to an outlet size 1.78 times the particle diameter. the dashed line has a slope of two, evidencing that, for the smallest acceleration displayed, the slope is smaller than two. figure reprinted with permission from [56]. copyright (2009) by iop. finally, in a 2d vibrated silo which allowed observation of the clogging structures, a relationship was established between the bridge geometry and its resistance to vibration [44]. in particular, it was revealed that the intensity of vibration at which an arch collapses is inversely correlated with the maximum angle between the particles forming it. for the particular case of angles above 180◦ (the so-called defects), this dependence was explained in terms of a very simple force analysis. in summary, from this work it was concluded that arches break at defects and, the larger the maximum angle, the weaker the arch.
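a simple way to see how the exponent α separates the two regimes just described is to estimate it from a set of unclogging times with a maximum-likelihood (hill-type) estimator and check on which side of 2 it falls; the sketch below uses synthetic times with made-up parameters, not the data of refs. [56, 57].

```python
import numpy as np

rng = np.random.default_rng(3)

def synthetic_unclogging_times(alpha, n, t_min=1.0):
    """Synthetic blocked-time lapses with p(t) ~ t**(-alpha) for t >= t_min."""
    return t_min * rng.random(n) ** (-1.0 / (alpha - 1.0))

def ml_exponent(times, t_min=1.0):
    """Maximum-likelihood estimate of alpha for a pure power law above t_min."""
    return 1.0 + len(times) / np.sum(np.log(times / t_min))

for true_alpha in (1.6, 2.5):              # e.g., weak vs. strong vibration
    t = synthetic_unclogging_times(true_alpha, 50_000)
    alpha_hat = ml_exponent(t)
    regime = ("clogged: mean unclogging time undefined" if alpha_hat <= 2
              else "unclogged: mean unclogging time finite")
    print(f"true alpha = {true_alpha}: estimated {alpha_hat:.2f} -> {regime}")
```

in the experiments, the estimated exponent crosses 2 as the vibration intensity, the outlet size, the height of the granular layer or the tilt of the silo are varied, which is precisely what defines the clogged and unclogged phases mentioned above.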
iv. perspectives. after more than a decade of research, significant advances have been achieved in the understanding of clogging. despite all that, the relevance of the remaining open issues and the importance of the consequences that clogging has from an applied point of view suggest that the activity on this topic will increase in the forthcoming years. a sensible approach that should probably be investigated is isolating the dynamic and geometric contributions to the development of clogs. indeed, a clogging arch must have a structure compatible with the confining geometry; but, in addition, this structure has to be able to persist until all the kinetic energy of the system is dissipated. unfortunately, increasing the outlet size leads to a modification of the geometry of the problem (as the span of the clogging arches has to increase), but it also affects the velocity of the particles (which increases with the square root of the orifice diameter). in a recent work, arévalo et al. made initial progress in this direction by exploring clogging when reducing the driving force down to 10^(−3) g, where g is the acceleration of gravity; but, undoubtedly, new strategies should be devised to understand the effect of dynamics on the clogging process. a situation which seems simpler, as dynamic effects are removed, is the study of unclogging, as explained in section iii. the power-law decays observed in the time that the system needs to become unclogged suggest a creeping process in which the bridges would age with time, increasing their endurance. a straightforward way of testing the validity of these ideas would be an analysis of this process using photoelastic particles to evaluate the temporal evolution of the forces within the arch [60]. this kind of particle could also be used to address an old question concerning the relationship between clogging arches and force chains. another issue that remains unsolved is whether or not clogging can be seen as a phase transition and, if so, what kind of transition clogging is. as explained above, from the measurements of unclogging times in a vibrated silo, a divergence has been found that can be used to rigorously characterize the clogged state through the definition of a 'flowing parameter' [57]. a thorough inspection of the dependence of this parameter on different variables becomes necessary to corroborate its usefulness. nonetheless, as this approach is based on the unclogging times, it cannot be used for the 'singular' case of a static silo where, if formed, clogs last forever. in such a scenario, instead of the traditional way of studying the divergence of the avalanche size as the outlet is enlarged, i believe that it is pertinent to approach the transition from the flowing region. there, some parameter should exist that reveals distinctive behavior when the outlet size is reduced, as explained in section ii.vi. a remaining difficulty of this alternative is choosing the region where to perform the analysis, as the silo is precisely characterized by the existence of strong spatial and temporal gradients. in this sense, a promising geometry is a narrow pipe without any constriction, where clogs may develop at any place [61–63]. apart from a zone at the top of the pipe where the pressure increases with depth, a rather homogeneous behavior should be observed within the rest of the system, allowing clean measurements of variables like velocity, density and so on.
the study of this geometry can also be seen as an intermediate stage between clogging and jamming as it is also the case of ‘jamming by pinning’ [64] where the increase of the number of obstacles in the system (and so the characteristic distance between them) was shown to reduce the density at which the system jams. acknowledgements i would like to thank the referee kiwing to whose comments have, undoubtedly, helped to improve the quality of this manuscript. i am very grateful to angel garcimart́ın, diego maza, carlos pérez-garćıa and luis pugnaloni, without whom this work would never have been possible. [1] e sloan, c koh, a sum, n mcmullen, g shoup, a ballard, t palermo, j creek, m eaton, j lachance, l talley, natural gas hydrates in flow assurance, elsevier, burlington, ma (2011). [2] m d haw jamming, two-fluid behavior, and self-filtration in concentrated particulate suspensions, phys. rev. lett. 92, 185506 (2004). [3] d genovese, j sprakel, crystallization and intermittent dynamics in constricted microfluidic flows of dense suspensions, soft matter 7, 3889 (2011). [4] p knowles, g dotro, j nivala, j garćıa, clogging in subsurface-flow treatment wetlands: occurrence and contributing factors, ecol. eng. 37, 99 (2011). [5] d g rees, h totsuji, k kono, commensurability-dependent transport of a wigner crystal in a nanoconstriction, phys. rev. lett. 108, 176801 (2012). [6] d helbing, i farkas, t vicsek, simulating dynamic features of escape panic, nature 407, 487 (2000). [7] d helbing, l buzna, a johansson, t werner, self-organized pedestrian crowd dynamics: experiments, simulations, and design solutions. transport. sci. 39, 1 (2005). [8] m moussäıd, d helbing, g theraulaz, how simple rules determine pedestrian behavior and crowd disasters, proc. natl. acad. sci. usa 108, 6884 (2011). [9] a j liu, s r nagel, jamming is not just cool anymore, nature 396, 21 (1998). [10] r kvapil, gravity flow of granular material in hoppers and bins in mines, int. j. rock mech. min. 2, 277 (1965). [11] d m walker, a basis for bunker design, powder technol. 1, 228 (1967). [12] h sakaguchi, e ozaki, t igarashi, plugging of the flow of granular materials during the discharge from a silo, int. j. mod. phys. b 7, 1949 (1993). [13] a drescher, a j waters, c a rhoades, arching in hoppers: ii. arching theories and critical outlet size, powder technol. 84, 177 (1995). [14] k to, p y lai, h k pak, jamming of granular flow in a two-dimensional hopper, phys. rev. lett. 86, 71 (2001). [15] e clément, g reydellet, f rioual, b parise, v fanguet, j lanuza, e kolb, jamming patterns and blockade statistics in model granular flows, in: traffic and granular flow ’99, eds. d helbing, h j herrmann, m schreckenberg, d e wolf, pag. 457, springer, berlin (2000). [16] i zuriguel, l a pugnaloni, a garcimart́ın, d maza, jamming during the discharge of grains from a silo described as a percolating transition, phys. rev. e 68, 030301 (2003). 060014-10 papers in physics, vol. 6, art. 060014 (2014) / i. zuriguel [17] d helbing, a johansson, j mathiesen, m h jensen, a hansen, analytical approach to continuous and intermittent bottleneck flows, phys. rev. lett. 97, 168001 (2006). [18] t masuda, k nishinari, a schadschneider, critical bottleneck size for jamless particle flows in two dimensions. phys. rev. lett. 112, 138701 (2014). [19] j tang, r p behringer, how granular materials jam in a hopper, chaos 21, 041107 (2011). 
[20] a janda, i zuriguel, a garcimart́ın, l a pugnaloni, d maza, jamming and critical outlet size in the discharge of a two-dimensional silo, europhys. lett. 84, 44002 (2008). [21] g pérez, numerical simulations in granular matter: the discharge of a 2d silo, pramana 70, 989 (2008). [22] k to, jamming transition in two-dimensional hoppers and silos, phys. rev. e 71, 060301 (2005). [23] l kondic, simulations of two dimensional hopper flow, granul. matter 16, 235 (2014). [24] h g sheldon, d j durian, granular discharge and clogging for tilted hoppers, granul. matter 12, 579 (2010). [25] c c thomas, d j durian, geometry dependence of the clogging transition in tilted hoppers, phys. rev. e 87, 052201 (2013). [26] i zuriguel, a janda, a garcimart́ın, c lozano, r arévalo, d maza, silo clogging reduction by the presence of an obstacle, phys. rev. lett. 107, 278001 (2011). [27] c lozano, a janda, a garcimart́ın, d maza, i zuriguel, flow and clogging in a silo with an obstacle above the orifice, phys. rev. e 86, 031306 (2012). [28] r arévalo, i zuriguel, d maza, a garcimart́ın, role of driving force on the clogging of inert particles in a bottleneck, phys. rev. e 89, 042205 (2014). [29] a guariguata, m a pascall, m w gilmer, a k sum, e d sloan, c a koh, d t wu, jamming of particles in a two-dimensional fluid-driven flow, phys. rev. e 86, 061311 (2012). [30] p g lafond, m w gilmer, c a koh, e d sloan, d t wu, a k sum., orifice jamming of fluid-driven granular flow, phys. rev. e 87, 042204 (2013). [31] i zuriguel, a garcimart́ın, d maza, l a pugnaloni, j m pastor, jamming during the discharge of granular matter from a silo, phys. rev. e 71, 051303 (2005). [32] s mondal, m m sharma, role of flying buttresses in the jamming of granular matter through multiple rectangular outlets, granul. matter 16, 125 (2014). [33] s saraf, s v franklin power-law flow statistics in anisometric (wedge) hoppers, phys. rev. e 83, 030301 (2011). [34] s s manna, h j herrmann, intermittent granular flow and clogging with internal avalanches, eur. phys. j. e 1, 341 (2000). [35] n roussel, t l h nguyen, p coussot, general probabilistic approach to the filtration process, phys. rev. lett. 98, 114502 (2007). [36] r arévalo, d maza, l a pugnaloni, identification of arches in 2d granular packings, phys. rev. e 74, 021303 (2006). [37] l a pugnaloni, g c baker, structure and distribution of arches in shaken hard sphere deposits, physica a 337, 428 (2004). [38] c f m magalhães, j g moreira, a p f atman, catastrophic regime in the discharge of a granular pile, phys. rev. e 82, 051303 (2010). [39] c c thomas, d j durian, fraction of clogging configurations sampled by granular hopper flow, arxiv:1410.0933 (2014). [40] k to, p y lai, jamming pattern in a twodimensional hopper, phys. rev. e 66, 011308 (2002). 060014-11 papers in physics, vol. 6, art. 060014 (2014) / i. zuriguel [41] a garcimart́ın, i zuriguel, l a pugnaloni, a janda, shape of jamming arches in twodimensional deposits of granular materials, phys. rev. e 82, 031306 (2010). [42] a longjas, c monterola, c saloma, force analysis of jamming with disks of different sizes in a two-dimensional hopper, j. stat. mech. 2009, 05006 (2009). [43] r c hidalgo, c lozano, i zuriguel, a garcimart́ın, force analysis of clogging arches in a silo, granul. matter 15, 841 (2014). [44] c lozano, g lumay, i zuriguel, r c hidalgo, a garcimart́ın, breaking arches with vibrations: the role of defects, phys. rev. lett. 109, 068001 (2012). 
[45] c e davies, m desai, blockage in vertical slots: experimental measurement of minimum slot width for a variety of granular materials, powder technol. 183, 436 (2008). [46] a kunte, p doshi, a v orpe, spontaneous jamming and unjamming in a hopper with multiple exit orifices, phys. rev. e 90, 020201 (2014). [47] r o uñac, a m vidales, l a pugnaloni, the effect of the packing fraction on the jamming of granular flow through small apertures, j. stat. mech. 2012, 04008 (2012). [48] j r valdés, j c santamarina, particle clogging in radial flow: microscale mechanisms, spe j. 11, 193 (2006). [49] l pournin, m ramaioli, p folly, th m liebling, about the influence of friction and polydispersity on the jamming behavior of bead assemblies, eur. phys. j. e 23, 229 (2007). [50] t kanzaki, m acevedo, i zuriguel, i pagonabarraga, d maza, r c hidalgo, stress distribution of faceted particles in a silo after its partial discharge, eur. phys. j. e 34, 133 (2011). [51] d höhner, s wirtz, v scherer, a numerical study on the influence of particle shape on hopper discharge within the polyhedral and multisphere discrete element method, powder technol. 226, 16 (2012). [52] e longhi, n easwar, n menon, large force fluctuations in a flowing granular medium, phys. rev. lett. 89, 045501 (2002). [53] a janda, r harich, i zuriguel, d maza, p cixous, a garcimart́ın, flow-rate fluctuations in the outpouring of grains from a twodimensional silo, phys. rev. e 79, 031302 (2009). [54] s tewari, m dichter, b chakraborty, signatures of incipient jamming in collisional hopper flows, soft matter 9, 5016 (2013). [55] c mankoc, a garcimart́ın, i zuriguel, d maza, l a pugnaloni, role of vibrations in the jamming and unjamming of grains discharging from a silo, phys. rev. e 80, 011309 (2009). [56] a janda, d maza, a garcimart́ın, e kolb, j lanuza, e clément, unjamming a granular hopper by vibration, europhys. lett. 87, 24002 (2009). [57] i zuriguel, d r parisi, r c hidalgo, c lozano, a janda, p a gago, j p peralta, l m ferrer, l a pugnaloni, e clément, d maza, i pagonabarraga, a garcimart́ın. clogging transition of many-particle systems flowing through bottlenecks, sci. rep. 4, 7324 (2014). [58] j r valdés, j c santamarina, clogging: bridge formation and vibration-based destabilization, canadian geotech. j. 45, 177 (2008). [59] t w muecke, formation fines and factors controlling their movement in porous media, j. petrol. technol. 31, 144 (1979). [60] t s majmudar, r p behringer, contact force measurements and stress-induced anisotropy in granular materials, nature 435, 1079 (2005). [61] j hadjigeorgiou, j f lessard, numerical investigations of ore pass hang-up phenomena, int. j. rock mech. min. 44, 820 (2007). [62] j-c. tsai, w losert, g a voth, j p gollub, two-dimensional granular poiseuille flow on an incline: multiple dynamical regimes, phys. rev. e 65, 011306 (2001). 060014-12 papers in physics, vol. 6, art. 060014 (2014) / i. zuriguel [63] a janda, i zuriguel, a garcimart́ın, d maza, clogging of granular materials in narrow vertical pipes (unpublished). [64] c j o reichhardt, e groopman, z nussinov, c reichhardt, jamming in systems with quenched disorder, phys. rev. e 86, 061301 (2012). [65] readers may view, browse, and/or download material for temporary copying purposes only, provided these uses are for noncommercial personal purposes. 
except as provided by law, this material may not be further reproduced, distributed, transmitted, modified, adapted, performed, displayed, published, or sold in whole or part, without prior written permission from the american physical society. 060014-13 papers in physics, vol. 5, art. 050007 (2013) received: 25 january 2013, accepted: 10 october 2013 edited by: s. a. grigera reviewed by: e. m. forgan, school of physics & astronomy, university of birmingham, u.k. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050007 www.papersinphysics.org issn 1852-4249 invited review: graphite and its hidden superconductivity p. esquinazi1∗ we review experimental results, from transport to magnetization measurements, on different graphite samples, from bulk oriented graphite, thin graphite films to transmission electron microscope lamellae, that indicate the existence of granular superconductivity at temperatures above 100 k. the accumulated evidence speaks for a localization of the superconducting phase(s) at certain interfaces embedded in semiconducting crystalline regions with bernal stacking order. i. introduction over the past decade, our interpretation of the magnetic and transport properties of ordered graphite bulk samples has experienced a change respect to the partially accepted general description of their intrinsic properties. the description of graphite one finds in the not-so-old literature tells us that it is a kind of (semi)metal with a finite fermi energy and carrier (electron plus hole) densities per graphene layer at low temperatures n0 ∼ 1010 . . . 1012 cm−2, see e.g., [1–3]. however, real samples are not necessarily ideal, we mean defect-free, and therefore those carrier densities are not necessarily intrinsic of ideal graphite. the exhaustive experience accumulated in gapless and narrow band semiconductors [4] already indicates us how important defects and impurities (not necessarily magnetic ones but, for example, hydrogen) are in determining some of the measured properties. therefore, taking experimental data of real samples as intrinsic, without knowing their microstructure ∗e-mail: esquin@physik.uni-leipzig.de 1 division of superconductivity and magnetism, institute for experimental physics ii, fakultät für physik und geowissenschaften, universität leipzig, linnéstrasse 5, d04103 leipzig, germany. and/or defect concentration, was indeed a misleading assumption in the past. this assumption has drastically influenced the description of the band structure of graphite we found nowadays in several books and publications. for example, if graphite has a finite fermi energy ef (whatever the majority carriers are), as assumed everywhere, up to seven free parameters have to be introduced [2, 5, 6] to describe the apparently ideal band structure of bernal graphite with the well-known abab stacking order of the graphene layers. the impact of well defined two-dimensional interfaces inside graphite samples [7, 8] had not been realized until recent studies of the transport properties as a function of thickness of the graphite sample provided a link to the microstructure of the samples obtained by transmission electron microscope (tem) studies. we also have to add the sensitivity of the graphite transport properties to very small amount of defects [9]. 
those results [7, 9] do not only indicate us that at least a relevant part of the carrier densities measured in graphite is not intrinsic but also that the metallic-like behavior of the electrical resistance does not reflect ideal, defect-free graphite [10]. an anomalous vanishing of the amplitude of the shubnikov-de haas (sdh) oscillations decreasing the thickness of the graphite samples was published, more than 10 years 050007-1 papers in physics, vol. 5, art. 050007 (2013) / p. esquinazi ago, [11] without attracting the necessary attention, although those results already suggested that the sdh oscillations are probably not intrinsic of the graphite structure. these results are supported by the absence of sdh oscillations, i.e., no evidence for the existence of a fermi surface, found recently in bulk oriented samples of high grade and high purity but without internal interfaces [12]. all these results indicate that the internal microstructure of the graphite samples play an important role, a microstructure that was neither characterized nor considered in the discussion of the measured properties of different graphite samples, from highly oriented pyrolytic graphite (hopg) to kish or natural graphite, even in nowadays literature [6, 13, 14]. what does this have to do with superconductivity? if we start searching for superconductivity in graphite by measuring the behavior of the electrical resistance (r) with temperature (t) and magnetic field (h), for example, it should be clear that the knowledge of the intrinsic, normal state dependence is needed. otherwise, we may misleadingly interpret an anomalous behavior due to, for example, the influence of non-percolative, granular superconducting regions embedded in a (normal state) graphite matrix, as intrinsic of the material, clearly missing an interesting aspect of the sample. a reader with expertise in superconductivity might not be convinced that such a mistake could be ever made. however, the ballistic transport characteristics of the graphene layers in ideal graphite with their huge mobility and mean free path [15–18] provide a high conductivity path in parallel; such that it is not at all straightforward by simple experiments to realize and prove the existence of superconductivity at certain regions in some, not all, graphite samples. one needs indeed to do systematic experiments decreasing the size of the graphite samples (but not too much) to obtain clear evidence for the embedded or “hidden” superconductivity. a note on samples: the internal ordering or mosaicity of the graphite crystalline regions inside commercial hopg samples is given usually by the grade. for example, the highest ordered pyrolytic graphite samples have is a grade “a”, which means a rocking curve width ∆ ∼ 0.4◦±0.2◦ (“b”, ∆ ∼ 0.8◦, etc.). interestingly, and due to the contribution of two dimensional highly conducting internal interfaces between crystalline regions [7, 10], the highest grade, i.e., smaller rocking curve width, does not always mean that the used sample provides the intrinsic transport properties of ideal graphite. the characterization of the internal structure of usual hopg samples, as well as the thickness dependence of r(t) to understand the transport and the magnetic properties of graphite, indicate that these two dimensional interfaces are of importance. 
the existence of rhombohedral inclusions [19, 20] (stacking order abcabc instead of abab of the usual bernal graphite structure) in hopg as well as in kish graphite samples can also have a relationship with the hidden superconductivity in graphite, following the theoretical work in ref. [21]. according to literature (see e.g., fig. 22 in ref. [8]), the density of interfaces parallel to the graphene layers in kish graphite, in regions of several microns length, is notable. therefore, quantifying the perfection of any graphite sample through the resistivity ratio between 300 k and 4.2 k [8] is not necessarily the best criterion to be used if we are interested on the intrinsic properties of the graphene layers in graphite, because of the high conductivity of the interfaces in parallel to the graphene layers of the sample [10]. two examples of the interfaces we are referring to can be seen in fig. 1. on the other hand, commercial hopg bulk samples are of high purity with average total impurity concentrations below 20 ppm. especially the existence of magnetic impurities are of importance if the defect-induced magnetism (dim) is the main research issue. their concentration remains below a few ppm for high grade hopg samples [22]. the graphite flakes discussed in this work were obtained by exfoliation of hopg samples of different batches, by careful mechanical press and rubbing the initial material on a previously cleaned substrate. as substrate, we used p-doped si with a 150 nm sin layer on top. we selected the flakes using microscopic and micro-raman techniques to check their quality. more details on the preparation can be taken from ref. [7] and other publications cited below. this review is organized as follows. in the next section we discuss the experimental data for r(t,h) from different graphite samples published in the last 12 years and argue that the first hints on unusual superconducting contribution can be already found in those measurements. in section iii, we discuss the anomalous hysteresis in the magne050007-2 papers in physics, vol. 5, art. 050007 (2013) / p. esquinazi toresistance, a first clear indication for embedded granular superconductivity. section iv, deals with the josephson behavior measured in tem lamellae whereas section v deals with the granular superconducting behavior found in the magnetization of water-treated graphite powder as well as in bulk hopg samples for fields normal to the interfaces found inside those samples. in the last section, section vi, before the conclusion, we discuss possible origins for the superconducting signals on the basis of earlier and recent experimental and theoretical work. ii. the behavior of the resistance vs. temperature at different applied magnetic fields in this section, we discuss the behavior of the resistance r(t,h) of different hopg samples including kish graphite. the data we present here were taken from [7, 23, 24] and a quick search in literature demonstrates that these data are reproducible and can be found in different publications, see, e.g., [2, 25–27]. figure 2(a) shows the r(t) for different bulk graphite samples of different grades (rocking curve width) and for one sample (hopg-1) at zero and under a magnetic field applied normal to the main area, i.e., normal to the graphene planes of the sample. this figure reveals a general behavior, namely that the lower the resistivity ρ of the hopg sample, the more metallic-like its temperature dependence. 
it is appealing to assume that these characteristics (low ρ, low ∆ and the metallic behavior) are clear signs of more ideal graphite. therefore, from the measured r(t), we may conclude that sample hopg-3 is more ideal than sample hopg-1, and the latter more ideal than sample hopg-2, see fig. 2(a). this is indeed the usual interpretation found in several reviews in the literature, see e.g., [8, 28]. from a quick look at all the curves in fig. 2, however, one recognizes a striking similarity between them, although we are comparing different samples with different thicknesses and some of the curves were measured under a magnetic field applied normal to the graphene layers of the samples, i.e., also normal to the interfaces commonly found in some ordered samples [7, 8]. figure 1: transmission electron microscope pictures of two different kinds of interfaces and their distribution in hopg samples. the tem pictures were taken from two different lamellae, each about 300 nm thick and with the electron beam nearly parallel to the graphene planes of the samples. (a) the interfaces are recognized at the borders of crystalline regions of different gray colors. taken from [7]. (b) interfaces found in a hopg sample used for magnetization measurements (see section v) that reveals hysteretic behavior in field and temperature. taken from [29]. let us start discussing the metallic-like behavior of r(t) of sample hopg-3 in fig. 2(a). this sample behaves as the hopg-uc sample shown in (c) at zero magnetic field, also having a maximum at t ∼ 150 k. a “better” metallic character is shown by the hopg sample in (b) or by the kish graphite sample in (d), without any maximum in the temperature range shown. is this metallic-like behavior really intrinsic to ideal graphite? the following experimental evidence does not support such an interpretation: first, for samples from the same batch, the metallic character of r(t) vanishes in the whole temperature range when the sample thickness is below ∼ 50 nm [7, 11, 27], see fig. 2(b). second, the metallic-like behavior vanishes in the whole temperature range after applying a magnetic field of the order of 1 to 2 koe, an interesting behavior known as the metal-insulator transition (mit) [23, 25, 26], see figs. 2(a), (c) and (d). note that such a magnetic field strength influences mainly the metallic-like region, see e.g., the change of sample hopg-1 in fig. 2(a) at zero and at 1 koe field, an interesting behavior noted first in [30] and interpreted as due to superconducting instabilities. at those field strengths, i.e., h ∼ 1 koe, the r(t) curves obtained for samples showing a metallic-like behavior at zero field resemble the semiconducting-like curves obtained for sample hopg-2 (fig. 2(a)) or for samples with small thickness (fig. 2(b)). at fields higher than a few koe, the rather large magnetoresistance of graphite starts to play the main role and the r(t) curve increases in the whole temperature range. finally, all these results, added to the existence of well-defined interfaces in the metallic-like hopg samples as well as in kish graphite, with distances in the c-axis direction usually larger than ∼ 30 nm, indicate that the metallic-like behavior is due to the contribution of these interfaces and is not intrinsic to the graphene layers of ideal graphite [7, 10].
therefore, explanations of the mit based on ideal graphite band models with a large number of free parameters [25, 26] are certainly not the appropriate ones. all the different r(t) curves for the different samples shown in fig. 2 at zero field can be very well understood assuming the contribution of semiconducting graphite paths in parallel with the one from the highly conducting interfaces [10]. the saturation of the resistance at t → 0 k is interpreted as due to the finite resistance of the sample surfaces (the free one and the one on the substrate) short-circuiting the intrinsic behavior of the bulk graphene layers at low enough temperatures. the question is now whether parts of these interfaces hide superconducting regions. it is certainly appealing to suggest that the huge mit at fields normal to the interfaces (and below ∼ 2 koe) is related to josephson-coupled superconducting regions embedded in some of the interfaces. note that the huge anisotropy of the mit (fields parallel to the main plane of the interfaces do not affect the electrical transport) already implies that the regions responsible for the mit must be lying parallel to the graphene layers [31]. without knowledge of the existence of these interfaces, an interpretation of the low-field mit based on the influence of superconductivity has been discussed in detail in the reviews [24, 32]. in those reviews, one can recognize the remarkable similarity between the scaling approaches used to characterize the magnetic-field-induced superconductor-insulator quantum phase transition [33], or the field-driven mit in 2d electron (hole) systems [34], and the one obtained for the mit observed in graphite. as we will see in the next sections, the experimental evidence obtained in recent years indicates that granular superconductivity does indeed exist within some of those interfaces. if superconducting patches exist embedded in parts of the interfaces or in other two-dimensional regions of the bulk ordered samples, one expects to measure some signs of granular superconductivity, such as nonlinear i−v curves or hysteresis in the magnetoresistance. however, this is not really observed in large bulk samples. there are at least two reasons for the apparent absence of these expected phenomena. one is the distribution of the input current between the ballistic channels given by the graphene layers [17, 18], the metallic, normal-conducting parts of the interfaces, and the regions where the superconducting patches exist. in other words, the usual maximum currents used in transport experiments reported in bulk samples may have been small enough that the current through the superconducting regions remained below the critical josephson one. the other reason is the experimental voltage sensitivity needed to measure the possible irreversibility in the magnetoresistance due to the existence of pinned vortices or fluxons. we will see in the next section that part of these problems can be overcome by decreasing the sample size; in this way, one obtains the voltage signals from the regions of interest with enough sensitivity. apart from the large magnetic field sensitivity of the metallic-like resistance measured in bulk graphite samples with interfaces, is there any further hint of the existence of granular superconductivity in those r(t) curves?
yes, this hint is related to the thermally activated function (∝ exp(−ea/kb t), with ea a sample-dependent effective thermal barrier ∼ 30 k) one needs in order to fit the metallic-like contribution below t ∼ 200 k [10]. this function is relevant in spite of the only factor-of-five increase of the resistance between low and high temperatures, see fig. 2. figure 2: (a) normalized resistance vs. temperature for three different hopg bulk samples. the bottom, metallic-like curve corresponds to sample hopg-3; the curves above correspond to hopg-1 (h = 0), hopg-1 (h = 1 koe), and hopg-2. the grade and resistivity values are hopg-1 (∆ = 1.4◦, resistivity at 300 k ρ(300 k) = 45 µωcm), hopg-2 (∆ = 1.2◦, ρ(300 k) = 135 µωcm) and hopg-3 (∆ = 0.5◦, ρ(300 k) = 5 µωcm). taken from [23]. (b) similar to (a) but for hopg samples from the same batch with different sizes, namely (thickness × length × width) l5: 12 ± 3 nm, 27 µm, 14 µm; l2a: 20 ± 5 nm, 5 µm, 10 µm; l8a: 13 ± 2 nm, 14 µm, 10 µm; l8b: 45 ± 5 nm, 3 µm, 3 µm; l7: 75 ± 5 nm, 17 µm, 17 µm; hopg: 17 ± 2 µm, 4.4 mm, 1.1 mm. taken from [7]. (c) and (d) resistance of bulk graphite samples vs. temperature at different applied fields normal to the graphene layers. the sample in (c) is a hopg bulk sample from union carbide of grade a and the sample in (d) is kish graphite. taken from [24]. skeptical readers can convince themselves of its relevance by taking a similar example, such as the exponential function used to fit the increase, by a similar factor, of the ultrasonic attenuation with temperature below tc in conventional superconductors. we note that this exponential function has already been used to describe the increasing resistance of bulk graphite samples with temperature, and it was speculated to be related to some superconducting-like behavior in graphite [32]. it is clear that this function is not the one usually expected for metals or semimetals, and that it cannot be understood within the usual electron-phonon interaction mechanisms, nor in two dimensions. a similar dependence has been observed in granular al-ge [35], which shows, for a particular al concentration, a superconductor-semiconductor transition similar to that reported in ref. [10] or, after an appropriate scaling in temperature, to some of the curves shown in fig. 2. the observed thermally activated behavior might be understood on the basis of the langer-ambegaokar-mccumber-halperin (lamh) model [36, 37] that applies to narrow superconducting channels in which thermal fluctuations can cause phase slips. this interpretation gets further support from the evidence we discuss in the following sections. iii. hysteresis in the magnetoresistance in order to reveal by transport measurements the existence of granular superconductivity in some regions of the graphite samples, we need to increase the sensitivity of the measured voltage to those regions. to achieve this, we decrease the size of the sample, enhancing in this way the probability of getting some measurable influence of this phenomenon on the voltage.
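as a brief aside before the results of ref. [38], the parallel-channel description of r(t) invoked in the previous section can be written down in a few lines; the functional forms and every number below are illustrative assumptions for a sketch, not the actual fits of ref. [10].

```python
import numpy as np

def r_total(T, ea=30.0, r1=1.0, r0=0.2, rs300=0.5, eg=600.0):
    """Toy resistance of a graphite sample modelled as two channels in
    parallel (all parameters are illustrative, temperatures in kelvin):
      - an interface channel with a thermally activated, metallic-like
        resistance r_int(T) = r0 + r1 * exp(-ea / T),
      - a semiconducting bulk channel r_semi(T) = rs300 * exp(eg / T),
        whose resistance grows as the temperature is lowered.
    Their parallel combination rises with T, passes through a broad
    maximum and then decreases, qualitatively as in fig. 2."""
    r_int = r0 + r1 * np.exp(-ea / T)
    r_semi = rs300 * np.exp(eg / T)
    return r_int * r_semi / (r_int + r_semi)

T = np.linspace(5.0, 300.0, 600)
R = r_total(T)
print("temperature of the resistance maximum:", round(T[np.argmax(R)]), "K")
print("overall variation R_max / R(5 K)     :", round(R.max() / R[0], 1))
```

the only point of the exercise is that a channel whose resistance vanishes exponentially at low temperature, in parallel with a semiconducting bulk, naturally produces a metallic-like rise, a broad maximum and a modest overall variation of r(t), which is the qualitative behavior that motivates the lamh-type interpretation mentioned above.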
iii. hysteresis in the magnetoresistance

in order to reveal by transport measurements the existence of granular superconductivity in some regions of the graphite samples, we need to increase the sensitivity of the measured voltage to those regions. to achieve this, we decrease the size of the sample, enhancing in this way the probability of getting some measurable influence of this phenomenon on the voltage.

the work in ref. [38] reported the first observations of an anomalous irreversible behavior in the magnetoresistance (mr) of a few tens of nm thick and several micrometer large multigraphene samples. hysteresis in the magnetoresistance is key evidence for the existence of either magnetic order (domains with their walls, for example) or vortices/fluxons and, therefore, for the existence of superconductivity. because defects as well as hydrogen can trigger magnetic order in graphite, a first attempt would be to relate the measured hysteresis in the mr to the existence of magnetic order and magnetic domains, for example. however, the data exhibited anomalous hysteresis loops in the mr [38], similar to those observed in granular superconductors with josephson-coupled grains [39–41]. the anomalous hysteresis was observed only for magnetic fields perpendicular to the planes, whereas in the direction parallel to the planes the mr remains negligible. this fact already points to a remarkably large anisotropic response of the superconducting phase(s), in agreement with the hypothesis that these superconducting regions might be embedded in some of the interfaces found inside some bulk graphite samples [7, 8]. the amplitude of the hysteresis in the mr reported in ref. [38] vanishes at temperatures t ∼ 10 k, clearly below the temperature at which the resistance shows a maximum, as is the case for sample hopg-1 in fig. 2(a) or sample l2a in fig. 2(b).

it is clear that thermal fluctuations can prevent the establishment of a coherent superconducting state in parts of the sample, and therefore a zero-resistance state is not so simple to achieve if the superconducting distribution is a mixture of superconducting patches at the interfaces embedded in a multigraphene semiconducting matrix. moreover, we should also take into account that the voltage electrodes are usually connected at the top surface of the graphite samples, picking up a voltage difference that comes from a non-negligible normal conducting path. one possibility to increase the sensitivity of the measured voltage to the field hysteresis these regions produce is to make a constriction midway between the two voltage electrodes, see inset in fig. 3(a). in this case, we expect a locally narrower distribution of superconducting and normal regions at the constriction, such that averaging effects should be less important. simultaneously, the main part of the voltage drop comes from the region at the constriction, see fig. 2(c) in ref. [16]. then, a higher sensitivity to the superconducting paths can be achieved in case they remain at or near the constriction. this idea has been successfully realized in ref. [42] and its main results will be reviewed in this section.

let us take two slightly different samples, 1 and 2, with r(t) curves as shown in fig. 3(a). the aim of the experiment is to study the hysteresis in the mr these samples might show below the temperature at which a maximum in the resistance is measured, in case that maximum is related to the josephson coupling between superconducting regions.
figure 3: (a) resistance vs. temperature, without constrictions and at zero applied field, for two graphite flakes of size (distance between voltage electrodes × width × thickness) for sample 1 (2): 13×16×0.015 (2.6×6×0.040) µm³. the observed temperature dependence remains for all constriction widths. the inset shows a scanning electron microscope picture of sample 1 with a constriction width of 4.3 µm between the two voltage electrodes. the scale bar is 5 µm. (b) magnetoresistance (mr) vs. applied magnetic field for sample 1 with a 4 µm constriction width and at 2 k. the input current was 1 µa. note the clear hysteresis in the mr when the field is swept from |hmax| = 1000 oe. the inset shows the difference ∆mr between the curve obtained starting from hmax = +1000 oe and the return curve measured from hmin = −1000 oe. (c) the absolute difference between the two mr curves of the hysteresis loop, obtained at a fixed magnetic field of 16.6 oe, for sample 1 without constriction and for two different constriction widths. the figure also shows the corresponding data for another graphite sample without constrictions, from [38]. (d) magnetoresistance measured from a starting maximum field of 1.4 koe at 10 k for sample 2 with a constriction width of 3 µm. some of the figures and the data were taken from [42].

figure 3(b) shows one example of the anomalous hysteresis loop in the mr. the decreasing-field curve (from high, positive to low, negative fields, red arrow), for example, runs below the increasing-field curve (green arrow) in the same quadrant in which the field sweep was started, showing a minimum at positive fields of the order of 20 oe; see also similar curves in ref. [38]. to present the anomalous behavior clearly, the inset in fig. 3(b) shows the difference between the two curves. this difference is in clear contrast to the usual hysteresis in superconductors as well as in ferromagnets [38, 39], where the minimum (or maximum) in the mr is observed in the opposite field quadrant and the increasing-field resistance curve is usually below the decreasing-field one. figure 3(c) shows the temperature dependence of the difference in the mr between the decreasing and increasing field curves at a fixed magnetic field for sample 1, without and with two constrictions. the results show that the smaller the constriction width, the higher the temperature at which the anomalous hysteresis is observed; it decreases below the sensitivity limit at t > 50 k for a constriction width of 4 µm, whereas the maximum in the r(t) curve is at ∼ 70 k, see fig. 3(a).
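the quantity plotted in the inset of fig. 3(b) and in fig. 3(c) is the difference between the two branches of the mr loop. a minimal sketch of how such a difference can be computed is given below; the synthetic sweep arrays stand in for measured data and are assumptions for illustration only.

```python
import numpy as np

# illustrative synthetic sweeps standing in for measured mr branches:
# r_down(h) taken while decreasing the field from +h_max, r_up(h) while
# increasing it from -h_max (shapes and values are made up).
h_down = np.linspace(200.0, -200.0, 401)             # oe, decreasing sweep
h_up = np.linspace(-200.0, 200.0, 401)               # oe, increasing sweep
r_down = 1.0 + 2e-5 * (h_down - 20.0) ** 2 / 400.0   # minimum shifted to +20 oe
r_up = 1.0 + 2e-5 * (h_up + 20.0) ** 2 / 400.0       # minimum shifted to -20 oe

# interpolate both branches onto a common, increasing field grid before
# subtracting; np.interp needs increasing abscissae, so reverse the down sweep.
h_grid = np.linspace(-200.0, 200.0, 401)
r_down_i = np.interp(h_grid, h_down[::-1], r_down[::-1])
r_up_i = np.interp(h_grid, h_up, r_up)

i0 = np.argmin(np.abs(h_grid))                       # index of h = 0
delta_mr = (r_down_i - r_up_i) / r_up_i[i0]          # normalized branch difference
```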
the absence of any hysteresis in the mr for sample 2 with a constriction width of 3 µm and at t = 10 k indicates that the hysteresis does not come from some artifact due to the focused ion beam method used [42, 43] or from an artifact in the measurement of the real field applied to the sample. as expected from the r(t) curve, see fig. 3(a), sample 2 shows the anomalous hysteresis in the mr at lower temperatures than sample 1 [42].

summarizing this section, the observation of the anomalous hysteresis in the mr – together with the mit and the relatively large mr at temperatures below the maximum in r(t) – already provides striking hints that granular superconductivity is at work in some regions of these samples. the increase of the temperature range in which the hysteresis is observed upon decreasing the constriction width demonstrates the problems of current averaging and of limited voltage sensitivity that usual experiments with large samples have.

iv. direct evidence for josephson behavior in the transport properties of graphite: measurements in tem lamellae

if the embedded interfaces (or some other quasi two dimensional regions) inside the measured graphite samples have superconducting properties, the best way to check them would be to contact electrodes as near as possible to those interfaces or interface regions and to study the behavior as a function of any useful parameter one can use to influence their response. it should be clear that one cannot simply open the graphite sample at the interface and put voltage electrodes at the open surfaces, simply because the interface would then no longer exist. a tentative approach to put the contacts as near as possible to an interface was followed in ref. [46]. indeed, the observed behavior at low temperatures and as a function of magnetic field appeared to be superconducting-like.

better and more compelling evidence for the superconducting behavior embedded in some ordered graphite samples can be obtained by trying to locate the voltage electrodes directly at the inner edges of the interfaces. in ref. [45], tem lamellae were prepared from bulk hopg samples and, using lithography and focused ion beam techniques, current and voltage electrodes were placed at different positions on the samples. in this way, one tries to contact several of those interface edges simultaneously, as shown in fig. 4(a). we note, however, that a thin surface layer of disordered graphite exists due to the ga+ ion irradiation used to cut the lamella from the bulk hopg sample. this layer has a much larger resistance than that of the graphene layers or of the interfaces [43]; therefore the input current goes through the lowest-resistance path and the voltage electrodes pick up the response of the graphite sample with its interfaces. one can see this by comparing first the r(t) curves obtained at large enough currents in the lamellae (fig. 4(c)) with those of graphite samples with top electrodes (fig. 2). the fact that a zero resistance state (with a minimum voltage noise of ±5 nv, depending on the sample) is obtained at low currents, with i−v characteristic curves that resemble those one expects for josephson-coupled grains, leaves little doubt about the origin of the obtained signals. in the tem picture of fig. 4(b), one can see the graphite single crystalline regions (different gray levels) oriented differently with respect to each other about the common c−axis and having well defined two dimensional interfaces, as high resolution tem studies revealed [47]. figure 4(c) shows the voltage vs. temperature measured in a tem lamella of oriented graphite [45] at different input dc currents, from 100 na to 10 µa.
the clear sharp transition, observed at ∼ 150 k at the lowest current, shifts to lower temperatures upon increasing the input dc current. for the largest input currents, the temperature dependence of the resistance of the contacted lamella shows a maximum or follows the intrinsic semiconducting behavior of the graphene layers. this behavior already suggests the existence of high temperature granular superconductivity in some parts of the sample.

figure 4: (a) scanning electron microscopy (sem) image of a lamella of 300 nm thickness on a si/sin substrate, where the yellowish colored areas are the electrodes. a four-point configuration has been prepared, with the outer electrodes used to apply current and the inner ones to measure the voltage drop. the c−axis runs parallel to the substrate surface and normal to the current direction. (b) transmission electron microscopy (tem) image of a hopg lamella. the different brightness corresponds to a different orientation, within the a−b plane, of the crystalline regions with thickness > 30 nm. (c) voltage vs. temperature at different input currents for a lamella of ∼ 800 nm thickness and with van der pauw contact configuration. (d) current-voltage characteristics at different temperatures for a lamella of ∼ 300 nm thickness in reduced coordinates, where r is the normal state resistance, i the input current, and ic the critical josephson current. the continuous curves are fits to the model proposed in ref. [44] with ic(t) as the only free parameter. figures taken from [45].

the study reported in ref. [45] shows that the transition temperature depends on the prepared sample. this indicates a sample dependent distribution of the superconducting regions and/or some influence of the preparation process or sample size on the superconductivity [47]. we also note that the observed sharp decrease in the measured voltage does not necessarily indicate the critical temperature of the superconducting regions but rather the temperature below which a percolative granular system shows negligible resistance, due to the josephson coupling, at the used input current. current-voltage characteristic curves at different temperatures and in different lamellae obtained from different hopg samples have been studied in ref. [45]. an example of these i−v curves at three temperatures, obtained for a different lamella, is shown in fig. 4(d). the curves follow the expected dependence for a josephson junction [44] with a temperature dependent critical current, the only free parameter in the fit.
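the fits in fig. 4(d) use the thermal-noise model of ref. [44] (ambegaokar-halperin) with ic(t) as the only free parameter. as a hedged illustration of the qualitative shape such fits capture, the sketch below evaluates the much simpler noise-free limit of the resistively shunted junction (rsj) model; this is explicitly not the model used in refs. [44, 45], and the ic and rn values are placeholders.

```python
import numpy as np

def v_rsj(i, ic, rn):
    """time-averaged dc voltage of an overdamped josephson junction in the
    noise-free rsj limit: v = 0 for |i| <= ic and v = rn*sqrt(i^2 - ic^2)
    above; the thermal-noise model of ref. [44] rounds this curve near ic."""
    i = np.asarray(i, dtype=float)
    v = np.zeros_like(i)
    above = np.abs(i) > ic
    v[above] = np.sign(i[above]) * rn * np.sqrt(i[above] ** 2 - ic ** 2)
    return v

currents = np.linspace(-3e-6, 3e-6, 601)      # a (placeholder range)
voltage = v_rsj(currents, ic=1e-6, rn=50.0)   # ic and rn are illustrative values
# in reduced coordinates, v/(rn*ic) vs. i/ic, curves at different temperatures
# collapse when a temperature-dependent ic(t) is the single fit parameter,
# as done for fig. 4(d).
```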
further evidence that speaks for a superconducting origin of the i−v curves is given by the expected detrimental effect of a magnetic field on the superconducting state. this effect can be due to orbital depairing or, at much higher fields, to the alignment of the electron spins in case of singlet coupling. the effect of a magnetic field applied normal and parallel to the interfaces has been studied in detail for thick and thin lamellae in ref. [45]. depending on the sample size (thickness, i.e., width of the graphene planes inside the lamella), the observed effects range from the usual vanishing of the zero resistance state to no effect at all for thin lamellae.

a magnetic field of a few koe applied normal to the interfaces is enough to destroy the josephson coupling at low temperatures, an effect compatible with the mit observed in several graphite samples, see section ii, whereas a field applied parallel to the interfaces does not influence the i−v curves at all, a fact that speaks for the two dimensionality of the superconducting regions. nevertheless, the influence of a magnetic field in hopg samples with interfaces is not as “simple” as in conventional superconductors. for high fields applied normal to the interfaces, the i−v curves show a recovery of the zero resistance state. this observed reentrance appears to be related to the magnetic-field driven reentrance observed at low temperatures in the longitudinal resistance at high enough magnetic fields [48]. this interesting behavior, as well as the insensitivity of the i−v curves to magnetic fields in very thin lamellae [45], deserves further studies. we would like to note here that the possible effects of a magnetic field on the superconducting state of quasi two-dimensional superconductors, or in case the coupling does not correspond to a singlet state, are not as clear as in conventional superconductors. for example, results in two different two-dimensional superconductors, including one produced at the interfaces between non superconducting regions [49], show that superconductivity can even be enhanced by a parallel magnetic field. in case the pairing is p−type [50], the influence of a magnetic field is expected to be qualitatively different from the conventional, singlet coupling behavior [51, 52], with even an enhancement of the superconducting state at intermediate fields. in case the london penetration depth is much larger than the size of the superconducting regions at the interfaces of our lamellae, or if the superconducting coherence length is of the order of or larger than the thickness of the lamella, the influence of a magnetic field should be less detrimental.

through these studies, and taking into account that in samples without these interfaces no signature of superconducting or metallic-like behavior has been observed (see also section v), it is appealing to suggest that superconductivity is hidden somewhere at some of those interfaces or interface regions. it should also be clear that not all those interfaces have superconducting regions with similar critical parameters. those interfaces are formed during the preparation of the hopg samples, which is based on treatments at very high temperatures (t > 3400 ◦c) and high pressures (p ∼ 10 kg/cm²), and in a non-systematic way [8]. actually, they are not at all an aim of the production but rather the opposite: they should be avoided in order to enhance the crystal perfection of the bulk hopg material. it is even possible that, depending on the procedure used to control the structure and texture of the graphite sample, the near surface region, for example, can have a different degree of graphitization than the inside of the bulk hopg sample [8]. this means that one may obtain different results from different parts of the same hopg sample. therefore, disconcerting situations and an apparent lack of reproducibility are pre-programmed if the studies are done without taking care of the internal microstructure of the samples.

v. magnetization measurements

in this section, we present and discuss magnetization measurements done in bulk hopg samples
with and without embedded interfaces, and in water treated graphite powders. one of the main problems in interpreting magnetization data for fields applied parallel to the c−axis of the graphite structure, i.e., normal to the graphene layers and interfaces, is the need to subtract a large diamagnetic background. due to the small amplitude of the superconducting-like signals in the studied samples, the subtraction of this linear-in-field background is not so simple, because it is not known with enough certainty to obtain the true field hysteresis after its subtraction. this means that we always have a certain arbitrariness in the shape of the obtained field hysteresis, a situation that will improve with an increase of the amount of material responsible for those superconducting-like signals. the small squid signals of interest imply that one should make additional efforts to minimize or rule out possible squid artifacts [53–55]. therefore, systematic studies of samples of different or equal geometry and magnetic background, with and without interfaces, are necessary. taking into account the overall shape of the hysteresis, the slope of the virgin curve at low fields (where the subtraction does not matter too much), and the overall experience with ferromagnetic graphite [22, 56], one can rely to a certain extent on the obtained hysteresis shape. certainly, not only the field hysteresis but also other evidence one gets from magnetization measurements, e.g., the remanence at zero field as a function of the maximal applied field (see for example measurements for yba2cu3o7 in ref. [57]) and the hysteresis in temperature dependent measurements, helps to convince oneself of the existence of some kind of granular superconductivity. the hysteresis between the field cooled (fc) and zero-field cooled (zfc) curves can help to discern between superconducting and ferromagnetic-like behavior.

the most obvious evidence that speaks against a simple ferromagnetic origin of the hysteresis observed as a function of temperature and field is the two dimensionality of the obtained hysteretic signals [29], i.e., the superconducting-like signals are mainly measured for fields normal to the interfaces. this fact is not compatible with any kind of magnetic order, including shape or magnetocrystalline anisotropy, however large these might be. we note that the ferromagnetic response of graphite due to dim is mostly measured for fields parallel to the graphene layers, i.e., parallel to the main area of the samples [22].

figure 5: (a) magnetization of two hopg bulk samples (hopg-1 and hopg-2), after subtraction of a diamagnetic background, and of water treated graphite powder (wtgp, right y−axis) at 300 k. the hopg-2 sample shows no hysteresis, in contrast to the other two samples. (b) temperature dependence of the difference between fc and zfc magnetic moments of the hopg-1 sample before (b.a.) and after (a.a.) warming the sample up to ≃ 600 k, at two constant applied fields, 0.5 t (left y−axis) and 4 t (right y−axis). the field was always applied normal to the interfaces or graphene planes of the samples. data taken from [29].
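a minimal sketch of the background subtraction discussed above: a straight line is fitted to the measured loop and removed to expose the small hysteretic signal. the synthetic data and the choice of fitting range are our assumptions.

```python
import numpy as np

# synthetic example standing in for a measured loop m(h) at fixed temperature:
# a dominant linear diamagnetic background plus a small hysteretic signal
# (all numbers are made up for illustration).
h = np.concatenate([np.linspace(-500.0, 500.0, 201),
                    np.linspace(500.0, -500.0, 201)])       # oe, up + down sweep
branch = np.concatenate([np.full(201, -1.0), np.full(201, 1.0)])
m = -3e-6 * h + 2e-4 * np.tanh(h / 100.0) + 5e-5 * branch    # emu/g

# estimate the linear diamagnetic background with a least-squares line over the
# whole loop (one possible choice; fitting only the high-field region is another).
slope, offset = np.polyfit(h, m, 1)
m_sub = m - (slope * h + offset)

# m_sub now contains the small hysteresis riding on the diamagnetic response;
# the residual uncertainty in 'slope' is the arbitrariness discussed in the text.
```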
i. bulk graphite samples

figure 5(a) shows the field hysteresis, after subtraction of the corresponding diamagnetic linear background, at 300 k of two bulk hopg samples, hopg-1 and hopg-2, and of a water treated graphite powder (wtgp, right y−axis). a tem characterization of the internal microstructure of the hopg-1 sample shows clear evidence for well defined interfaces running parallel to the graphene layers, in contrast to the hopg-2 sample [29], see fig. 1(b). these results clearly indicate that the origin of the hysteresis is related to the existence of the interfaces in the hopg-1 sample. the absence of the hysteresis in the hopg-2 sample, which has a similar diamagnetic background and overall geometry to the hopg-1 sample, also indicates that the hysteresis is not due to an obvious squid artifact or to an artifact in the background subtraction. the field hysteresis is similar to that of the wtgp. the narrowing of the hysteresis observed at high fields is expected for granular superconductors [58, 60–62]. from the hysteresis, as well as from measurements of the remanent magnetic moment as a function of the applied field [29], one obtains the characteristic josephson critical fields h^J_{c1}(t) and h^J_{c2}(t), with values similar to those of the wtgp [58] and a similar ratio h^J_{c2}(t)/h^J_{c1}(t) ∼ 3 [29].

figure 5(b) shows the magnetic moment hysteresis in temperature (fc minus zfc curve) for the hopg-1 sample as received (b.a.) and after sweeping the temperature up to 500 k (a.a.) [29], at two applied fields. we would like to stress the following features. the hysteresis for the as-received sample starts from the turning point (390 k) and it is positive. the hysteresis in temperature is qualitatively similar at both applied fields, showing a crossing to negative values at low temperatures. larger zfc values (smaller in absolute value) than fc ones in the magnetic moment are usually observed neither in superconductors nor in ferromagnets, nor does this appear to be a squid artifact [29]. this negative hysteresis in temperature would suggest that the superconducting properties can be enhanced to some extent under a magnetic field, an effect that might be related to the reentrance we briefly mentioned in section iv. a slight annealing of the hopg-1 sample, of less than one hour at ∼ 500 k, drastically changes the observed hysteresis for both fields (open symbols in fig. 5(b)). the hysteresis appears to be shifted to lower temperatures, but with negative values at high temperatures and high fields. we note that annealing at similar temperatures for several hours produced a decrease of the overall hysteresis observed in wtgp (see supporting information of ref. [58]). at this stage of the research, it is unclear whether the pinning properties of vortices and/or fluxons, or the existence of different superconducting phases, play the main role in the hysteresis observed for fields applied normal to the interfaces.

figure 6: (a) field hysteresis at 5 k for a maximum applied field of 40 mt for the water treated graphite powder (s1), the same powder after pressing it into a pellet with a pressure of 18 ± 5 mpa (s2) and after pressing it again with a pressure of 60 ± 20 mpa (s3). the corresponding diamagnetic linear backgrounds were subtracted from the measured data.
(b) difference between the fc and zfc curves at different applied fields for a water treated graphite powder. data taken from ref. [58].

ii. water treated graphite powder

the work of ref. [58] reports on the magnetic response of wtgps. the main message of that work is that the wtgp shows a hysteretic behavior in field and temperature compatible with granular superconductivity. as an example, we show in fig. 6(a) the field hysteresis at 5 k of a wtgp (s1, loose powder without applying significant pressure) and of the same wtgp after pressing it into a pellet with two different pressures (s2, s3). after the diamagnetic background subtraction, the field hysteresis is similar to that obtained for the bulk hopg sample with interfaces, see fig. 5(a) for similar data but at 300 k. the fact that the hysteresis vanishes after applying pressure to the powder rules out simple squid artifacts (the diamagnetic background does not diminish after making a pellet from the graphite powder, rather the contrary) and it also rules out that the hysteresis is due to a ferromagnetic response of impurities.

figure 6(b) shows the difference in the magnetic moment between the zfc and fc curves, as in fig. 5(b). the behavior of this difference as a function of the applied field appears to be compatible with the one expected for granular superconductors [58]. note the following features. the hysteresis increases at all t for fields µ0h ≲ 50 mt, showing a maximum near the turning point of 300 k, similar to the hopg-1 sample in the as-received state, see fig. 5(b). at fields 0.1 t ≲ µ0h ≲ 0.2 t, the difference decreases at all t and remains rather field independent. at higher fields, however, it increases, showing a shift of the crossing point (from negative to positive values) to higher t. this behavior is at odds with the one expected for ferromagnets, even for ferromagnetic nanoparticles [63], as well as for superconductors with a pinning force that decreases with applied field in the shown field range. from the results in [58], and using basic concepts of vortex pinning, we would then conclude that if an upper critical field exists, it should be clearly larger than 7 t in the temperature range of the figure. in spite of some interesting differences between the behavior obtained for bulk hopg and for wtgp, the similarities already suggest that the water treatment helps to produce a certain amount of interfaces between graphite grains, these being the origin of the whole hysteresis. thermal annealing as well as pressing the wtgp are detrimental, indicating that defects and/or hydrogen or oxygen at the interfaces could play an important role in the observed phenomena.

vi. discussion

superconductivity in carbon-based systems is a rather old, well recognized fact. this phenomenon was probably first observed in the potassium intercalated graphite c8k [64] back in 1965. since then, a considerable number of studies have reported this phenomenon in carbon-based systems, reaching critical temperatures tc ∼ 10 k in intercalated graphite [65, 66] and above 30 k (though not percolative) in some hopg samples [59], as well as in doped graphite and amorphous carbon systems [67–70]. traces of superconductivity at tc = 65 k have recently been reported in amorphous carbon powder that contained a small amount of sulfur [71].
superconductivity was also found in carbon nanotubes, with tc = 0.55 k [72] and 12 k [73], or possibly even higher critical temperatures [74, 75]. superconductivity with tc ∼ 4 k in boron-doped diamond [76] and in diamond films with tc ∼ 7 k [77] also belongs to the recently published list of carbon-based superconductors. we should note, however, that superconductivity at room temperature in a disordered graphite powder had already been reported in 1974 [78], see also [79], a work that did not attract the necessary attention in the community. a role of quasi two dimensional interfaces in the above mentioned carbon-based superconductors can probably be ruled out only for the intercalated graphite and doped diamond compounds, where the three dimensional superconductivity is characterized by a relatively low critical temperature. we may speculate that the traces of superconductivity found in doped amorphous carbon and in disordered or ordered graphite powders may be related to some interfaces between well ordered graphite regions. the experience with the high temperature superconducting oxides already suggests that two dimensionality is advantageous for achieving higher critical temperatures.

apart from the usual transport and magnetization measurements used to characterize the superconducting state, there are scanning tunneling spectroscopy (sts) results obtained on certain disordered regions of a hopg surface at t = 4.2 k that revealed an apparent energy gap of ∼ 100 mev [80]. although the overall curves resemble a superconducting-like density of states, the authors suggested that the gap originates from charging effects. see further sts results and the discussion in [70].

theoretical works that deal with superconductivity in graphite as well as in graphene have been published in recent years. for example, p-type superconductivity has been predicted to occur in inhomogeneous regions of the graphite structure [50], as well as d−wave high-tc superconductivity [81] based also on resonating valence bonds [82], or superconductivity at the graphite surface region with rhombohedral stacking due to a topologically protected flat band [83]. for the graphite structure, the experimental evidence obtained in the last years suggests that high temperature superconductivity exists at certain interfaces or interface regions within the usual bernal structure, although the structure of the superconducting regions remains unknown. one can further speculate that, due to the high carrier concentration that can be localized at those interfaces, they should be predestined to play a role in triggering superconductivity. following a bcs approach in two dimensions (with anisotropy), for example, a critical temperature tc ∼ 60 k has been estimated if the density of conduction electrons per graphene plane increases to n ∼ 10¹⁴ cm⁻², a density that might be induced by defects and/or hydrogen adatoms at the interfaces [84], or by li deposition [85]. further predictions for superconductivity in graphene support the premise that n > 10¹³ cm⁻² is needed in order to reach tc > 1 k [86, 87]. on the other hand, the possibility of having high temperature superconductivity at the surface of, or in, the rhombohedral graphite phase [21, 83] – a phase that is sometimes found in graphite samples [19, 20] – stimulates further careful studies of these hidden interfaces.
in recent years, superconductivity has been found at the interfaces between oxide insulators [88] as well as between metallic and insulating copper oxides, with tc ≳ 50 k [89]. also, interfaces in different bi bicrystals show superconductivity up to 21 k, although bulk bi is not a superconductor [90, 91]. finally, we think that some of the interfaces are also the origin of the metallic-like behavior of graphite samples as well as of the quantum hall effect (qhe) found in some hopg samples [48, 92]. because the existence and density, as well as the intrinsic properties, of these interfaces depend on the sample, we can now understand why the reproducibility of the qhe in bulk hopg samples is rather poor.

vii. conclusion

in this review, we have discussed the following experimental evidence: firstly, the temperature and magnetic field dependence of the electrical resistance of bulk and thin film graphite samples and its relation to the existence of two dimensional interfaces; secondly, the josephson behavior of the current-voltage curves, with an apparent zero resistance state at high temperatures, in especially made tem lamellae; thirdly, the anomalous hysteresis in the magnetoresistance observed in thin graphite samples, as well as its enhancement upon restricting the current path within the sample; finally, the overall magnetization of bulk graphite samples, with and without interfaces, as well as of water treated graphite powders. all this experimental evidence as a whole indicates the existence of superconductivity located at certain interfaces inside graphite samples. although we cannot rule out other interpretations for some of the observations discussed in this work, the whole body of evidence suggests that superconductivity should be the origin of all the phenomena discussed here. clearly, the situation is still highly unsatisfactory because several open questions remain, namely the characteristics of the superconducting phase(s), from their structure to the main superconducting parameters, as “simple” as the critical temperature and critical fields, the coherence and penetration lengths, etc. it is clear that further studies are necessary in the future, but the overall work done until now shows us the way to go.

acknowledgements

the author acknowledges the support provided by the deutsche forschungsgemeinschaft under contract dfg es 86/16-1 and by the esf-nano under the graduate school of natural sciences “buildmona”. the results presented in this review were part of the ph.d. theses of heiko kempa (section ii), srujana dusari (section iii) and ana ballestar (section iv), as well as of the master thesis of thomas scheike (section v), done in the division of superconductivity and magnetism of the institute for experimental physics ii of the university of leipzig. the author thanks dipl.-krist. annette setzer, dr. josé barzola-quiquia and dr. winfried böhlmann for their experimental assistance and support. the permanent support as well as the discussions with nicolás garcía are gratefully acknowledged. special thanks go to yakov kopelevich, with whom we started, in the year 1999 and in a rather naive way, the research of a new and unexpected world behind graphite.

note added in proof

since the submission of this manuscript, some new works related to the subject of this review were published. tight-binding simulations done in ref. [93] support the work done in ref.
[21] and found that surface superconductivity is robust for abc stacked multilayer graphene, even at very low pairing potentials. through the observation of persistent currents in a graphite filled ring-shaped container immersed in alkanes, the author in ref. [94] claimed possible room temperature superconductivity. for completeness, we include them in the reference list. [1] j w mcclure, energy band structure of graphite, ibm j. res. dev. 8, 255 (1964). [2] b t kelly, physics of graphite, applied science publishers, london (1981). [3] a grüneis, c attaccalite, t pichler, v zabolotnyy, h shiozawa, s l molodtsov, d inosov, a koitzsch, m knupfer, j schiessling, r follath, r weber, p rudolf, r wirtz, a rubio, electron-electron correlation in graphite: a combined angle-resolved photoemission and first-principles study, phys. rev. lett. 100, 037601 (2008). [4] i m tsidilkovski, electron spectrum of gapless semiconductors. springer series in solid-state sciences vol. 116, springer verlag (1997). [5] r o dillon, i l spain, j w mcclure, electronic energy band parameters of graphite and their dependence on pressure, temperature and acceptor concentration, j. phys. chem. solids 38, 635 (1977). [6] j m schneider, m orlita, m potemski, d k maude, consistent interpretation of the lowtemperature magnetotransport in graphite using the slonczewski-weiss-mcclure 3d bandstructure calculations, phys. rev. lett. 102, 166403 (2009). [7] j barzola-quiquia, j l yao, p rödiger, k schindler, p esquinazi, sample size effects on the transport properties of mesoscopic graphite samples, physica status solidi a 205, 2924 (2008). [8] m inagaki, new carbons: control of structure and functions, elsevier (2000). [9] a arndt, d spoddig, p esquinazi, j barzolaquiquia, s dusari, t butz, electric carrier concentration in graphite: dependence of electrical resistivity and magnetoresistance on defect concentration, phys. rev. b 80, 195402 (2009). [10] n garćıa, p esquinazi, j barzola-quiquia, s dusari, evidence for semiconducting behavior with a narrow band gap of bernal graphite, new j. phys. 14, 053015 (2012). [11] y ohashi, k yamamoto, t kubo, shubnikov de haas effect of very thin graphite crystals, in: carbon’01, an international conference on carbon, pag. 568, the american carbon society, lexington, ky, united states, (2001). [12] b c camargo, y kopelevich, s b hubbard, a usher, w böhlmann, p esquinazi, effect of structural disorder on the quantum oscillations in graphite (unpublished). in this work the authors show that in certain hopg samples (spi) of high grade, the density of interfaces is much lower than in, for example, advanced ceramics hopg zya samples. in this new hopg samples basically no sdh oscillations are found and the temperature dependence of the resistance shows a semiconducting behavior with saturation a low temperatures (2013). [13] m orlita, c faugeras, g martinez, d k maude, m l sadowski, j m schneider, m potemski, magneto-transmission as a probe of dirac fermions in bulk graphite, j. phys: cond. mat. 20, 454223 (2008). 050007-15 papers in physics, vol. 5, art. 050007 (2013) / p. esquinazi [14] n a goncharuk, l nádvorńık, c faugeras, m orlita, l smrčka, infrared magnetospectroscopy of graphite in tilted fields, phys. rev. b 86, 155409 (2012). [15] j c gonzález, m muñoz, n garćıa, j barzolaquiquia, d spoddig, k schindler, p esquinazi, sample-size effects in the magnetoresistance of graphite, phys. rev. lett. 99, 216601 (2007). 
[16] n garćıa, p esquinazi, j barzola-quiquia, b ming, d spoddig, transition from ohmic to ballistic transport in oriented graphite: measurements and numerical simulations, phys. rev. b 78, 035413 (2008). [17] s dusari, j barzola-quiquia, p esquinazi, n garćıa, ballistic transport at room temperature in micrometer-size graphite flakes, phys. rev. b 83, 125402 (2011). [18] p esquinazi, j barzola-quiquia, s dusari, n garćıa, length dependence of the resistance in graphite: influence of ballistic transport, j. appl. phys. 111, 033709 (2012). [19] q lin, t li, z liu, y song, l he, z hu, q guo, h ye, high-resolution tem observations of isolated rhombohedral crystallites in graphite blocks, carbon 50, 2369 (2012). [20] c h lui, z li, z chen, p v klimov, l e brus, t f heinz, imaging stacking order in few-layer graphene, nano lett. 11, 164 (2011). [21] n b kopnin, m ijäs, a harju, t t heikkilä, high-temperature surface superconductivity in rhombohedral graphite, phys. rev. b 87, 140503 (2013). [22] p esquinazi, j barzola-quiquia, d spemann, m rothermel, h ohldag, n garćıa, a setzer, t butz, magnetic order in graphite: experimental evidence, intrinsic and extrinsic difficulties, j. magn. magn. mat. 322, 1156 (2010). [23] h kempa, y kopelevich, f mrowka, a setzer, j h s torres, r höhne, p esquinazi, magnetic field driven superconductor-insulatortype transition in graphite, solid state commun. 115, 539 (2000). [24] y kopelevich, p esquinazi, j h s torres, r r da silva, h kempa, graphite as a highly correlated electron liquid, in: advances in solid state physics, vol. 43, ed. b kramer, pag. 207, springer-verlag, berlin (2003). [25] t tokumoto, e jobiliong, e choi, y oshima, j brooks, electric and thermoelectric transport probes of metal-insulator and two-band magnetotransport behavior in graphite, solid state commun. 129, 599 (2004). [26] x du, s w tsai, d l maslov, a f hebard, metal-insulator-like behavior in semimetallic bismuth and graphite, phys. rev. lett. 94, 166601 (2005). [27] y zhang, j p small, w v pontius, p kim, fabrication and electric-field-dependent transport measurements of mesoscopic graphite devices, appl. phys. lett. 86, 073104 (2005). [28] see several reviews in: graphite and precursors, world of carbon series, vol. 1, ed. p delhaes, gordon and breach science publishers (2001). [29] t scheike, p esquinazi, a setzer, w böhlmann, granular superconductivity at room temperature in bulk highly oriented pyrolytic graphite samples, carbon 59, 140 (2013). [30] y kopelevich, v lemanov, s moehlecke, j torres, landau level quantization and possible superconducting instabilities in highly oriented pyrolitic graphite, phys. solid state 41, 1959 (1999). [31] h kempa, h c semmelhack, p esquinazi, y kopelevich, absence of metal-insulator transition and coherent interlayer transport in oriented graphite in parallel magnetic fields, solid state commun. 125, 1 (2003). [32] y kopelevich, p esquinazi, j torres, r da silva, h kempa, f mrowka, r ocaña, metal-insulator-metal transitions, superconductivity and magnetism in graphite, in: studies of high temperature superconductors, vol. 45, chap. 3, pag. 59, nova science publishers inc. (2003). 050007-16 papers in physics, vol. 5, art. 050007 (2013) / p. esquinazi [33] m p a fisher, quantum phase transitions in disordered two-dimensional superconductors, phys. rev. lett. 65, 923 (2000). [34] e abrahams, s v kravchenko, m p sarachik, metallic behavior and related phenomena in two dimensions, rev. mod. phys. 73, 251 (2001). 
[35] y shapira, g deutscher, semiconductorsuperconductor transition in granular al-ge, phys. rev. b 27, 4463 (1983). [36] j s langer, v ambegaokar, intrinsic resistive transition in narrow superconducting channels, phys. rev. 164, 498 (1967). [37] d e mccumber, b i halperin, time scale of intrinsic resistive fluctuations in thin superconducting wires, phys. rev. b 1, 1054 (1970). [38] p esquinazi, n garćıa, j barzola-quiquia, p rödiger, k schindler, j l yao, m ziese, indications for intrinsic superconductivity in highly oriented pyrolytic graphite, phys. rev. b 78, 134516 (2008). [39] l ji, m s rzchowski, n anand, m thinkam, magnetic-field-dependent surface resistance and two-level critical-state model for granular superconductors, phys. rev. b 47, 470 (1993). [40] y kopelevich, c dos santos, s moehlecke, a machado, current-induced superconductorinsulator transition in granular high-tc superconductors, arxiv:0108311 (2001). [41] i felner, e galstyan, b lorenz, d cao, y s wang, y y xue, c w chu, magnetoresistance hysteresis and critical current density in granular rusr2gd2−xcexcu2o10−δ, phys. rev. b 67, 134506 (2003). [42] s dusari, j barzola-quiquia, p esquinazi, superconducting behavior of interfaces in graphite: transport measurements of microconstrictions, j. supercond. nov. magn. 24, 401 (2011). [43] j barzola-quiquia, s dusari, g bridoux, f bern, a molle, p esquinazi, the influence of ga+ irradiation on the transport properties of mesoscopic conducting thin films, nanotech. 21, 145306 (2010). [44] v ambegaokar, b i halperin, voltage due to thermal noise in the dc josephson effect, phys. rev. lett. 22, 1364 (1969). [45] a ballestar, j barzola-quiquia, t scheike, p esquinazi, evidence of josephson-coupled superconducting regions at the interfaces of highly oriented pyrolytic graphite, new j. phys. 15, 023024 (2013). [46] j barzola-quiquia, p esquinazi, ferromagneticand superconducting-like behavior of the electrical resistance of an inhomogeneous graphite flake, j. supercond. nov. magn. 23, 451 (2010). [47] a ballestar, p esquinazi, highly oriented pyrolytic graphite tem lamellae preparation to study transport properties of the internal interfaces, j. visual. exp. (in press). [48] y kopelevich, j h s torres, r r da silva, f mrowka, h kempa, p esquinazi, reentrant metallic behavior of graphite in the quantum limit, phys. rev. lett. 90, 156402 (2003). [49] h j gardner, a kumar, l yu, p xiong, m p warusawithana, l wang, o vafek, d g schlom, enhancement of superconductivity by a parallel magnetic field in two-dimensional superconductors, nat. phys. 7, 895 (2011). [50] j gonzález, f guinea, m a h vozmediano, electron-electron interactions in graphene sheets, phys. rev. b 63, 134421 (2001). [51] k scharnberg, r a klemm, p-wave superconductors in magnetic fields, phys. rev. b 22, 5233 (1980). [52] a knigavko, b rosenstein, spontaneous vortex state and ferromagnetic behavior of typeii p-wave superconductors, phys. rev. b 58, 9354 (1998). [53] n casan-pastor, p gomez-romero, l c baker, magnetic measurements with a squid magnetometer: possible artifacts induced by sample holder off centering, j. appl. phys. 69, 5088 (1991). [54] a ney, t kammermeier, v ney, k ollefs, s ye, limitations of measuring small magnetic 050007-17 papers in physics, vol. 5, art. 050007 (2013) / p. esquinazi signals of samples deposited on a diamagnetic substrate, j. magn. magn. mater. 320, 3341 (2008). [55] m sawicki, w stefanowicz, a ney, sensitive squid magnetometry for studying nanomagnetism, semicond. sci. tech. 
26, 064006 (2011). [56] j barzola-quiquia, w böhlmann, p esquinazi, a schadewitz, a ballestar, s dusari, l schultze-nobre, b kersting, enhancement of the ferromagnetic order of graphite after sulphuric acid treatment, appl. phys. lett. 98, 192511 (2011). [57] m w mcelfresh, y yeshurun, a p malozemoff, f holtzberg, remanent magnetization, lower critical fields and surface barriers in an yba2cu3o7 crystal, physica a 168, 308 (1990). [58] t scheike, w böhlmann, p esquinazi, j barzola-quiquia, a ballestar, a setzer, can doping graphite trigger room temperature superconductivity? evidence for granular hightemperature superconductivity in water-treated graphite powder, adv. mater. 24, 5826 (2012). [59] y kopelevich, p esquinazi, j torres, s moehlecke, ferromagneticand superconducting-like behavior of graphite, j. low temp. phys. 119, 691 (2000). [60] s senoussi, c aguillon, s hadjoudj, the contribution of the intergrain currents to the low field hysteresis cycle of granular superconductors and the connection with the microand macrostructures, physica c 175, 215 (1991). [61] m borik, m chernikov, v veselago, v stepankin, anomalies of the magnetic properties of granular oxide superconductor bapbl−xbixo3, j. low temp. phys. 85, 283 (1991). [62] b andrzejewski, e guilmeau, c simon, modelling of the magnetic behaviour of random granular superconductors by the single junction model, supercond. sci. tech. 14, 904 (2001). [63] r prozorov, y yeshurun, t prozorov, a gedanken, magnetic irreversibility and relaxation in assembly of ferromagnetic nanoparticles, phys. rev. b 59, 6956 (1999). [64] n b hannay, t h geballe, b t matthias, k andres, p schmidt, d macnair, superconductivity in graphitic compounds, phys. rev. lett. 14, 225 (1965). [65] t e weller, m ellerby, s s siddharth, r p smith, t skippe, superconductivity in the intercaled graphite compounds c6yb and c6ca, nat. phys. 1, 39 (2005). [66] n emery, c hérold, m d’astuto, v garcia, c bellin, j f marêché, p lagrange, g loupias, superconductivity of bulk cac6, phys. rev. lett. 95, 035413 (2005). [67] r r da silva, j h s torres, y kopelevich, indication of superconductivity at 35 k in graphitesulfur composites, phys. rev. lett. 87, 147001 (2001). [68] y kopelevich, r r da silva, j h s torres, s moehlecke, m b maple, high-temperature local superconductivity in graphite-sulfur composites, physica c 408, 77 (2004). [69] i felner, y kopelevich, magnetization measurement of a possible high-temperature superconducting state in amorphous carbon doped with sulfur, phys. rev. b 79, 233409 (2009). [70] y kopelevich, p esquinazi, ferromagnetism and superconductivity in carbon-based systems, j. low temp. phys. 146, 629 (2007). [71] i felner, o wolf, o millo, high-temperature superconductivity in sulfur-doped amorphous carbon systems, j. supercond. nov. magn. 25, 7 (2012). [72] m kociak, a y kasumov, s guéron, b reulet, i i khodos, y b gorbatov, v t volkov, l vaccarini, h bouchiat, superconductivity in ropes of single-walled carbon nanotubes, phys. rev. lett. 86, 2416 (2001). [73] i takesue, j haruyama, n kobayashi, s chiashi, s maruyama, t sugai, h shinohara, superconductivity in entirely end-bonded multiwalled carbon nanotubes, phys. rev. lett. 96, 057001 (2006). 050007-18 papers in physics, vol. 5, art. 050007 (2013) / p. esquinazi [74] z k tang, l zhang, n wang, x x zhang, g h wen, g d li, j n wang, c t chan, p sheng, superconductivity in 4 angstrom single-walled carbon nanotubes, science 292, 2462 (2001). 
[75] g m zhao, y s wang, possible superconductivity above 400 k in carbon-based multiwall nanotubes, arxiv:0111268 (2001). [76] e a ekimov, v a sidorov, e d bauer, n n mel’nik, n j curro, j d thompson, s m stishov, superconductivity in diamond, nature 428, 542 (2004). [77] y takano, m nagao, i sakaguchi, m tachiki, t hatano, k kobayashi, h umezawa, h kawarada, superconductivity in diamond thin films well above liquid helium temperature, appl. phys. lett. 85, 2851 (2004). [78] k antonowicz, possible superconductivity at room temperature, nature 247, 358 (1974). [79] k antonowicz, the effect of microwaves on dc current in an al-carbon-al sandwich, physica status solidi a 28, 497 (1975). [80] n agrait, j rodrigo, s vieira, on the transition from tunneling regime to point-contact: graphite, ultramicroscopy 42–44, part 1, 177 (1992). [81] r nandkishore, l s levitov, a v chubukov, chiral superconductivity from repulsive interactions in doped graphene., nat. phys. 8, 158 (2012). [82] a m black-schaffer, s doniach, resonating valence bonds and mean-field d-wave superconductivity in graphite, phys. rev. b 75, 134512 (2007). [83] n b kopnin, t t heikkilä, g e volovik, hightemperature surface superconductivity in topological flat-band systems, phys. rev. b 83, 220503 (2011). [84] n garćıa, p esquinazi, mean field superconductivity approach in two dimensions, j. supercond. nov. magn. 22, 439 (2009). [85] g profeta, m calandra, f mauri, phononmediated superconductivity in graphene by lithium deposition, nat. phys. 8, 131 (2012). [86] b uchoa, a h c neto, superconducting states of pure and doped graphene, phys. rev. lett. 98, 146801 (2007). [87] n b kopnin, e b sonin, bcs superconductivity of dirac electrons in graphene layers, phys. rev. lett. 100, 246808 (2008). [88] n reyren, s thiel, a d caviglia, l f kourkoutis, g hammerl, c richter, c w schneider, t kopp, a s rüetschia, d jaccard, m gabay, d a muller, j m triscone, j mannhart, superconducting interfaces between insulating oxides, science 317, 1196 (2007). [89] a gozar, g logvenov, l f kourkoutis, a t bollinger, l a giannuzzi, l a muller, i bozovic, high-temperature interface superconductivity between metallic and insulating copper oxides, nature 455, 782 (2008). [90] f muntyanua, a gilewski, k nenkov, j warchulska, a zaleski, experimental magnetization evidence for two superconducting phases in bi bicrystals with large crystallite disorientation angle, phys. rev. b 73, 132507 (2006). [91] f muntyanua, a gilewski, k nenkov, a zaleski, v chistol, superconducting crystallite interfaces with tc up to 21 k in bi and bisb bicrystals of inclination type, solid state commun. 147, 183 (2008). [92] y kopelevich, p esquinazi, graphene physics in graphite, adv. mater. (weinheim, ger.) 19, 4559 (2007). [93] w a muñoz, l covaci, f peeters, tightbinding description of intrinsic superconducting correlations in multilayer graphene, phys. rev. b 87, 134509 (2013). [94] y kawashima, possible room temperature superconductivity in conductors obtained by bringing alkanes into contact with a graphite surface, aip advances 3, 052132 (2013). 050007-19 papers in physics, vol. 6, art. 060012 (2014) received: 3 august 2014, accepted: 28 october 2014 edited by: g. mindlin licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.060012 www.papersinphysics.org issn 1852-4249 nonlinearity arising from noncooperative transcription factor binding enhances negative feedback and promotes genetic oscillations iván m. lengyel,1 daniele soroldoni,2, 3 andrew c. 
oates,2, 3 luis g. morelli1∗

we study the effects of multiple binding sites in the promoter of a genetic oscillator. we evaluate the regulatory function of a promoter with multiple binding sites in the absence of cooperative binding, and consider different hypotheses for how the number of bound repressors affects the transcription rate. effective hill exponents of the resulting regulatory functions reveal an increase in the nonlinearity of the feedback with the number of binding sites. we identify optimal configurations that maximize the nonlinearity of the feedback. we use a generic model of a biochemical oscillator to show that this increased nonlinearity is reflected in enhanced oscillations, with larger amplitudes over wider oscillatory ranges. although the study is motivated by genetic oscillations in the zebrafish segmentation clock, our findings may reveal a general principle for gene regulation.

∗email: morelli@df.uba.ar
1 departamento de física, fceyn uba and ifiba, conicet, pabellón 1, ciudad universitaria, 1428 buenos aires, argentina.
2 mrc-national institute for medical research, the ridgeway, mill hill, nw7 1aa london, uk.
3 department of cell and developmental biology, university college london, gower street, wc1e 6bt london, uk.

i. introduction

cells can generate temporal patterns of activity by means of genetic oscillations [1, 2]. genetic oscillations are biochemical oscillations in the levels of gene products [3–14]. they can be produced by negative feedback regulation of gene expression, in which a gene product inhibits its own production directly or indirectly [15]. such autoinhibition is often performed by transcriptional repressors, proteins or protein complexes that bind the promoter of a gene and inhibit the transcription of new mrna molecules [16, 17], see fig. 1. a theoretical description of biochemical oscillations requires a delayed negative feedback, together with sufficient nonlinearity and a balance of the timescales of the different processes involved [18]. delays may occur naturally in transcriptional regulation, since the assembly of gene products involves intermediate steps [16, 19, 20]. nonlinearity refers to the presence of nonlinear terms in the equations describing the dynamics. such nonlinear terms may occur in the equations due to the presence of cooperative biochemical processes, where cooperativity is understood as a phenomenon in which several components act together to orchestrate some collective behavior [21]. although some processes giving rise to nonlinear terms are known, in general it is still an open question how nonlinearity is built into genetic oscillators.

a compelling model system for genetic oscillations is the vertebrate segmentation clock [22–25]. this is a tissue-level pattern generator that controls the formation of vertebrate segments during embryonic development [25, 26]. the spatiotemporal patterns generated by the segmentation clock are thought to be initiated by a genetic oscillator at the single cell level [7, 9, 10, 27–29]. in this genetic oscillator, negative feedback is provided by genes of the her family, which encode proteins that form dimers and can act as transcriptional repressors [30–34]. the time taken by transcription, translation and splicing may introduce the necessary feedback delays in zebrafish and mouse [19, 34–37].
one source of nonlinearity in the segmentation clock oscillator is the dimerization of gene products that bind the promoter of cyclic genes [32]. however, this may be insufficient to generate the observed oscillations in the levels of gene products. one way to increase nonlinearity would be cooperative binding of repressors to regulatory binding sites at the promoter [21, 38, 39]. cooperative binding to multiple binding sites can make the negative feedback steeper [40]. a similar effect occurs with ultrasensitivity in phosphorylation cascades [41]. the presence of clusters of binding sites for transcription factors may be a common motif in gene regulation [42], and cooperativity in transcription factor binding has been reported for some systems [43]. in zebrafish, multiple binding sites for her dimers have been identified in the promoter region of her1, her7, and other genes of the notch pathway [32, 44, 45]. however, there is no evidence for cooperative binding of transcriptional regulators in the case of the zebrafish segmentation clock. therefore, although cooperative binding is not ruled out, this lack of evidence raises the general question of what contribution could be expected from the multiple binding sites reported in the promoter of segmentation clock cyclic genes. here we use theory to study how multiple binding sites affect nonlinearity and biochemical oscillations in a generic description of a genetic oscillator.

figure 1: delayed autoinhibition can produce genetic oscillations. (a) the gene (light blue box) is transcribed and translated into gene products x, with a delay τ. in this example, gene products form dimers that act as transcriptional repressors, inhibiting transcription of the gene (blunted arrow), and decay at a rate c. (b) and (c) numerical solutions of eq. (9) describing the oscillator in a. (b) phase space: monomer concentration x(t) vs. the delayed concentration x(t−τ), settled onto a limit cycle. (c) dimer concentration oscillates as a function of time. parameters in b, c: bp = 2, τ = 1, c = 1, x0 = 1, n = 12, m = 6.

ii. a promoter with multiple binding sites

we first evaluate the regulatory function of a promoter that contains n binding sites for a transcriptional repressor, see fig. 2. we consider transcriptional repressors, although the results in this section are more general and would apply to other types of transcription factors. we focus on a single transcriptional repressor for the sake of simplicity, and assume that all binding sites are identical. that is, binding and unbinding of repressors to the different sites occur at the same rates for all sites. moreover, we assume that there is no cooperativity in repressor binding. this means that binding and unbinding rates are not affected by the presence of factors bound to any of the other sites.

figure 2: schematic representation of a promoter with multiple binding sites, numbered from 1 to n. transcriptional repressors (orange squares) bind and unbind, with rates k+ and k−, from binding sites (numbered platforms) at the promoter of a gene (light blue stretch). (a) and (b) illustrate two equivalent configurations of the promoter with an identical number of bound transcriptional repressors having the same inhibiting strength.
the state pi of the promoter at any time can be characterized by the number i of bound factors, which goes from 0 to n. with a rate k+, a free repressor binds to an empty site at the promoter, which steps from state pi to state pi+1; with a rate k−, a bound repressor falls off the promoter, which steps from state pi to state pi−1:

$$ \mathrm{P}_0 \;\underset{k^-}{\overset{k^+}{\rightleftharpoons}}\; \cdots \;\rightleftharpoons\; \mathrm{P}_{i-1} \;\underset{k^-}{\overset{k^+}{\rightleftharpoons}}\; \mathrm{P}_i \;\underset{k^-}{\overset{k^+}{\rightleftharpoons}}\; \mathrm{P}_{i+1} \;\rightleftharpoons\; \cdots \;\underset{k^-}{\overset{k^+}{\rightleftharpoons}}\; \mathrm{P}_n. \qquad (1) $$

denoting by pi the promoter occupation probabilities, which describe the fraction of time that the promoter spends at state pi in the thermodynamic limit [46], the kinetics of binding and unbinding of transcriptional repressors to the binding sites is given by the set of ordinary differential equations

$$ \begin{aligned} \dot p_0 &= -k^+ n x\, p_0 + k^- p_1\\ &\;\;\vdots\\ \dot p_i &= -k^+ (n-i)\, x\, p_i - k^- i\, p_i + k^+ (n-i+1)\, x\, p_{i-1} + k^- (i+1)\, p_{i+1}\\ &\;\;\vdots\\ \dot p_n &= -k^- n\, p_n + k^+ x\, p_{n-1}, \end{aligned} \qquad (2) $$

together with the conservation law

$$ p = \sum_{i=0}^{n} p_i, \qquad (3) $$

where x is the repressor concentration and p is proportional to the number of gene copies. we assume here that binding and unbinding of repressors to the promoter occur much faster than other processes like transcription, translation, transport and decay of molecules. this means that the promoter occupation probabilities quickly reach equilibrium with a given concentration of transcriptional repressors [46, 47]. this situation is described by ṗi = 0 for all i in eq. (2), and the resulting algebraic equations can be solved by induction to obtain

$$ p_i = \binom{n}{i} \left(\frac{x}{x_0}\right)^{i} p_0, \qquad (4) $$

where x0 = k−/k+ is the equilibrium constant for binding of factors. using the constraint eq. (3) we express p0 in terms of p,

$$ p_0 = p \left(1 + \frac{x}{x_0}\right)^{-n}. \qquad (5) $$

equation (4) and eq. (5) describe the equilibrium occupation of the promoter in terms of the concentration x of the transcriptional repressor.

iii. abrupt inhibition

in the previous section, we evaluated the kinetics of noncooperative binding to a promoter with multiple binding sites. how does the presence of bound transcriptional repressors affect the transcription rate of the gene downstream of the promoter? in general, the strength of inhibition will depend on the number of bound repressors, and the regulatory function f(x) will have the form

$$ f(x) = \sum_{i=0}^{n} a_i\, p_i, \qquad (6) $$

where a0 = b is the basal transcription rate in the absence of bound repressors and ai is the transcription rate in the presence of i bound repressors. here we assume, for the sake of simplicity, that transcription proceeds at its basal rate b while there are m or less sites occupied by the repressors, and drops to zero when the number of occupied sites is larger than m, see fig. 3. we shall consider an alternative scenario below. in this situation, eq. (6) becomes

$$ f(x) = b \sum_{i=0}^{m} p_i. \qquad (7) $$

using eq. (4) and eq. (5), the resulting regulatory function for a promoter with n binding sites is

$$ f_{n,m}(x) = b\,p \left(1 + \frac{x}{x_0}\right)^{-n} \sum_{i=0}^{m} \binom{n}{i} \left(\frac{x}{x_0}\right)^{i}. \qquad (8) $$

figure 3: abrupt inhibition. full inhibition occurs when more than m sites are bound by transcriptional repressors. (a) transcription rate as a function of the number of bound transcriptional repressors. (b) normalized regulatory function, eq. (8), as a function of repressor concentration x, with n = 12 and m = 0, . . . , 11 (dark blue to dark red).
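a short numerical sketch can make eqs. (4)–(8) concrete. the code below is our own illustration (python is assumed; the function names and test values are ours, not part of the paper) and evaluates the equilibrium occupancies pi and the abrupt regulatory function fn,m(x) directly from their binomial form:

```python
import numpy as np
from scipy.special import comb

def occupancies(x, n, x0=1.0, p=1.0):
    """equilibrium occupation probabilities p_i of a promoter with n
    identical, noncooperative binding sites, eqs. (4)-(5):
    p_i = C(n, i) (x/x0)^i p_0, with p_0 = p (1 + x/x0)^(-n)."""
    i = np.arange(n + 1)
    p0 = p * (1.0 + x / x0) ** (-n)
    return comb(n, i) * (x / x0) ** i * p0

def f_abrupt(x, n, m, b=1.0, p=1.0, x0=1.0):
    """abrupt-inhibition regulatory function f_{n,m}(x), eq. (8):
    transcription proceeds at the basal rate b while at most m sites
    are occupied, and is fully inhibited otherwise, eq. (7)."""
    return b * occupancies(x, n, x0=x0, p=p)[: m + 1].sum()

if __name__ == "__main__":
    n, m = 12, 6          # illustrative values, as in fig. 1 of the paper
    for x in (0.5, 1.0, 2.0):
        pi = occupancies(x, n)
        print(f"x = {x:3.1f}  sum_i p_i = {pi.sum():.3f}  "
              f"f_(12,6)(x) = {f_abrupt(x, n, m):.4f}")
```

the printed sum of the pi recovers p, which is a quick consistency check of eqs. (4) and (5).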
the zebrafish segmentation clock may be an interesting system to evaluate these results. in the her1/her7 locus of zebrafish, an estimated number of about 12 binding sites has been reported [32, 45]. the regulatory functions resulting from eq. (8) for n = 12 are displayed in fig. 3b. although it is clear that inhibition of the promoter for a given level of repressors shifts to the right as m increases, it is less obvious how the value of m affects the steepness —that is, the nonlinearity— of the negative feedback. we use hill functions to parametrize the regulatory functions eq. (8) in a more transparent form, see appendix. hill functions are parametrized by a hill coefficient h characterizing the steepness of the curve, and an inhibition threshold k that describes the concentration of repressors that halves the production rate. we fit hill functions to fn,m and obtain an effective hill coefficient h and effective inhibition threshold k, for each value of n and m, fig. 4a,b. increasing the number of binding sites n while keeping m fixed can increase the hill coefficient. for fixed n, increasing m changes the hill coefficient in a nonmonotonic way: there is an optimal value of m that maximizes the hill coefficient and therefore nonlinearity. the effective inhibition threshold k changes in a simpler form, increasing both with n and m. in conclusion, multiple binding sites can effectively increase the nonlinearity of the feedback via the regulatory function.

figure 4: multiple binding sites can increase nonlinearity and enhance oscillations. (a) and (b) effective hill parameters for the regulatory function fn,m, eq. (8), with bp = 1 and x0 = 1. (a) effective hill coefficient h. (b) effective inhibition threshold k. (c) and (d) oscillations described by eq. (9) with bp = 2, τ = 1, c = 1 and x0 = 1. (c) amplitude of oscillations. (d) period of oscillations. in c and d, the white region represents the nonoscillatory regime, in which the system settles to a fixed point. color bar labels indicate values in each panel.

as discussed above, nonlinearity is an essential ingredient in a theory of biochemical oscillations [18]. we therefore ask how multiple binding sites in the promoter affect a biochemical oscillator. we use the regulatory function eq. (8) in a generic model for genetic oscillations. we consider a gene that encodes a protein that forms dimers, and these dimers can bind to multiple binding sites at the promoter to inhibit transcription, fig. 1a. we introduce an explicit delay τ to account for transcription, translation, splicing, and other processes involved in the assembly of the gene product and its dimerization. we assume that dimerization is a fast reaction, with a separation of timescales from other processes. therefore, at any time the dimer concentration can be approximated by that of the monomers squared. the dynamics of the product concentration x(t) is given by

$$ \frac{dx}{dt} = b\,p\, \frac{\displaystyle\sum_{i=0}^{m} \binom{n}{i} \left(\frac{x(t-\tau)}{x_0}\right)^{2i}}{\left[1 + \left(\dfrac{x(t-\tau)}{x_0}\right)^{2}\right]^{n}} - c\,x(t), \qquad (9) $$

where b is the basal production rate, c is the decay rate of products, p relates to the number of gene copies, τ is the total delay for product assembly, and x0 is the product concentration that halves the basal transcription rate. the regulatory function eq. (8) is parametrized by n and m.
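equation (9) is a delay differential equation. as an illustration only, not the matlab dde23 integration that the authors describe below, the following python sketch advances eq. (9) with a fixed-step euler scheme, reading the delayed value back from the stored trajectory; the step size, total time and initial history are our own choices, while the parameters are those quoted for fig. 1b,c:

```python
import numpy as np
from math import comb

def production(xd, n, m, bp=2.0, x0=1.0):
    """production term of eq. (9) evaluated at the delayed concentration
    xd = x(t - tau); the repressor is a dimer, hence the (xd/x0)^2."""
    y = (xd / x0) ** 2
    return bp * sum(comb(n, i) * y ** i for i in range(m + 1)) / (1.0 + y) ** n

def integrate(n, m, tau=1.0, c=1.0, bp=2.0, x0=1.0,
              dt=1e-3, t_end=60.0, x_init=0.1):
    """fixed-step euler integration of dx/dt = production(x(t - tau)) - c x(t),
    with constant history x(t <= 0) = x_init; the delayed value is read back
    from the stored trajectory."""
    lag = int(round(tau / dt))
    steps = int(round(t_end / dt))
    x = np.empty(steps + 1)
    x[0] = x_init
    for k in range(steps):
        xd = x_init if k < lag else x[k - lag]     # x(t - tau)
        x[k + 1] = x[k] + dt * (production(xd, n, m, bp, x0) - c * x[k])
    return np.linspace(0.0, t_end, steps + 1), x

if __name__ == "__main__":
    t, x = integrate(n=12, m=6)    # parameter set quoted in fig. 1b,c
    tail = x[t > 40.0]             # discard the initial transient
    print(f"steady-state range of x: {tail.min():.3f} .. {tail.max():.3f}")
```

a dedicated delay solver such as dde23 adapts its step size and interpolates the delayed value, so it is the more reliable choice; the euler sketch is only meant to show how the delay enters the integration.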
this genetic oscillator is a reduction of the models proposed by lewis [19], and other authors [48, 49]. it describes the protein concentration, but does not include the mrna; the duration of transcription and translation are both included in the delay τ. furthermore, it does not describe effects present in theories that include more than one regulator [32, 50, 51], but here it is enough to illustrate the effects of multiple binding sites in a simpler context. we integrate eq. (9) numerically and evaluate the resulting dynamics by calculating the amplitude and period of oscillations. in all numerical simulations, we use the function dde23 from matlab [52]. scanning the values of n and m, we determine whether the system oscillates in steady state: when the difference in the maxima over the last ten cycles falls below 0.01, the simulation is stopped and we record the output. the amplitude of oscillations grows with the number of binding sites n, fig. 4c. the range where the system oscillates grows with the number of binding sites n. the amplitude is nonmonotonic in m: for a fixed number of binding sites n, there is an optimal value for m that maximizes the amplitude of oscillations, fig. 4c. the period of oscillations grows with n and decreases with m. these results show how the change in nonlinearity observed in the regulatory functions is reflected in the oscillations as the number of binding sites changes.

iv. gradual inhibition

there is strong evidence for the products of some cyclic genes of the zebrafish segmentation clock binding their own promoters and acting as transcriptional repressors [23, 27, 32, 33, 35, 44]. however, we do not have detailed knowledge of how these transcriptional repressors affect transcription rates when bound to the promoters of cyclic genes. there is evidence from transcriptional analysis of the hes1 gene in mouse indicating that inhibition is gradual [30]. while the wildtype hes1 promoter containing all three n box elements that are bound by hes1 proteins showed a 30-fold inhibition of transcription in the presence of hes1, mutations in one, two, and three of the n box elements showed impaired inhibition with 14-, 7-, and 2-fold inhibition, respectively [30]. motivated by these results, we consider here a scenario where additional bound repressors gradually reduce the transcription rate until it drops to zero, fig. 5a. for the sake of simplicity, we assume that the transcription rate drops linearly as a function of the number of bound repressors, from a basal rate b to zero for k + 1 occupied sites,

$$ a_i = \begin{cases} b\left(1 - \dfrac{i}{k+1}\right) & \text{if } i \le k+1,\\[4pt] 0 & \text{if } i > k+1, \end{cases} \qquad (10) $$

see fig. 5a. using this gradual inhibition in eq. (6) together with eqs. (4) and (5), we obtain regulatory functions fn,k(x),

$$ f_{n,k}(x) = b\,p \left(1 + \frac{x}{x_0}\right)^{-n} \sum_{i=0}^{k+1} \left(1 - \frac{i}{k+1}\right) \binom{n}{i} \left(\frac{x}{x_0}\right)^{i}, \qquad (11) $$

see fig. 5b. as in the previous case, it is clear that the effective inhibition threshold shifts to the right as k increases, but it is not so clear if the steepness of the regulatory function changes and, if so, how. performing fits to hill functions, we find that the nonmonotonic behavior of the effective hill exponent h is observed again as k increases, fig. 6a,b. oscillations are similarly affected by noncooperative binding with gradual inhibition, fig. 6c,d. these results show that the prediction of an optimal value for the number of bound repressors that fully inhibits the promoter is robust with respect to the details of how multiple bound repressors reduce the transcription rate.
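the effective hill parameters discussed above come from fitting the regulatory functions to the hill form defined in the appendix, eq. (12). a minimal fitting sketch (our own illustration in python, using scipy's curve_fit on the abrupt function of eq. (8) with bp = 1 and x0 = 1 as in fig. 4a,b; the fit range, starting values and bounds are ours) is:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import comb

def f_abrupt(x, n, m, bp=1.0, x0=1.0):
    """abrupt-inhibition regulatory function f_{n,m}(x), eq. (8), for an array x."""
    i = np.arange(m + 1)
    s = np.sum(comb(n, i) * (x[:, None] / x0) ** i, axis=1)
    return bp * (1.0 + x / x0) ** (-n) * s

def hill(x, k, h):
    """hill function with effective exponent 2h and threshold k (appendix, eq. (12))."""
    return 1.0 / (1.0 + (x / k) ** (2.0 * h))

if __name__ == "__main__":
    n = 12
    x = np.linspace(1e-3, 5.0, 500)
    for m in (1, 6, 11):
        y = f_abrupt(x, n, m)                  # bp = 1, x0 = 1, as in fig. 4a,b
        (k, h), _ = curve_fit(hill, x, y, p0=(1.0, 1.0),
                              bounds=([1e-6, 0.1], [10.0, 20.0]))
        print(f"n = {n}, m = {m:2d}: effective h = {h:.2f}, threshold k = {k:.2f}")
```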
figure 5: gradual inhibition. binding of multiple transcriptional repressors gradually inhibits transcription. (a) transcription rate as a function of the number of bound transcriptional repressors. transcription occurs at the basal rate b in the absence of bound repressors, and decreases linearly to zero for more than k bound repressors. (b) normalized regulatory function eq. (11) as a function of repressor concentration x, with n = 12 and k = 0, . . . , 11 (dark blue to dark red).

figure 6: multiple binding sites can increase nonlinearity and enhance oscillations in the presence of gradual inhibition. (a) and (b) effective hill parameters for the regulatory function fn,k, eq. (11), with bp = 1 and x0 = 1. (a) effective hill coefficient h. (b) effective inhibition threshold k. (c) and (d) oscillations obtained using regulatory functions eq. (11) with bp = 2, τ = 1, c = 1 and x0 = 1. (c) amplitude of oscillations. (d) period of oscillations. in c and d the white region represents the nonoscillatory regime, in which the system settles to a fixed point. color bar labels indicate values in each panel.

v. discussion

we studied the effects of multiple noncooperative binding sites in the promoter of a genetic oscillator. we evaluated the behavior of a promoter with multiple binding sites when binding of transcriptional repressors is noncooperative, fig. 2. we considered two different hypotheses for how bound transcriptional repressors affect transcription rates, figs. 3 and 5. in both cases, we calculated how the number of binding sites and the number of bound repressors required to produce full inhibition affect the nonlinearity of regulatory functions. we showed that there is an optimal value of the number of repressors needed to fully inhibit transcription that maximizes nonlinearity, figs. 4ab and 6ab. this increased nonlinearity is reflected in the behavior of a genetic oscillator controlled by such regulatory functions, figs. 4cd and 6cd. cooperative binding is a well known means to increase the nonlinearity of a biological dynamical system [21, 39]. here we show that the nonlinearity of a regulatory function can be increased by multiple binding sites, even if binding is noncooperative. this idea may have application in other biological control systems as well. using the same formulation as we did here, one can also describe nonlinearity in the transcriptional activation of gene expression, thereby creating effective on-switches in developmental and physiological regulatory networks. a similar effect has been reported in a theoretical study of enzymes with multiple phosphorylation sites [53–55]. it was found that nonessential phosphorylation sites give rise to an increase in effective hill coefficients, enhancing ultrasensitivity in signal transduction [55]. previous work on the segmentation clock addressed the case of three binding sites [40], motivated by experimental observations of the hes7 mouse promoter [31]. the authors assumed that a single bound dimer inhibits transcription completely, corresponding to the particular case m = 1 of the theory developed here.
they considered both noncooperative and cooperative binding, and showed that cooperativity increases the effective hill coefficient as expected. in a follow-up [56], the authors addressed the case of hes1 regulation based on the report of four binding sites in the hes1 mouse promoter [30]. again assuming that a single bound dimer inhibits transcription completely, they used the data from the transcriptional analysis of the hes1 gene to estimate an effective hill coefficient for the mouse hes1 oscillator, obtaining an upper bound of about 3. more recently, the effect of two binding sites as compared to a single binding site was discussed together with differential decay of the monomers [57]. our theory predicts how changing the number of binding sites n, and the number m of bound repressors that produce full inhibition, affects a single cell oscillator. although an experiment that changes the value of m may currently be challenging in a cellular system, the number of binding sites n is more amenable to experimental manipulation. for example, binding sites could be mutated [30] or deleted from the promoter, or they could be interfered with using genome editing strategies such as talen [58] or crispr [59] to alter or delete specific binding sites. assessing the effects of these perturbations in experiments may also pose some challenges. dropping the number of binding sites from n = 12 to n = 6 introduces a period change of about 2.5%, while the amplitude halves over the same range, fig. 4c,d. experiments will require at least such precision to reliably detect changes. our results suggest a possible evolutionary mechanism to increase nonlinearity in gene regulatory systems. in this mechanism, point mutations in the promoter that increase the number of binding sites for transcription factors may increase the steepness of regulatory functions. if the resulting steeper regulation performs some function better, such mutations would have a good chance to be conserved by natural selection. in the case of the segmentation clock, the amplitude of oscillations could increase with the number of binding sites, possibly improving the signal-to-noise ratio. furthermore, the range for oscillations would be wider, making the oscillatory regime less sensitive to slow extrinsic fluctuations of parameter values. remarkably, it may happen that after an increase in n, an increase in m also raises nonlinearity, see fig. 4a. this means that weaker repressors would result in better oscillations. this evolutionary mechanism would provide a simple way to gradually increase the nonlinearity of a feedback or other regulatory function. the theory for the zebrafish regulatory function could be refined using the experimentally measured relative affinities of the binding sites at the her1 and her7 promoters [32]. apart from the effects reported here, the number of binding sites may have additional roles. for example, it could serve as a buffer for fluctuations in gene expression [20, 60–64], augmenting the precision of genetic oscillations. this will be the topic of future work.

figure 7: hill functions are characterized by two parameters, the hill coefficient h and the inhibition threshold k. (a) hill functions with k = 5 and h = 1 (red), h = 3 (green) and h = 7 (blue). (b) hill functions with h = 3 and k = 3 (red), k = 5 (green) and k = 7 (blue).
acknowledgements

we thank saúl ares, luciana bruno, ariel chernomoretz and hernan grecco for helpful comments on an early version of this work, and james ferrell for calling our attention to ref. [53]. thanks to e. aikau for inspiration. lgm acknowledges support from mincyt argentina (pict 2012 1952). a.c.o. and d.s. were supported by the medical research council uk (mc up 1202/3) and the wellcome trust (wt098025ma).

appendix: effective hill functions

hill functions are often used to describe the nonlinearities present in gene regulatory networks [17]. hill functions are sigmoidal step functions defined by

$$ f_h(x) = \frac{1}{1 + (x/k)^{2h}}, \qquad (12) $$

where the steepness of the step is characterized by the exponent 2h, and the inhibition threshold k is the concentration of repressor that halves the production rate, here scaled to unity, fig. 7. here we include an explicit factor 2 in the exponent to account for the dimerization of the transcriptional repressors. one advantage of hill functions is that they are very simply parametrized. in the main text, we fit hill functions to the more complex regulatory functions eqs. (8) and (11). some fits of eq. (8) in the case n = 12 are displayed in fig. 8 for illustration.

figure 8: effective hill functions, eq. (12), can fit the multiple binding site regulatory function, eq. (8), for n = 12, with bp = 1 and x0 = 1. (a-c) examples of fits of hill functions (solid lines) to regulatory functions (dashed lines) for (a) m = 1 (green), (b) m = 6 (light blue) and (c) m = 11 (purple). (d) effective hill coefficient h as a function of m. (e) effective inhibition threshold k as a function of m. dots in d and e correspond to fits from panels a, b and c.

[1] a goldbeter, biochemical oscillations and cellular rhythms: the molecular bases of periodic and chaotic behaviour, cambridge university press, cambridge (1997). [2] l glass, m c mackey, from clocks to chaos: the rhythms of life, princeton university press, princeton (1988). [3] y liu, n f tsinoremas, c h johnson, n v lebedeva, s s g m ishiura, t kondo, circadian orchestration of gene expression in cyanobacteria, gene. dev. 9, 1469 (1995). [4] e nagoshi, c saini, c bauer, t laroche, f naef, u schibler, circadian gene expression in individual fibroblasts: cell-autonomous and self-sustained oscillators pass time to daughter cells, cell 119, 693 (2004). [5] i mihalcescu, w hsing, s leibler, resilient circadian oscillator revealed in individual cyanobacteria, nature 430, 81 (2004). [6] a goldbeter, c gérard, d gonze, j c leloup, g dupont, systems biology of cellular rhythms, febs lett. 586, 2955 (2012). [7] i palmeirim, d henrique, d ish-horowicz, o pourquié, avian hairy gene expression identifies a molecular clock linked to vertebrate segmentation and somitogenesis, cell 91, 639 (1997). [8] a aulehla, w wiegraebe, v baubet, m b wahl, c deng, m taketo, m lewandoski, o pourquié, a β-catenin gradient links the clock and wavefront systems in mouse embryo segmentation, nat. cell biol. 10, 186 (2008).
[9] y masamizu, t ohtsuka, y takashima, h nagahara, y takenaka, k yoshikawa, h okamura, r kageyama, real-time imaging of the segmentation clock: revelation of unstable oscillators in the individual presomitic mesoderm cells, proc. natl. acad. sci. usa 103, 1313 (2006). [10] a j krol, d roellig, m l dequéante, o tassy, e glynn, g hattem, a mushegian, a c oates, o pourquié, evolutionary plasticity of segmentation clock networks, development 138, 2783 (2011). [11] h shimojo, t ohtsuka, r kageyama, oscillations in notch signaling regulate maintenance of neural progenitors, neuron 58, 52 (2008). [12] n geva-zatorsky, n rosenfeld, s itzkovitz, r milo, a sigal, e dekel, t yarnitzky, y liron, p polak, g lahav, u alon, oscillations and variability in the p53 system, mol. syst. biol. 2, 2006.0033 (2006). [13] m b elowitz, s leibler, a synthetic oscillatory network of transcriptional regulators, nature 403, 335 (2000). 060012-8 papers in physics, vol. 6, art. 060012 (2014) / i. m. lengyel et al. [14] j stricker, s cookson, m r bennett, w h mather, l s tsimring, j hasty, a fast, robust and tunable synthetic gene oscillator, nature 456, 516 (2008). [15] j j tyson, computational cell biology, chap. 9, pag. 230, springer, berlin (2002). [16] b alberts, a johnson, j lewis, m raff, k roberts, p walter, molecular biology of the cell, 4th ed., garland science, new york (2002). [17] u alon, an introduction to systems biology: design principles of biological circuits, chapman & hall/crc press, boca raton, florida (2006). [18] b novák, j j tyson, design principles of biochemical oscillators, nat. rev. mol. cell biol. 9, 981 (2008). [19] j lewis, autoinhibition with transcriptional delay: a simple mechanism for the zebrafish somitogenesis oscillator, curr. biol. 13, 1398 (2003). [20] l g morelli, f jülicher, precision of genetic oscillators and clocks, phys. rev. lett. 98, 228101 (2007). [21] j e ferrell, q&a: cooperativity, j. biol. 8, 53 (2009). [22] o pourquié, vertebrate segmentation: from cyclic gene networks to scoliosis, cell 145, 650 (2011). [23] a c oates, l g morelli, s ares, patterning embryos with oscillations: structure, function and dynamics of the vertebrate segmentation clock, development 139, 625 (2012). [24] y saga, the synchrony and cyclicity of developmental events, cold spring harb. perspect. biol. 4, a008201 (2012). [25] d roellig, l g morelli, s ares, f jülicher, a c oates, snapshot: the segmentation clock, cell 145, 800 (2011). [26] y saga, the mechanism of somite formation in mice, curr. opin. genet. dev. 22, 331 (2012). [27] a c oates, r k ho, hairy/e(spl)-related (her) genes are central components of the segmentation oscillator and display redundancy with the delta/notch signaling pathway in the formation of anterior segmental boundaries in the zebrafish, development 129, 2929 (2002). [28] h hirata, s yoshiura, t ohtsuka, y bessho, t harada, k yoshikawa, r kageyama, oscillatory expression of the bhlh factor hes1 regulated by a negative feedback loop, science 298, 840 (2002). [29] s a holley, d jülich, g j rauch, r geisler, c nüsslein-volhard, her1 and the notch pathway function within the oscillator mechanism that regulates zebrafish somitogenesis, development 129, 1175 (2002). [30] k takebayashi, y sasai, y sakai, t watanabe, s nakanishi, r kageyama, structure, chromosomal locus, and promoter analysis of the gene encoding the mouse helix-loop-helix factor hes-1. negative autoregulation through the multiple n box elements, j. biol. chem. 269, 5150 (1994). 
[31] y bessho, g miyoshi, r sakata, r kageyama, hes7: a bhlh-type repressor gene regulated by notch and expressed in the presomitic mesoderm, genes cells 6, 175 (2001). [32] c schröter, s ares, l g morelli, a isakova, k hens, d soroldoni, m gajewski, f jülicher, s j maerkl, b deplancke, a c oates, topology and dynamics of the zebrafish segmentation clock core circuit, plos biol. 10, e1001364 (2012). [33] a trofka, j schwendinger-schreck, t brend, w pontius, t emonet, s a holley, the her7 node modulates the network topology of the zebrafish segmentation clock via sequestration of the hes6 hub, development 139, 940 (2012). [34] a hanisch, m v holder, s choorapoikayil, m gajewski, e m özbudak, j lewis, the elongation rate of rna polymerase ii in zebrafish and its significance in the somite segmentation clock, development 140, 444 (2013). [35] f giudicelli, e m özbudak, g j wright, j lewis, setting the tempo in development: 060012-9 papers in physics, vol. 6, art. 060012 (2014) / i. m. lengyel et al. an investigation of the zebrafish somite clock mechanism, plos biol. 5, 1309 (2007). [36] e m özbudak, j lewis, notch signalling synchronizes the zebrafish segmentation clock but is not needed to create somite boundaries, plos genet. 4(2), e15 (2008). [37] y harima, y takashima, y ueda, t ohtsuka, r kageyama, accelerating the tempo of the segmentation clock by reducing the number of introns in the hes7 gene, cell rep. 3, 1 (2013). [38] j keener, j sneyd, mathematical physiology i: cellular physiology, 2nd ed. springer, berlin (2008). [39] h qian, cooperativity in cellular biochemical processes: noise-enhanced sensitivity, fluctuating enzyme, bistability with nonlinear feedback, and other mechanisms for sigmoidal responses, ann. rev. biophys. 41, 179 (2012). [40] s zeiser, h v liebscher, h tiedemann, i rubio-aliaga, g k h przemeck, m h de angelis, g winkler, number of active transcription factor binding sites is essential for the hes7 oscillator, theor. biol. med. model. 3, 11 (2006). [41] j gunawardena, multisite protein phosphorylation makes a good threshold but can be a poor switch, p. natl. acad. sci. usa 102, 14617 (2005). [42] v gotea, a visel, j m westlund, m a nobrega, l a pennacchio, i ovcharenko, homotypic clusters of transcription factor binding sites are a key component of human promoters and enhancers, genome res. 20, 565 (2010). [43] d s burz, r rivera-pomar, h jäckle, s d hanes, cooperative dna-binding by bicoid provides a mechanism for threshold-dependent gene activation in the drosophila embryo, embo j. 17, 5998 (1998). [44] t brend, s a holley, expression of the oscillating gene her1 is directly regulated by hairy/enhancer of split, t-box, and suppressor of hairless proteins in the zebrafish segmentation clock, dev. dynam. 238, 2745 (2009). [45] d soroldoni, personal communication (2014). [46] l bintu, n e buchler, h g garcia, u gerland, t hwa, j kondev, r phillips, transcriptional regulation by the numbers: models, curr. opin. genet. dev. 15, 116 (2005). [47] h g garcia, a sanchez, t kuhlman, j kondev, r phillips, transcription by the numbers redux: experiments and calculations that surprise, trends cell biol. 20, 723 (2010). [48] n a m monk, oscillatory expression of hes1, p53, and nf-κb driven by transcriptional time delays, curr. biol. 13, 1409 (2003). [49] m h jensen, k sneppen, g tiana, sustained oscillations and time delays in gene expression of protein hes1, febs lett. 541, 176 (2003). [50] o cinquin, repressor dimerization in the zebrafish somitogenesis clock, plos comp. 
biol. 3, e32 (2007). [51] a ay, s knierer, a sperlea, j holland, e m özbudak, short-lived her proteins drive robust synchronized oscillations in the zebrafish segmentation clock, development 140, 3244 (2013). [52] l f shampine, s thompson, solving ddes in matlab, appl. numer. math. 37, 441 (2001). [53] l wang, q nie, g enciso, nonessential sites improve phosphorylation switch, biophys. j. 99, l41 (2010). [54] s ryerson, g enciso, ultrasensitivity in independent multisite systems, j. math. biol. 69, 977 (2014). [55] g enciso, nonautonomous and random dynamical systems in life sciences lecture notes in mathematics (mathematical biosciences subseries) no 2102, springer verlag, berlin (2013). [56] s zeiser, j müller, v liebscher, modeling the hes1 oscillator, j. comput. biol. 14, 984 (2007). [57] m campanelli, t gedeon, somitogenesis clock-wave initiation requires differential decay and multiple binding sites for clock protein, plos comp. biol. 6, e1000728 (2010). 060012-10 papers in physics, vol. 6, art. 060012 (2014) / i. m. lengyel et al. [58] v m bedell, y wang, j m campbell, t l poshusta, c g starker, r g k ii, w tan, s g penheiter, a c ma, a y h leung, s c fahrenkrug, d f carlson, d f voytas, k j clark, j j essner, s c ekker, in vivo genome editing using a high-efficiency talen system, nature 491, 114 (2012). [59] e pennisi, the crispr craze, science 341, 833 (2013). [60] n barkai, s leibler, biological rhythms: circadian clocks limited by noise, nature 403, 267 (2000). [61] m b elowitz, a j levine, e d siggia, p s swain, stochastic gene expression in a single cell, science 297, 1183 (2002). [62] j raser, e o’shea, noise in gene expression: origins, consequences, and control, science 309, 2010 (2005). [63] b munsky, g neuert, a v oudenaarden, using gene expression noise to understand gene regulation, science 336, 183 (2012). [64] l s tsimring, noise in biology, rep. prog. phys. 77, 026601 (2014). 060012-11 papers in physics, vol. 9, art. 090008 (2017) received: 19 september 2017, accepted: 12 october 2017 edited by: k. hallberg licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.090008 www.papersinphysics.org issn 1852-4249 dilute antiferromagnetism in magnetically doped phosphorene a. allerdt,1 a. e. feiguin1∗ we study the competition between kondo physics and indirect exchange on monolayer black phosphorous using a realistic description of the band structure in combination with the density matrix renormalization group (dmrg) method. the hamiltonian is reduced to a one-dimensional problem via an exact canonical transformation that makes it amenable to dmrg calculations, yielding exact results that fully incorporate the many-body physics. we find that a perturbative description of the problem is not appropriate and cannot account for the slow decay of the correlations and the complete lack of ferromagnetism. in addition, at some particular distances, the impurities decouple forming their own independent kondo states. this can be predicted from the nodes of the lindhard function. our results indicate a possible route toward realizing dilute anti-ferromagnetism in phosphorene. i. introduction phosphorene, a single layer of black phosphorus, is one of the many allotropes of the element. others include red, white, and velvet phosphorus. of these, black is the most thermodynamically stable and least reactive [1]. its crystalline structure resembles that of graphite, whose 2d form is famously known as graphene, and can similarly be fabricated into 2d layers. 
the main qualitative distinction lies in the fact that phosphorene has a "puckered" hexagonal structure which is responsible for opening a gap in the band structure. in addition, the bonding structure is vastly different, since phosphorene does not have sp2 bonds, but is composed of 3p orbitals [2, 3]. successful production of monolayer black phosphorus has been achieved only in the last few years, with the first publications concerning its single layer properties appearing in 2014 [4–7]. while being a semi-conductor with a small direct band gap in bulk form (∼ 0.3 ev) [8], the gap increases as the number of layers is decreased.

∗a.feiguin@northeastern.edu 1 department of physics, northeastern university, boston, massachusetts ma 02115, usa.

since its appearance, there have been many proposed promising applications and exotic properties. besides being a semi-conductor with a band gap in the optical range, it has ample flexibility and high carrier mobility [4, 9]. further, it also possesses interesting optical properties such as absorbing light only polarized in the armchair direction, indicating a possible future as a linear polarizer [10]. its stable excitons present possible applications in optically driven quantum computing [11]. interest in its applications continues to grow. for a comprehensive review, we refer to ref. [11]. our understanding of magnetic doping in phosphorene is still in its infancy. a thorough study of metal adatoms adsorbed on phosphorene was performed in ref. [12], where a variety of structural, electronic and magnetic properties emerge. the authors show that the binding energies are about twice those in graphene. a dft study has proposed possible chemical doping by means of adsorption of different atoms, ranging from n-type to p-type, as well as transition metals with finite magnetic moment [13]. another conceivable method of moving the fermi level is to induce strain on the lattice, causing the level to move into the conduction band [14]. the case of magnetic impurities is considerably non-trivial, since one has to account for the kondo effect [15], with the impurity being screened by the conduction spins in the metallic substrate, and the rkky interaction [16–18], an effective indirect exchange between impurities mediated by the conduction electrons. these two phenomena are expected to be present and compete in phosphorene when the fermi level is not sitting in the gap. the rkky interaction in phosphorene with the inclusion of mechanical strain is investigated in ref. [14]. however, all their calculations are based on second order perturbation theory and ignore the effects that kondo physics can induce. in this work, we present numerical results for two kondo impurities in phosphorene that capture the full many-body physics, and we discuss the competition between kondo and indirect exchange.

figure 1: phosphorene lattice showing the directions chosen to place the impurities for calculations shown in figs. 3, 4, and 5. the numbering of lattice sites refers to the impurity separations in the same figures. also shown are the five hopping parameters borrowed from ref. [5]. the labels a, b, c and d refer to the four sublattices.

ii. model and numerical method
the hamiltonian studied in this work is the two-impurity kondo model, generically written as

$$ h = h_{\mathrm{band}} + j_k \left( \vec{S}_1 \cdot \vec{s}_{r_1} + \vec{S}_2 \cdot \vec{s}_{r_2} \right), \qquad (1) $$

where hband is the non-interacting part describing the band structure of the material, while the impurities are locally coupled to the substrate at positions r1 and r2 via an interaction jk. here, $\vec{S}_i$ represent the impurities, and $\vec{s}_{r_i}$ is the spin of the conduction electrons at site ri. this problem has been theoretically studied in other materials using a variety of approaches. this work, however, will take a non-perturbative approach, revealing important subtle points. traditionally, the rkky interaction is characterized by the lindhard function in second order perturbation theory, which takes the form

$$ \chi_{ij} = 2\,\Re\!\left[ \sum_{e_\alpha > e_F > e_\beta} \frac{\langle i|\alpha\rangle\langle\alpha|j\rangle\langle j|\beta\rangle\langle\beta|i\rangle}{e_\alpha - e_\beta} \right], \qquad (2) $$

where |i⟩, |j⟩ represent lattice sites and |α⟩, |β⟩ are eigenstates of hband. this function oscillates and changes sign with a period that depends on the density of the electrons. when the full geometry of the problem is taken into account, the picture becomes more complex, since a system could have a complex fermi surface, even with more than one band crossing the fermi level in some cases [19]. however, it has been shown by the authors in previous studies [19–21] that this does not uncover the full picture. we adapt the four-band model introduced in ref. [5] to obtain a tight-binding description of the phosphorene band structure. five hopping parameters are employed to closely reproduce the bands near the fermi level with a gap of approximately 1.6 ev. these are shown schematically in fig. 1, and their values are restated in table 1. a plot of the density of states is shown in fig. 2. all energies in the calculations are in units of ev. to see the band dispersions, we refer to ref. [5]. in order to solve the interacting problem, we first perform a unitary transformation to map the quadratic part of the hamiltonian hband onto an equivalent one-dimensional one, as described in detail in refs. [20, 22].

figure 2: local density of states of phosphorene. vertical lines show positions of the different fermi energies used in the calculations.

table 1: tight-binding hopping parameters (in ev) used in the calculations throughout this work.
t1 = −1.220, t2 = 3.665, t3 = −0.205, t4 = −0.105, t5 = −0.055

the full many-body calculation can in turn be carried out using the density matrix renormalization group (dmrg) algorithm [23–25] with high accuracy and without approximations. for all dmrg calculations, the total system size is l = 124 (including impurities), and we fix the truncation error to be smaller than $10^{-7}$. it is known that phosphorene nano-ribbons can host quasi-flat edge states whose emergence is topologically similar to graphene [26]. a study of their even/odd properties under application of an electric field has recently been performed [27], revealing a gap opening in these edge states. edge states can introduce notable finite-size effects [21]. however, due to the geometry of the lattices produced by the lanczos transformation employed in this work, these edge states are not present. in addition, we shall focus our study on a regime away from half-filling.

iii. results

we start by calculating the lindhard function as a reference using eq. (2).
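the lindhard function of eq. (2) can be evaluated for any quadratic hamiltonian by diagonalizing hband and summing over particle-hole pairs. the sketch below is an illustration only: it uses a simple one-dimensional tight-binding chain in python rather than the four-band phosphorene model of ref. [5], and the chain length, hopping and fermi energy are our own choices:

```python
import numpy as np

def lindhard_chain(L=200, t=1.0, e_fermi=-0.5, i0=10):
    """lindhard function chi_{i0, j} of eq. (2) for a one-dimensional
    tight-binding chain with open boundaries (illustrative model only,
    not the four-band phosphorene hamiltonian of ref. [5])."""
    h = np.zeros((L, L))
    for i in range(L - 1):
        h[i, i + 1] = h[i + 1, i] = -t
    energies, states = np.linalg.eigh(h)      # states[:, a] = <site i | state a>
    occ = energies < e_fermi                  # beta: states below the fermi level
    emp = ~occ                                # alpha: states above the fermi level
    denom = energies[emp][:, None] - energies[occ][None, :]   # e_alpha - e_beta
    chi = np.zeros(L)
    for j in range(L):
        num = (states[i0, emp][:, None] * states[j, emp][:, None]
               * states[j, occ][None, :] * states[i0, occ][None, :])
        chi[j] = 2.0 * np.sum(num / denom)    # eigenvectors are real here
    return chi

if __name__ == "__main__":
    chi = lindhard_chain()
    for d in range(1, 9):                     # sign changes and decay with distance
        print(f"distance {d}: chi = {chi[10 + d]: .5f}")
```

the oscillation period of chi with distance depends on the fermi level, which is the behavior referred to in the discussion of fig. 3.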
figure 3 shows results for two different fermi energies, one in the conduction band and one in the valence band. notice the different period of oscillation, which is to be expected from conventional rkky theory [16–18]. we also show, in different panels, the dmrg result for the impurity spin-spin correlations $\langle s^z_1 s^z_2 \rangle$. since the system is su(2) symmetric, only the z-component is displayed. for small values of jk we find that the rkky correlations saturate at −1/4. this is an indication that the two impurities are decoupled, forming "free moments". when this happens, the spins can be pointing in either direction, yielding a four-fold degenerate ground state. since we force sz = 0 in our calculations, the impurities have no choice but to align anti-ferromagnetically. this is a finite size effect that is expected when the interaction jk is of the order of the level spacing in the non-interacting bands. apart from short distances, the spin correlations roughly follow the oscillation patterns predicted by the lindhard function, with one striking difference: ferromagnetism is non-existent. this lack of ferromagnetism is also seen in previous works [19–21]. however, in doped graphene, for example [21], the correlations depart drastically from the lindhard function, and kondo physics dominates after just a few lattice spacings. here, we find that rkky physics plays the dominant role, with correlations persisting even for quite large coupling (jk = 2.0 ev), which is greater than the band gap.

figure 3: (a) lindhard function and (b) spin-spin correlations along the zig-zag direction with the fermi level at position a in fig. 2. (c) and (d) are the same quantities with the fermi level at position c.

figure 4 again shows the lindhard function and spin correlations, but for a different fermi energy and in an extended range. this is to highlight the fact that the rkky oscillations survive at large distances with little sign of decay. ferromagnetism is again absent, in contrast with the predictions of perturbation theory, exemplifying the need for exact numerical techniques that can capture the many-body physics. the impurities, however, do appear to form their independent kondo singlets at some particular distances. this results in the impurities being completely uncorrelated, $\langle s^z_1 s^z_2 \rangle = 0$, and typically occurs at positions where the lindhard function has a node, as observed in ref. [20]. to show that these effects are not due to the particular directions chosen, calculations were done along other paths with qualitatively similar results. as an example, we show in fig. 5 the correlations along the perpendicular direction with the fermi level at position b.

figure 4: lindhard function (top) and spin-spin correlations (bottom) along the zig-zag direction with the fermi level at position b in fig. 2. results are shown with an extended range, highlighting the persistent oscillations even for large jk.

as can be seen from the lattice structure, single layer phosphorene contains four sublattices, as labeled in fig. 1. assuming the first impurity is placed on an 'a' site, along the zig-zag direction the impurities follow an "a−a, a−b, a−a, a−b..." pattern. alternatively, the second impurity could be placed on the other "layer" along the zig-zag direction, resulting in an "a−c, a−d, a−c, a−d..." pattern. it is clear that a−d is equivalent to b−c. the armchair direction, as it has been defined, consists of impurities always on the same sublattice. in all cases (not shown), ferromagnetism is absent and we find dominant antiferromagnetic rkky interactions.
figure 5: lindhard function (top) and spin-spin correlations (bottom) along the armchair direction with the fermi level at position b in fig. 2.

iv. conclusions

by means of a canonical transformation and exact numerical calculations using the dmrg method, we have studied the competition between kondo and rkky physics on phosphorene. the method is numerically exact, and even though it is limited to finite systems, these can be very large, of the order of a hundred lattice sites and more. our results highlight the non-perturbative nature of the rkky interaction and the non-trivial absence of ferromagnetism. this remains an outstanding question and should stimulate more research in this direction. it is possible that by adding a repulsive interaction between conduction electrons, the system may acquire a net magnetic moment; behavior of this type has been observed in graphene doped with hydrogen defects [28–30]. according to lieb's theorem [31], this is expected to occur for bi-partite lattices when two impurities are on the same sublattice. even though phosphorene has four sublattices (and hence, the theorem does not rigorously apply), the system may still realize similar physics. unfortunately, our approach can only describe non-interacting/quadratic hamiltonians and we cannot prove this conjecture. on the other hand, the dominant anti-ferromagnetism at all dopings and distances (more robust than in graphene [21]) indicates a route toward realizing dilute 2d anti-ferromagnetism with phosphorene.

acknowledgements

the authors are grateful to the u.s. department of energy, office of basic energy sciences, for support under grant de-sc0014407.

[1] r b jacobs, phosphorus at high temperatures and pressures, j. chem. phys. 59, 945 (1937). [2] l pauling, m simonetta, bond orbitals and bond energy in elementary phosphorus, j. chem. phys. 20, 29 (1952). [3] r r hart, m b robin, n a kuebler, 3p orbitals, bent bonds, and the electronic spectrum of the p4 molecule, j. chem. phys. 42, 3631 (1965). [4] h liu et al., phosphorene: an unexplored 2d semiconductor with a high hole, acs nano 8, 4033 (2014). [5] a n rudenko, m i katsnelson, quasiparticle band structure and tight-binding model for single- and bilayer black phosphorus, phys. rev. b 89, 201408 (2014). [6] a s rodin, a carvalho, a h castro neto, strain-induced gap modification in black phosphorus, phys. rev. lett. 112, 176801 (2014). [7] l li et al., black phosphorus field-effect transistors, nat. nanotechnol. 9, 372 (2014). [8] r w keyes, the electrical properties of black phosphorus, phys. rev. 92, 580 (1953). [9] x ling, h wang, s huang, f xia, m s dresselhaus, the renaissance of black phosphorus, p. natl. acad. sci. usa 112, 4523 (2015). [10] v tran, r soklaski, y liang, l yang, layer-controlled band gap and anisotropic excitons in few-layer black phosphorus, phys. rev. b 89, 235319 (2014). [11] a carvalho et al., phosphorene: from theory to applications, nat. rev. mater. 1, 16061 (2016). [12] v v kulish, o i malyi, c persson, p wu, adsorption of metal adatoms on single-layer phosphorene, phys. chem. chem. phys. 17, 992 (2015). [13] p rastogi, s kumar, s bhowmick, a agarwal, y s chauhan, effective doping of monolayer phosphorene by surface adsorption of atoms for electronic and spintronic applications, iete j. res. 63, 205 (2017). [14] h-j duan et al., anisotropic rkky interaction and modulation with mechanical strain in phosphorene, new j. phys. 19, 103010 (2017).
[15] a c hewson, the kondo problem to heavy fermions, cambridge university press, new york (1983). [16] k yosida, magnetic properties of cu-mn alloys, phys. rev. 106, 893 (1957). [17] m a ruderman, c kittel, indirect exchange coupling of nuclear magnetic moments by conduction electrons, phys. rev. 96, 99 (1954). [18] t kasuya, a theory of metallic ferroand antiferromagnetism on zener’s model, prog. theor. phys. 16, 45 (1956). [19] a allerdt, r žitko, a e feiguin, nonperturbative effects and indirect exchange interaction between quantum impurities on metallic (111) surfaces, phys. rev. b 95, 235416 (2017). [20] a allerdt, c a büsser, g b martins, a e feiguin, kondo versus indirect exchange: role of lattice and actual range of rkky interactions in real materials, phys. rev. b 91, 085101 (2015). [21] a allerdt, a e feiguin, s d sarma, competition between kondo effect and rkky physics in graphene magnetism, phys. rev. b 95, 104402 (2017). [22] c a büsser, g b martins, a e feiguin, lanczos transformation for quantum impurity problems in d-dimensional lattices: application to graphene nanoribbons, phys. rev. b 88, 245113 (2013). [23] s r white, density matrix formulation for quantum renormalization groups, phys. rev. lett. 69, 2863 (1992). [24] s r white, density-matrix algorithms for quantum renormalization groups, phys. rev. b 48, 10345 (1993). [25] a e feiguin, the density matrix renormalization group method and its time-dependent variants, aip conf. proc. 1419, 5 (2011). [26] m ezawa, topological origin of quasi-flat edge band in phosphorene, new j. phys. 16, 115004 (2014). [27] b zhou, b zhou, x zhou, g zhou, even–odd effect on the edge states for zigzag phosphorene nanoribbons under a perpendicular electric field, j. phys. d: appl. phys. 50, 045106 (2017). 090008-5 papers in physics, vol. 9, art. 090008 (2017) / a. allerdt et al. [28] o v zazyev, l helm, defect-induced magnetism in graphene, phys. rev. b 75, 125408 (2007). [29] s casolo, o m løvvik, r martinazzo, g f tantardini, understanding adsorption of hydrogen atoms on graphene, j. chem. phys. 130, 054704 (2009). [30] h gonzález-herrero et al., atomic-scale control of graphene magnetism by using hydrogen atoms, science 352, 437 (2016). [31] e h lieb, two theorems on the hubbard model, phys. rev. lett. 62, 1201 (1989). 090008-6 papers in physics, vol. 10, art. 100005 (2018) received: 20 september 2017, accepted: 4 may 2018 edited by: a. mart́ı, m. monteiro reviewed by: j-l richter, lycée polyvalent j-b schwilgué, france licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.100005 www.papersinphysics.org issn 1852-4249 smartphones on the air track. examples and difficulties manuel á. gonzález,1∗ alfonso gómez,1 miguel á. gonzález1 in this paper we describe a classical experiment with an air track in which smartphones are used as experimental devices to obtain physical data. the proposed experiment allows users to easily observe and measure relationships between physical magnitudes, conservation of momentum in collisions and friction effects on movement by utilizing the users’ own mobile devices. i. introduction smartphones and tablets have sensors that can be used to redesign physics experiments by substituting laboratory equipment with those devices [1]. this technique can enrich data availability in an experiment and also reduces its costs. 
the work in the laboratory under controlled conditions can also help experimenters to learn the use and limitations of smartphones and to apply them in experiments outside the laboratory. in this paper, we show some results obtained in a classical experiment, pointing out basic concepts that can be explored and analyzed by using smartphones, as well as some difficulties of the work.

∗e-mail: manuelgd@termo.uva.es 1 universidad de valladolid, 47011 valladolid, spain.

ii. using the smartphone in an air-track experiment

a simple and usual mechanics experiment consists of studying linear movement on an air track without friction. on it, positions, speeds and accelerations can be measured by using measuring tapes, chronometers or more sophisticated photoelectric cells. here, we discuss how smartphones can be used to analyze the movement of a body in three different configurations of the experiment: elastic collisions between two carts, accelerated movement of a cart pulled by a falling body via a pulley, and movement of a cart when the air pump is switched off and it is stopped by friction. as experimental devices in our work we used two smartphones, a samsung galaxy s3 mini and a samsung galaxy s4. these smartphones were placed horizontally on the moving carts, with their y-axis pointed along the direction of the movement and their z-axis vertical. the carts can hold additional weights, so that we could study interactions between bodies with the same or different masses. to access the data recorded by the sensors of the smartphones, we used two android apps, sensormobile [2] and physics toolbox [3], so that the advantages or disadvantages, accuracy and numerical noise of different apps can also be analyzed.

i. elastic collisions between carts in a horizontal air track

this is a simple experiment that is done by using two carts, each with a smartphone measuring its acceleration, allowing us to check momentum conservation in a collision between two bodies. in our experiments, we used different configurations that can be explored: both carts moving on the air track in the same or in opposite directions, or one cart at rest while the other moves towards it. moreover, by adding different masses to the carts, we have analyzed collisions between bodies with the same or with different masses. figure 1 shows the results of one such experiment, corresponding to an elastic collision between two bodies (cart plus smartphone) of similar masses. in this case only one of the bodies was moving before the impact. once the acceleration of each body is measured by the smartphones, the csv files generated by the apps can be transferred to a computer and analyzed easily using a spreadsheet program. here the spreadsheet was used to obtain the area under the curve of each acceleration using a simple numerical method like the trapezoidal rule.

figure 1: accelerations of two bodies of equal mass during a collision (mobile #1, m1 = 318.60 g; mobile #2, m2 = 318.76 g). the annotated areas under the acceleration curves are area #1 = −276.44×10⁻³ m/s and area #2 = +282.54×10⁻³ m/s, so that m1·area1 = −88.07×10⁻³ kg m/s and m2·area2 = +89.87×10⁻³ kg m/s. a spreadsheet can be useful to calculate the area under the acceleration curves and check momentum conservation readily.
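the same check can be scripted instead of using a spreadsheet. the sketch below is our own illustration in python; the synthetic acceleration pulses stand in for the csv data exported by the apps and are not the measured curves, while the masses are those annotated in fig. 1:

```python
import numpy as np

def velocity_change(t_s, a):
    """trapezoidal rule for the area under an acceleration curve,
    i.e. the velocity change of the cart (t_s in s, a in m/s^2)."""
    return np.sum(0.5 * (a[1:] + a[:-1]) * np.diff(t_s))

if __name__ == "__main__":
    # in practice, t and a would be read from the csv files exported by the
    # apps, e.g. t, a = np.loadtxt("cart1.csv", delimiter=",", unpack=True).
    # the pulses below are synthetic stand-ins, not the measured curves.
    m1, m2 = 0.31860, 0.31876                 # cart + phone masses (fig. 1), kg
    t = np.linspace(0.0, 0.9, 901)            # 900 ms at 1 ms per sample
    a1 = -3.0 * np.exp(-0.5 * ((t - 0.45) / 0.03) ** 2)   # cart 1 decelerates
    a2 = -a1 * m1 / m2                                     # reaction on cart 2
    dp1 = m1 * velocity_change(t, a1)
    dp2 = m2 * velocity_change(t, a2)
    print(f"m1*area1 = {dp1:+.5f} kg m/s, m2*area2 = {dp2:+.5f} kg m/s")
    print(f"total momentum change = {dp1 + dp2:+.2e} kg m/s")
```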
as can be seen in fig. 1, when the masses are nearly equal, the areas calculated from the smartphone data are also similar (the difference is a little larger than 2%), considering the experimental noise. multiplying those areas by the masses of the bodies, the conservation of linear momentum is checked, as shown in fig. 1.

ii. frictionless movement of a cart pulled by a falling body through a pulley

another dynamics problem studied in the first courses of physics is the movement of a body, on a frictionless horizontal or inclined plane, connected to a cable that passes over a pulley and is then fastened to a falling object. here, we study such movement by placing a smartphone on the moving cart and measuring its acceleration. figure 2 shows the results of two measurements of this type. they correspond to the acceleration experienced by a body (cart plus smartphone) of mass mc = 318.76 g on a horizontal plane when the falling body has different masses in two different experiments, mb = 20.02 g and 29.99 g. according to the theory studied in the classroom, assuming that there is no friction, the acceleration of the cart is $a = \frac{m_b}{m_c + m_b}\,g$, which for our conditions gives accelerations a(mb = 20.02 g) = 0.579 m/s² and a(mb = 29.99 g) = 0.843 m/s². as can be seen, these results agree well with the average acceleration of the cart as measured by the smartphone on it. the users could also add a second smartphone on the falling body and check that the acceleration of the cart and that of the falling body are the same within the experimental accuracy, according to this simple model. from the accelerometer measurements, the dependence of the travelled space on acceleration and time in the uniformly accelerated movement can be checked. if the length of the cable is the same in two measurements with different falling masses, then the ratio of travelling times ∆t2/∆t1 must be equal to the square root of the ratio of the average accelerations measured by the smartphones, √(a2/a1). as can be seen from the measurements in the figure, ∆t2/∆t1 ≈ 1.199, while for the accelerations √(a2/a1) ≈ 1.207, so the theoretical dependence $s = \frac{1}{2} a t^2$ is easily verified.

figure 2: measurements of the acceleration of a cart on an air track when pulled by different masses in two experiments (m1 = 29.99 g, ∆t1 = 1402 ms; m2 = 20.02 g, ∆t2 = 1659 ms). the average values of acceleration in the two experiments, marked with lines in the figure, were a1 ≈ 0.84 m/s² and a2 ≈ 0.58 m/s².

iii. movement with friction of a cart pulled by a falling body through a pulley

in this section, we show some results that can be obtained if the air pump of the track is switched off and the cart moves with friction. for this experiment the lower part of the cart should be protected, for example with duct tape, to avoid damaging the air track with scratches, which would also vary the friction coefficient as the experiment is performed. the experiment is the same as in section ii., except for the existence of friction. the acceleration of the cart is lower than the one without friction, the cart decelerates, and it finally stops if the cable is long enough for the falling object to reach the floor before the cart gets to the track end. figure 3 shows an example of the results obtained with the smartphone accelerometer in such an experiment.
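these numbers are straightforward to reproduce. the sketch below (our own illustration in python, assuming g = 9.8 m/s²) evaluates the frictionless prediction a = mb·g/(mc + mb) for the two falling masses, together with the time-ratio check using the ∆t values annotated in fig. 2:

```python
G = 9.8  # m/s^2, assumed value of the gravitational acceleration

def cart_acceleration(m_cart, m_falling):
    """frictionless acceleration of a cart pulled over a pulley by a
    falling mass: a = m_b g / (m_c + m_b)."""
    return m_falling * G / (m_cart + m_falling)

if __name__ == "__main__":
    m_c = 0.31876                     # cart + smartphone mass, kg
    for m_b in (0.02002, 0.02999):    # falling masses used in fig. 2, kg
        a = cart_acceleration(m_c, m_b)
        print(f"m_b = {m_b * 1e3:5.2f} g -> a = {a:.3f} m/s^2")
    # check of s = a t^2 / 2 with the same cable length in both runs,
    # using the travel times annotated in fig. 2
    a1 = cart_acceleration(m_c, 0.02999)
    a2 = cart_acceleration(m_c, 0.02002)
    dt1, dt2 = 1.402, 1.659           # s
    print(f"sqrt(a1/a2) = {(a1 / a2) ** 0.5:.3f},  dt2/dt1 = {dt2 / dt1:.3f}")
```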
that figure shows acceleration data from three independent measurements under the same conditions, to illustrate the repeatability of the results. some comments are necessary about the results shown in the figure. as can be seen there, the acceleration is positive while the movement is accelerated, and changes its sign when the falling body reaches the floor and friction decelerates the movement of the cart. a very remarkable result that immediately calls our attention is the oscillations in the accelerated part of the movement. as can be seen in fig. 3 (and in fig. 4), these oscillations appear consistently in all the experiments and therefore cannot be considered an experimental artifact. in our opinion, these oscillations appear due to the combined effect of the pull of the falling body on the string and the opposing pull due to the friction on the cart attached to the end of the string. these oscillations do not appear when the air pump is working and there is no friction. from our point of view, these oscillations reflect variations in the acceleration of the cart due to changes in the string tension. we consider this a very interesting effect to observe: when this experiment is done using classical measurement devices, the usual approximation of inextensible strings and constant tension seems to hold true, but smartphone sensors are sensitive to small variations in them, as can be seen here. this observation can help the experimenters to consider the limitations of the physical model and the influence of the string characteristics. figure 3 shows in an inset the experimental points of the three measurements together with a damped oscillation of period t ≈ 0.105 s. for the theoretical behaviour, we have tested accelerations for pure coulomb and pure viscous damping, but the experimental results seem to be a mixture of both behaviours, which makes the analysis more difficult. the inset in fig. 3 shows a comparison of the experimental points with a theoretical curve for the acceleration; from this comparison, we estimate the initial elongation of the string to be around 1.2·10⁻³ m (the length of the string used in this experiment was approximately 1.75 m). in addition, a qualitative analysis of the oscillations was carried out by using strings with different elasticity. we observed that these oscillations were less noticeable for strings with lower elasticity, and that for strings of low enough elasticity the oscillations were masked by the experimental noise. on the other hand, for more rigid strings, such as metallic strings, the amplitude of the oscillations was smaller due to their lower deformation. in any case, both the mixing of effects in this result and its experimental difficulties could make an exact analysis harder than in the classical version of the experiment. other results can be obtained easily by the users. from the masses of the cart (mc = 318.8 g) and of the falling body (mb = 358.05 g), they can calculate the theoretical acceleration without friction, $a_{\mathrm{theo}}$ = 5.18 m/s², which is clearly higher than the experimental value in the accelerated part of the movement, whose average value is $a_{\mathrm{exp}}$ = 2.386 m/s² (see fig. 3). using this experimental value, the students can obtain the friction coefficient between the duct tape protecting the cart and the track from

$$ \mu = \frac{m_b g - (m_c + m_b)\,a_{\mathrm{exp}}}{m_c\, g}, $$

resulting in µ = 0.61. once µ is known, the friction deceleration $a_{\mathrm{friction}}$ = µg = 5.94 m/s² can be calculated.
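this arithmetic can be reproduced directly from the quoted masses and the measured average acceleration. a minimal sketch (our own illustration in python, assuming g = 9.8 m/s²):

```python
G = 9.8  # m/s^2, assumed value of the gravitational acceleration

def friction_coefficient(m_cart, m_falling, a_exp):
    """dynamic friction coefficient from the measured average acceleration
    of the cart: mu = (m_b g - (m_c + m_b) a_exp) / (m_c g)."""
    return (m_falling * G - (m_cart + m_falling) * a_exp) / (m_cart * G)

if __name__ == "__main__":
    m_c, m_b = 0.3188, 0.35805     # masses quoted in the text, kg
    a_exp = 2.386                  # measured average acceleration, m/s^2
    a_no_friction = m_b * G / (m_c + m_b)
    mu = friction_coefficient(m_c, m_b, a_exp)
    print(f"frictionless prediction: {a_no_friction:.2f} m/s^2")
    print(f"mu = {mu:.2f}, friction deceleration mu*g = {mu * G:.2f} m/s^2")
```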
from the experimental data, the impulse-momentum relationship can also be observed. as the cart started and ended its movement with zero speed, the areas under the acceleration and deceleration curves in fig. 4 must have the same value but opposite signs. by using a spreadsheet program and the numerical trapezoidal rule, those areas can be easily obtained.

figure 3: results of four independent measurements of the acceleration of a cart pulled by a falling body via a pulley when there is friction between the cart and the air track (in-figure annotations: $a_{nf} = 5.184$ m/s$^2$, experimental average $2.386$ m/s$^2$, $\mu = 0.61$, calculated $-\mu g = -5.94$ m/s$^2$, oscillation period $t \approx 0.105$ s). the values of the theoretical acceleration without friction, $a_t$, and of the experimental average acceleration, $a_e$, are marked with straight lines. one of the measurements is represented with lines and points to show the position of the experimental points.

finally, fig. 4 shows another result that can be discussed. it presents results for two different friction conditions: an aluminium cart on an aluminium track, and a duct-tape-covered cart on an aluminium track. due to the different friction coefficients, the cart accelerations are different and, consequently, so are the travel times. as for fig. 3, the users can calculate the dynamic friction coefficients and the corresponding decelerations, and compare them with the experimental results. a final result that can be discussed is the influence of the different friction coefficients on the stopping times.

figure 4: accelerations recorded by the smartphone in an experiment like that of fig. 3 but with two different friction coefficients: painted al cart on al track (above, calculated deceleration $-\mu g = -4.31$ m/s$^2$) and duct-tape-covered cart on al track (below, calculated deceleration $-\mu g = -6.64$ m/s$^2$). the cart and falling body masses were $m_c = 317.3$ g and $m_b = 359.1$ g, respectively. the experimental average accelerations of each case are marked with horizontal lines, as well as the decelerations once the falling body reaches the floor.

iii. conclusions

smartphone sensors are very useful tools that allow measurements to be performed both in the laboratory and outside it. the use of these devices can also increase interest in physics and facilitate the understanding of physical phenomena. we have shown some results of an air-track experiment using a smartphone. some of the concepts reinforced with this experiment are acceleration, collisions, momentum, friction, the friction coefficient, the impulse-momentum theorem and even elasticity. these experiments can also be useful for learning the importance of sensor accuracy and measuring frequency, as well as the reliability of the applications used.

acknowledgements

this work has been supported by the university of valladolid within its teaching innovation program, under grants pid2016 64 and pid2016 67.

[1] r vieyra, c vieyra, p jeanjacquot, a martí, m monteiro, turn your smartphone into a science laboratory, the science teacher 82, 32 (2015).

[2] sensormobile, https://play.google.com/store/apps/details?id=com.sensor.mobile, last accessed september 13, 2017.

[3] physicstoolbox, https://play.google.com/store/apps/details?id=com.chrystianvieyra.physicstoolboxsuite, last accessed september 13, 2017.
papers in physics, vol. 8, art. 080005 (2016) received: 1 july 2016, accepted: 30 august 2016 edited by: k. hallberg reviewed by: d. c. cabra, instituto de física la plata, la plata, argentina licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.080005 www.papersinphysics.org issn 1852-4249

topological quantum phase transition in strongly correlated kondo insulators in 1d

franco t. lisandrini,1,2 alejandro m. lobos,1 ariel o. dobry,1,2 claudio j. gazza1,2∗

∗e-mail: gazza@ifir-conicet.gov.ar 1 instituto de física rosario (conicet), bv 27 de febrero 210 bis, s2000ezp rosario, santa fe, argentina. 2 facultad de ciencias exactas, ingeniería y agrimensura, universidad nacional de rosario, argentina.

we investigate, by means of a field-theory analysis combined with the density-matrix renormalization group (dmrg) method, a theoretical model for a strongly correlated quantum system in one dimension realizing a topologically-ordered haldane phase ground state. the model consists of a spin-1/2 heisenberg chain coupled to a tight-binding chain via two competing kondo exchange couplings of different type: an “s-wave” kondo coupling ($J_K^s$) and a less common “p-wave” kondo coupling ($J_K^p$). while the first coupling is the standard kondo interaction studied in many condensed-matter systems, the latter has been recently introduced by alexandrov and coleman [phys. rev. b 90, 115147 (2014)] as a possible mechanism leading to a topological kondo-insulating ground state in one dimension. as a result of this competition, a topological quantum phase transition (tqpt) occurs in the system for a critical value of the ratio $J_K^s/J_K^p$, separating a (haldane-type) topological phase from a topologically trivial ground state where the system can be essentially described as a product of local singlets. we study and characterize the tqpt by means of the magnetization profile, the entanglement entropy and the full entanglement spectrum of the ground state. our results might be relevant to understand how topologically-ordered phases of fermions emerge in strongly interacting quantum systems.

i. introduction

the study of topological quantum phases of matter has become an area of great interest in present-day condensed matter physics. a topological phase is a quantum phase of matter which cannot be characterized by a local order parameter, and thus falls beyond the landau paradigm. in particular, topological insulators (i.e., materials which are insulating in the bulk but support topologically protected gapless states at the edges) were first proposed
theoretically for two- and three-dimensional systems with time-reversal symmetry [1–3], and soon after found in experiments on hgte quantum wells [4] and in bi$_{1-x}$sb$_x$ [5] and bi$_2$se$_3$ [6], generating a lot of excitement and subsequent research. the electronic structure of a topological insulator cannot be smoothly connected to that of a trivial insulator, a fact that is mathematically expressed in the existence of a nonzero topological invariant, an integer number quantifying the non-local topological order in the ground state. a complete classification based on the dimensionality and underlying symmetries has been achieved in the form of a “periodic table” of topological insulators [7–9]. nevertheless, this classification refers only to the gapped phases of noninteracting fermions, and leaves open the problem of characterizing and classifying strongly interacting topological insulators. this is a very important open question in modern condensed-matter physics. recently, dzero et al. [10–12] proposed a new kind of topological insulator: the topological kondo insulator (tki), which combines features of both non-interacting topological insulators and the well-known kondo insulators, a special class of heavy-fermion systems with an insulating gap strongly renormalized by interactions. within a mean-field picture, tkis can be understood as a strongly renormalized f-electron band lying close to the fermi level and hybridizing with the conduction-electron d-bands [13–15]. at half-filling, an insulating state appears due to the opening of a low-temperature hybridization gap at the fermi energy induced by interactions. due to the opposite parities of the states being hybridized, a topologically non-trivial ground state emerges, characterized by an insulating gap in the bulk and conducting dirac states at the surface [10]. at present, tki materials, among which samarium hexaboride (smb$_6$) is the best known example, are under intense investigation both theoretically and experimentally [16–19]. from a theoretical point of view, tkis are interesting systems arising from the interplay between strong interactions and topology. although the large-n mean-field approach was successful in describing qualitatively heavy-fermion systems, and tkis in particular, it would be desirable to understand better how tkis emerge. in order to shed more light on this question, in a recent work alexandrov and coleman proposed a one-dimensional model for a topological kondo insulator [20], the “p-wave” kondo-heisenberg model (pkhm). such a model consists of a chain of spin-1/2 magnetic impurities interacting with a half-filled one-dimensional electron gas through a kondo exchange (see fig. 1). using a standard mean-field description [13–15], which expresses the original interacting problem as an effectively non-interacting one, the authors mentioned above found a topologically non-trivial insulating ground state (i.e., topological class d [7–9]) which hosts magnetic states at the open ends of the chain. however, this system was studied recently using the abelian bosonization approach combined with a perturbative renormalization group analysis, revealing an unexpected connection to the haldane phase at low temperatures [21]. the haldane phase is a paradigmatic example of a strongly interacting topological system [22–26]. the results in ref.
[21] indicate that 1d tki systems might be much more complex and richer than expected from the naïve mean-field approach, which is incapable of describing the full complexity of the haldane phase, and suggest that these systems must be reconsidered from the more general perspective of interacting symmetry-protected topological (spt) phases [25–28]. more recently, two numerical studies using exact dmrg methods have confirmed that 1d tkis belong to the universality class of haldane insulators [29, 30]. these studies have extended the regime of validity of the results in ref. [21]. in this work, we theoretically investigate the robustness of the haldane phase in one-dimensional topological kondo insulators, and study the effect of local interactions that destabilize the topological phase and drive the system to a non-topological phase. our goal is to characterize the system at, and near, the topological quantum phase transition (tqpt) from the perspective of symmetry-protected topological phases, using the concepts of entanglement entropy and entanglement spectrum to detect the topologically-ordered ground states. this is a novel perspective in the context of tkis, which might shed new light on the emergence of topological order in strongly correlated phases of fermions, and makes our work interesting from the pedagogical and conceptual points of view.

ii. theoretical model

we describe the system depicted in fig. 1 with the hamiltonian $H = H_1 + H_2 + H_K^{(s)} + H_K^{(p)}$, where the conduction band is represented by an $L$-site tight-binding chain

$$H_1 = -t \sum_{j=1,\sigma}^{L-1} \left( c^\dagger_{j,\sigma} c_{j+1,\sigma} + \mathrm{h.c.} \right), \qquad (1)$$

with $c^\dagger_{j,\sigma}$ the creation operator of an electron with spin $\sigma$ at site $j$. the hamiltonian

$$H_2 = J_H \sum_{j=1}^{L-1} \mathbf{S}_j \cdot \mathbf{S}_{j+1} \quad (J_H > 0), \qquad (2)$$

corresponds to a spin-1/2 antiferromagnetic heisenberg chain.

figure 1: sketch of the kondo-heisenberg model under consideration. the lower leg represents a spin-1/2 heisenberg chain with $J_H > 0$. the upper leg represents a half-filled one-dimensional tight-binding chain interacting with the lower leg through two different kondo exchange couplings, an “s-wave” $J_K^s$ and a “p-wave” $J_K^p$. we also show the fermionic and spin operators defined on each eight-dimensional “supersite” $j$ (see text).

the terms $H_K^{(s)}$ and $H_K^{(p)}$ describe two different types of kondo exchange couplings between $H_1$ and $H_2$, namely

$$H_K^{(s)} = J_K^s \sum_{j=1}^{L} \mathbf{S}_j \cdot \mathbf{s}_j, \qquad (3)$$

$$H_K^{(p)} = J_K^p \sum_{j=1}^{L} \mathbf{S}_j \cdot \boldsymbol{\Pi}_j, \qquad (4)$$

with $J_K^a > 0$ ($a = s, p$). eq. (3) describes the usual antiferromagnetic kondo exchange coupling of a spin $\mathbf{S}_j$ in the heisenberg chain to the local spin density in the fermionic chain at site $j$, defined as

$$\mathbf{s}_j \equiv \sum_{\alpha,\beta} c^\dagger_{j,\alpha} \left(\frac{\boldsymbol{\sigma}_{\alpha\beta}}{2}\right) c_{j,\beta}, \qquad (5)$$

where $\boldsymbol{\sigma}_{\alpha\beta}$ is the vector of pauli matrices. on the other hand, eq. (4) describes a “p-wave” kondo interaction, which is unusual in that it couples the spin $\mathbf{S}_j$ to the p-wave spin density in the fermionic chain at site $j$, defined as [20]

$$\boldsymbol{\Pi}_j \equiv \sum_{\alpha,\beta} \left(\frac{c^\dagger_{j+1,\alpha} - c^\dagger_{j-1,\alpha}}{\sqrt{2}}\right) \left(\frac{\boldsymbol{\sigma}_{\alpha\beta}}{2}\right) \left(\frac{c_{j+1,\beta} - c_{j-1,\beta}}{\sqrt{2}}\right), \qquad (6)$$

where the notation $c_{0,\sigma} = c_{L+1,\sigma} = 0$ is implied.
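as a concrete, if tiny, illustration of the heisenberg building block in eq. (2), the python sketch below constructs $H_2$ for a short open chain by brute-force exact diagonalization and checks that the ground state is a global singlet; this is our own minimal example, not the dmrg code used by the authors.

```python
import numpy as np

# spin-1/2 operators
sz = np.diag([0.5, -0.5])
sp = np.array([[0.0, 1.0], [0.0, 0.0]])   # S^+
sm = sp.T                                  # S^-
iden = np.eye(2)

def site_op(op, j, L):
    """embed a single-site operator at site j of an L-site chain."""
    mats = [iden] * L
    mats[j] = op
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

def heisenberg_chain(L, J=1.0):
    """open-boundary H_2 = J sum_j S_j . S_{j+1}, cf. eq. (2)."""
    H = np.zeros((2 ** L, 2 ** L))
    for j in range(L - 1):
        H += J * (site_op(sz, j, L) @ site_op(sz, j + 1, L)
                  + 0.5 * (site_op(sp, j, L) @ site_op(sm, j + 1, L)
                           + site_op(sm, j, L) @ site_op(sp, j + 1, L)))
    return H

L = 8
evals, evecs = np.linalg.eigh(heisenberg_chain(L))
gs = evecs[:, 0]
sz_tot = sum(site_op(sz, j, L) for j in range(L))
print("ground-state energy per site:", evals[0] / L)
print("total S^z of the ground state:", gs @ sz_tot @ gs)   # ~0 for even L
```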
the case $J_K^s > 0$ and $J_K^p = 0$ corresponds to the so-called “kondo-heisenberg model” in 1d, which has been extensively studied in the past in connection with the stripe phase of high-$T_c$ superconductors [31–37]. these previous works indicate that, at half-filling, that model does not support any topological phases. on the other hand, the case $J_K^s = 0$ and $J_K^p > 0$ is the “p-wave” kondo-heisenberg model proposed recently in ref. [20], and subsequently studied in refs. [21, 29, 30], where it was established that the ground state corresponds to the haldane phase. in both cases, at half-filling the system develops a mott-insulating gap in the fermionic chain due to “umklapp” processes (i.e., backscattering processes with $2k_F$ momentum transfer) originating in the kondo interactions $J_K^s$ or $J_K^p$, and at low temperatures (lower than the mott gap) the system effectively maps onto a spin-1/2 ladder. however, the ground states of the model in the two cases cannot be smoothly connected (i.e., one is topologically trivial while the other is not), and therefore we anticipate that a gap-closing topological quantum phase transition (tqpt) must occur as a function of the ratio $J_K^s/J_K^p$. indeed, in ref. [21] it was proposed that while in the first case the (effective) spin-1/2 ladder forms singlets along the rungs, in the second case the kondo coupling $J_K^p$ favors the formation of triplets on the rungs, and therefore the system maps onto the haldane spin-1 chain [38–41]. from this perspective, our toy-model hamiltonian $H$ allows us to explore a transition from topological to non-topological kondo insulators, and to gain valuable insight into the tqpt.

iii. field-theory analysis

the purpose of this section is to provide a simple and phenomenological understanding of the competition between the two kondo interactions $J_K^s$ and $J_K^p$, which is not evident in eqs. (3) and (4). to that end, we introduce a field-theoretical representation of the model, valid at sufficiently low temperatures. we linearize the non-interacting spectrum $\epsilon_k = -2t\cos(ka)$ in the tight-binding chain $H_1$ around the fermi energy $\mu = 0$, and take the continuum limit $a \to 0$, where $a$ is the lattice constant. then, the low-energy representation of the fermionic operators becomes [42–46]

$$\frac{c_{j,\sigma}}{\sqrt{a}} \sim e^{i k_F x_j} R_{1,\sigma}(x_j) + e^{-i k_F x_j} L_{1,\sigma}(x_j), \qquad (7)$$

where $R_{1,\sigma}(x)$ and $L_{1,\sigma}(x)$ are right- and left-moving fermionic field operators, which vary slowly on the scale of $a$. using this representation, the spin densities become [47]

$$\frac{\mathbf{s}_j}{a} \to \left[\mathbf{J}_{1R}(x_j) + \mathbf{J}_{1L}(x_j) + (-1)^j\, \mathbf{n}_1(x_j)\right], \qquad (8)$$

$$\frac{\boldsymbol{\Pi}_j}{a} \to 2\left[\mathbf{J}_{1R}(x_j) + \mathbf{J}_{1L}(x_j) - (-1)^j\, \mathbf{n}_1(x_j)\right], \qquad (9)$$

where we have defined slowly-varying spin densities for the smooth spin configurations,

$$\mathbf{J}_{1R}(x_j) = \sum_{\alpha,\beta} R^\dagger_{1,\alpha}(x_j)\left(\frac{\boldsymbol{\sigma}_{\alpha\beta}}{2}\right) R_{1,\beta}(x_j), \qquad (10)$$

$$\mathbf{J}_{1L}(x_j) = \sum_{\alpha,\beta} L^\dagger_{1,\alpha}(x_j)\left(\frac{\boldsymbol{\sigma}_{\alpha\beta}}{2}\right) L_{1,\beta}(x_j), \qquad (11)$$

and for the staggered spin density,

$$\mathbf{n}_1(x_j) = \sum_{\alpha,\beta} R^\dagger_{1,\alpha}(x_j)\left(\frac{\boldsymbol{\sigma}_{\alpha\beta}}{2}\right) L_{1,\beta}(x_j) + \mathrm{h.c.} \qquad (12)$$

similarly, a continuum representation for the heisenberg chain can be achieved, e.g., by fermionization of the s=1/2 spins by means of a jordan-wigner transformation. at low energies, the spin densities become [38–41, 47]

$$\frac{\mathbf{S}_j}{a} \to \mathbf{J}_{2R}(x_j) + \mathbf{J}_{2L}(x_j) + (-1)^j\, \mathbf{n}_2(x_j). \qquad (13)$$

we now focus on the kondo interaction and leave the analysis of $H_1$ and $H_2$ aside, as these terms are unimportant for the qualitative understanding of the basic mechanism leading to the tqpt. keeping the most relevant (in the rg sense) terms, we can write the kondo interaction as

$$H_K^{(s)} + H_K^{(p)} \to \left(J_K^s - 2J_K^p\right)\int dx\; \mathbf{n}_1(x)\cdot\mathbf{n}_2(x) \;\; (+\ \text{less relevant contributions}), \qquad (14)$$

i.e., the kondo interaction couples the staggered magnetization components in chains 1 and 2.
in the above expression, note that while a large $J_K^s$ favors a positive value of the effective coupling $(J_K^s - 2J_K^p)$, therefore promoting the formation of local singlets along the rungs, a large p-wave kondo coupling $J_K^p$ favors an effective ferromagnetic coupling which promotes the formation of local triplets with $s = 1$ (hence the connection to the $s = 1$ haldane chain). the minus sign in front of $J_K^p$ appears as a result of the p-wave nature of the orbitals in eq. (6). from this qualitative analysis we can conclude that $J_K^s$ and $J_K^p$ will be competing interactions promoting different ground states, and since these ground states cannot be adiabatically connected with each other, a tqpt must occur. strictly speaking, near the critical region where the bare coupling $(J_K^s - 2J_K^p)$ vanishes, the less relevant terms neglected in eq. (14) should be taken into account. however, note that operators with conformal spin 1 (i.e., operators of the form $(\partial_x \mathbf{n}_1)\cdot\mathbf{n}_2$ or $\mathbf{n}_1\cdot(\partial_x \mathbf{n}_2)$, see for instance ref. [48]) are not allowed by the inversion symmetry of the hamiltonian and the p-wave symmetry of the orbitals in eq. (6), which demands that $c_{j+1,\alpha} - c_{j-1,\alpha} \to -\left(c_{-(j+1),\alpha} - c_{-(j-1),\alpha}\right)$ under the change $j \to -j$, and therefore forbids the occurrence of terms proportional to $\partial_x \mathbf{n}_1$. therefore, only the marginal operators $\mathbf{J}_{1\nu}\cdot\mathbf{J}_{2\nu'}$ (with $\nu = \{R,L\}$) and terms with conformal spin bigger than 1 are expected in the hamiltonian. we do not expect these operators to change the physics qualitatively near the critical point, and we can ignore them for this simplified analysis. as we show below, our numerical dmrg results are in accordance with this qualitative picture.

iv. dmrg analysis

before presenting the numerical results, it is worth providing technical details on the implementation of the dmrg method applied to the present model. this particular hamiltonian contains two types of terms: (a) terms involving two local operators, as in most condensed-matter models with nearest-neighbor interactions, i.e., eqs. (1)-(3), and (b) terms involving three local operators, which result from the expansion of eq. (4). to make the implementation easier, we have found it useful to first define a “supersite” representation of the system, where each supersite combines a spin $\mathbf{S}_j$ and the fermionic site along each rung (see fig. 1), therefore spanning a new 8-dimensional local basis. the first kind of terms can be easily handled with standard dmrg implementations where the system is represented as $L(j)\otimes\bullet\otimes\bullet\otimes R(N-j-2)$, with $L(j)$ and $R(j)$ the left and right blocks with $j$ supersites, respectively, and the two circles the exactly represented middle supersites $j+1$ and $j+2$. this comes at a price, however, since one is then forced to re-express the electron creation operator $c^\dagger_{j,\sigma}$ and the spin-1/2 operator $\mathbf{S}_j$ in this new basis in order to implement eqs. (1) and (2). in this basis, note that the hamiltonian $H_K^{(s)}$, eq. (3), becomes an “on-site” term, which can be handled easily. in the second type of contributions, the presence of three-operator terms in $H_K^{(p)}$, eq. (4), must be properly treated in order to avoid extra truncation errors due to the tensor product of two (already truncated) operators inside the left and right blocks.
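to make the 8-dimensional supersite basis concrete, the following numpy sketch (our own illustration, not the authors' code) builds local spin and fermion operators on a single supersite as kronecker products of a spin-1/2 and a 4-dimensional fermionic site; jordan-wigner strings between different supersites are deliberately omitted.

```python
import numpy as np

# spin-1/2 part (2-dimensional)
sz = np.diag([0.5, -0.5])
id2 = np.eye(2)

# fermionic site (4-dimensional), basis ordered as |0>, |up>, |dn>, |up,dn>
cdag_up = np.zeros((4, 4)); cdag_up[1, 0] = 1.0; cdag_up[3, 2] = 1.0
cdag_dn = np.zeros((4, 4)); cdag_dn[2, 0] = 1.0; cdag_dn[3, 1] = -1.0  # sign from the ordering
id4 = np.eye(4)

# operators acting on the 8-dimensional supersite = spin (x) fermion
Sz_super = np.kron(sz, id4)             # spin S^z_j of the heisenberg leg
Cdag_up_super = np.kron(id2, cdag_up)   # electron creation on the same rung
Cdag_dn_super = np.kron(id2, cdag_dn)

# sanity checks of the on-site fermionic algebra
anti = lambda a, b: a @ b + b @ a
assert np.allclose(anti(cdag_up.T, cdag_up), id4)   # {c_up, c_up^dag} = 1
assert np.allclose(anti(cdag_up.T, cdag_dn), 0.0)   # {c_up, c_dn^dag} = 0
print("supersite dimension:", Sz_super.shape[0])    # 8
```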
then, during each left-to-right dmrg sweep iteration, we save in the previous superblock configuration $L(j-1)\otimes\bullet\otimes\bullet\otimes R(N-j-3)$ the exact correlation matrices between spin and fermionic operators that involve positions $j-1$ (i.e., the rightmost supersite of the left block) and $j$ (the first single site), which will become the two rightmost supersites of the new $L(j)$ in the next sweep iteration step. these correlation matrices are

$$\left[A^{z\dagger}_{j-1,j,\sigma}\right]_{i;i'} = \rho_{i;i_1 i_2}\left[S^z_{j-1}\right]_{i_1;i_1'}\left[c^\dagger_{j,\sigma}\right]_{i_2;i_2'}\rho^\dagger_{i_1' i_2';i'}$$

and

$$\left[A^{+\dagger}_{j-1,j,\sigma}\right]_{i;i'} = \rho_{i;i_1 i_2}\left[S^+_{j-1}\right]_{i_1;i_1'}\left[c^\dagger_{j,\sigma}\right]_{i_2;i_2'}\rho^\dagger_{i_1' i_2';i'},$$

where $\rho_{i;i_1 i_2}$ is the $m\times 8m$ reduced density matrix, with $m$ the number of states kept, and where the indices $i$ and $i_1$ run over the truncated $m$-dimensional hilbert space while $i_2$ runs over the 8-dimensional supersite space. in addition, we have assumed summation over repeated internal indices. similarly, correlations between spin and fermionic operators placed in the two leftmost supersites $j+3$ and $j+4$ of $R(N-j-2)$ should also be kept during the corresponding step of the right-to-left sweep. therefore, we now deal with a standard “two-operator” interaction; for example, the $L(j)\otimes\bullet$ term of $H_K^{(p)}$ is

$$-\frac{1}{4} J_K^p \sum_\sigma \left\{\left(\sigma A^{z\dagger}_{j-1,j,\sigma} + A^{+\dagger}_{j-1,j,\sigma}\right) c^{\,l}_{j+1,\sigma} + \mathrm{h.c.}\right\}.$$

finally, we mention that in our implementation we have kept up to a maximum of 800 states and we have swept 12 times, ensuring truncation errors in the density matrix of order $10^{-9}$ at worst. we now turn to the results. we have used the tight-binding parameter $t$ as our unit of energies, and in all of our calculations we have used the values $J_H/t = 1$ and $J_K^p/t = 2$. we have studied the evolution of the ground state upon increasing $J_K^s$ starting from the value $J_K^s = 0$, where the system is in the haldane phase. the nature of the topologically-ordered ground state and the precise detection of the tqpt are determined using, respectively: a) the analysis of the spin profile in the ground state, b) the value of the entanglement entropy, and c) the analysis of the degeneracies in the full entanglement spectrum. as shown recently in seminal works [25, 26], the last two properties are useful bona fide indicators of symmetry-protected topological orders.

(a) spin profile. one characteristic feature of an $s = 1$ haldane chain with open boundary conditions is the presence of topologically protected fractionalized spin-1/2 end states, a consequence of the broken $\mathbb{Z}_2\times\mathbb{Z}_2$ symmetry of the ground state [24]. these states can be represented as $|\!\uparrow_L\rangle\otimes|\!\uparrow_R\rangle$, $|\!\uparrow_L\rangle\otimes|\!\downarrow_R\rangle$, $|\!\downarrow_L\rangle\otimes|\!\uparrow_R\rangle$, $|\!\downarrow_L\rangle\otimes|\!\downarrow_R\rangle$, and correspond to the fourfold-degenerate ground state in the thermodynamic limit $L \to \infty$. in order to detect these fractionalized spin excitations in our system, we have defined the spin profile as $\langle T^z_j\rangle = \langle\psi_g^{M^z=1}|T^z_j|\psi_g^{M^z=1}\rangle$, where $T^z_j$ is the $z$-projection of the total spin in the $j$-th rung, $\mathbf{T}_j = \mathbf{S}_j + \mathbf{s}_j$, and $|\psi_g^{M^z=1}\rangle$ is the ground state of the system with total spin $M^z = 1$ (where the visualization of the spin states at the ends is easier). in fig. 2 we show $\langle T^z_j\rangle$ vs $j$ for different values of $J_K^s/J_K^p$ and for $L = 80$. the presence of spin states localized at the edges can be clearly seen for $J_K^s/J_K^p = 0$ and $J_K^s/J_K^p = 0.85$, where the spin density is accumulated at the ends of the system.
figure 2: spatial profile of $\langle T^z_j\rangle = \langle\psi_g^{M^z=1}|T^z_j|\psi_g^{M^z=1}\rangle$, i.e., the $z$ component of the total spin in the supersite $j$, computed with the ground state of the subspace with total $M^z = 1$ for $L = 80$, and for parameters $J_H/t = 1$ and $J_K^p/t = 2$ (curves for $J_K^s/J_K^p = 0$, $0.85$ and $1.60$). the destruction of the topologically protected spin-1/2 states at the ends of the chain can be clearly seen as $J_K^s/J_K^p$ is increased from $0$ to $1.6$.

note that since we are working in the subspace $M^z = 1$, this ground state corresponds to the state $|\!\uparrow_L\rangle\otimes|\!\uparrow_R\rangle$. for $J_K^s/J_K^p = 1.6$, the magnetic edge states have already disappeared, indicating that the onset of the topologically trivial phase must occur at lower values of $J_K^s/J_K^p$. however, while this analysis is useful to understand the nature of the topologically-ordered ground state, it does not allow a precise determination of the tqpt. to that end, we have studied the entanglement entropy and entanglement spectrum (see below).

(b) entanglement entropy. we have also calculated the entanglement entropy (i.e., the von neumann entropy of the reduced density matrix), defined as [49]

$$S(L/2) = -\mathrm{tr}\,\hat{\rho}_{L/2}\ln\hat{\rho}_{L/2} = -\sum_j \lambda_j \ln\lambda_j, \qquad (15)$$

where $\hat{\rho}_{L/2}$ is the reduced density matrix obtained after tracing out half of the chain, and $\lambda_j$ are the corresponding eigenvalues of $\hat{\rho}_{L/2}$, which are the squares of the schmidt values. recently, it has been clarified that the entanglement of a single quantum state is a crucial property not only from the perspective of quantum information, but also for condensed matter physics. in particular, the entanglement entropy has been shown to contain the quantum dimension, a property of topologically-ordered phases [50, 51]. hirano and hatsugai [52] have computed the entanglement entropy of the open-boundary spin-1 haldane chain and obtained the lower-bound value $S(L/2) = \ln(4) = 2\ln(2)$ which, according to the edge-state picture in the thermodynamic limit $L \to \infty$, corresponds to the aforementioned four spin-1/2 edge states. in fig. 3 we show the entanglement entropy of the system as a function of $J_K^s/J_K^p$, for different system sizes and in the subspace $M^z = 0$, where we expect to find the ground state (the ground state of an even-numbered antiferromagnetic chain is a global singlet [53]). near the critical region, the entanglement entropy grows due to the contribution of the bulk, and exactly at the tqpt the entanglement entropy is predicted to show a logarithmic divergence $S(L/2) \sim \ln(L)$, characteristic of critical one-dimensional systems [49]. as the size of the system is increased, the logarithmic divergence becomes narrower and its position shifts to larger values. using the fitting function $J^s_{K,c}(L) = J^s_{K,c}(\infty) + a/L^2$, we have obtained the extrapolated critical point $J^s_{K,c}(\infty) \approx 1.11\, J_K^p$ in the thermodynamic limit, see inset (a) in fig. 3. note that this value is smaller than the value $J^s_{K,c} = 2\, J_K^p$ predicted by the field-theory analysis of the previous section. we believe this to be the effect of the neglected marginal or irrelevant operators, which renormalize non-universal quantities such as the critical point. we have also confirmed the logarithmic scaling of the entanglement entropy at the critical point, i.e., $S_{max}(L/2) = \alpha\ln(L) + \mathrm{constant}$, and we have obtained a prefactor $\alpha = 0.052$, see inset (b) in fig. 3.
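equation (15) is simple to evaluate for any pure state once a bipartition is chosen. the toy python sketch below (our own illustration, unrelated to the dmrg data of fig. 3) obtains the schmidt values of a random state of a small spin chain from an svd, and computes from them both the von neumann entropy of eq. (15) and the entanglement spectrum used later in fig. 4.

```python
import numpy as np

rng = np.random.default_rng(0)

L = 10                         # number of spin-1/2 sites (toy example)
dim = 2 ** L
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)     # normalized pure state

# bipartition into the left half (L/2 sites) and the right half
m = 2 ** (L // 2)
schmidt = np.linalg.svd(psi.reshape(m, dim // m), compute_uv=False)

lam = schmidt ** 2             # eigenvalues of the reduced density matrix
lam = lam[lam > 1e-14]         # discard numerical zeros
entropy = -np.sum(lam * np.log(lam))   # eq. (15)
spectrum = -np.log(lam)                # entanglement spectrum, -ln(lambda_i)

print("S(L/2) =", entropy)
print("lowest entanglement levels:", np.sort(spectrum)[:4])
```

in the dmrg calculation the $\lambda_j$ come directly from the block reduced density matrix, but the formulas applied to them are the same; the degeneracy pattern of the levels $-\ln\lambda_i$ is what is analyzed in fig. 4.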
a detailed analysis of this value and its connection to the corresponding central charge of the conformal field theory is beyond the scope of the present work and is left to a subsequent publication.

figure 3: entanglement entropy of the reduced density matrix, $S(L/2)$, as a function of $J_K^s/J_K^p$ for different lattice sizes ($L = 40$, $60$, $80$ and $100$; the value $2\ln(2)$ is also indicated). the entropy is computed within the subspace $M^z = 0$ and for parameters $J_H/t = 1$ and $J_K^p/t = 2$. the maximum of $S(L/2)$ indicates the position of the critical point. inset (a): finite-size scaling of the critical point, i.e., $J^s_{K,c}(L) = J^s_{K,c}(\infty) + a/L^2$, from which the value in the thermodynamic limit $J^s_{K,c}(\infty) \approx 1.11\, J_K^p$ is obtained. inset (b): maximum entropy $S_{max}(L/2)$ vs $L$. the results reproduce the predicted logarithmic divergence $S_{max}(L/2) = \alpha\ln(L) + \mathrm{constant}$, with the fitting constant $\alpha = 0.052$.

in fig. 3, note that for $J_K^s/J_K^p < J^s_{K,c}/J_K^p$, the value of the entanglement entropy roughly corresponds to $S(L/2) \sim 2\ln(2)$, consistent with the theoretical predictions in the haldane phase. values of $S(L/2)$ which are below the predicted lower bound are presumably due to finite-size effects, which result in a decrease of the effective quantum dimension in small systems [52]. for $J_K^s/J_K^p > J^s_{K,c}/J_K^p$, $S(L/2)$ tends to zero, as expected for a topologically trivial ground state. this result can be easily understood in the limit $J_K^s/J_K^p \to \infty$, where we expect the ground state to factorize as a product of local singlets (we recall that we are working in the subspace $M^z = 0$), for which $S(L/2) = 0$.

(c) degeneracy of the entanglement spectrum. finally, we focus on the full entanglement spectrum of the reduced density matrix. degeneracies in the entanglement spectrum are intimately related to the existence of discrete symmetries which protect the topological order. in particular, as shown in ref. [25], an even degeneracy constitutes the most distinctive feature of the haldane phase. this fact allows us to make an interesting connection between the theory of symmetry-protected topological phases and the theory of topological kondo insulators in one dimension, and might have important implications for the understanding of strongly interacting topological phases of fermions.

figure 4: the largest eigenvalues of the entanglement spectrum (extrapolated to the thermodynamic limit $L \to \infty$), as a function of $J_K^s$; the vertical axis shows $-\ln(\lambda_i)$ and the horizontal axis $J_K^s/J_K^p$. the results were obtained in the subspace $M^z = 0$. the gray zone corresponds to the haldane phase, where the degeneracy in the spectrum of eigenvalues is even.

in fig. 4 we show the evolution of the largest eigenvalues of the density matrix, extrapolated to the thermodynamic limit, as a function of the parameter $J_K^s/J_K^p$, obtained in the subspace $M^z = 0$. note that for $J_K^s/J_K^p < J^s_{K,c}/J_K^p \sim 1.11$ (i.e., the gray-shaded area), the degeneracy of the eigenvalues is even, as expected for the haldane phase. in contrast, for values $J_K^s/J_K^p > J^s_{K,c}/J_K^p$, the even degeneracy breaks down, indicating the onset of the trivial phase. in particular, note the evolution of the largest eigenvalue of the density matrix, $\lambda_0 \approx 1$ (i.e., the lowest “+” symbol), which becomes non-degenerate. v.
conclusions using a combination of techniques, i.e., a fieldtheoretical analysis and the density-matrix renormalization group (dmrg), an essentially exact method in one dimension, we have studied the tran080005-7 papers in physics, vol. 8, art. 080005 (2016) / f. t. lisandrini et al. sition from a topological to a non-topological phase in a model for a one-dimensional strongly interacting topological kondo insulator. as a prototypical topological quantum phase transition with no landau-type local order-parameter, one must resort to global quantities characterizing the ground state. in this work, we have shown that the entanglement entropy and the entanglement spectrum can be used to characterize a topological kondo insulator in one dimension. this system was originally understood and classified according to noninteracting topological invariants (i.e., chern numbers), employing approximate large-n mean-field methods [20]. here, by the means of the dmrg, we have shown that a more appropriate way to understand this system is by using the concepts developed for symmetry-protected topological phases [25–28]. in particular, for parameters jh/t = 1 and j p k/t = 2, we have obtained the value of the critical point jsk,c/j p k ' 1.11 in the thermodynamic limit l → ∞. this value is smaller than the expected from the qualitative field-theoretical estimation jsk,c/j p k = 2, a fact that is presumably originated in the effect of marginal or irrelevant operators, which were neglected in the qualitative analysis and which renormalize a non-universal quantity such as the critical point. acknowledgements f.t.l, a.o.d. and c.j.g. acknowledge support from conicet-pip 11220120100389co. a.m.l acknowledges support from pict-2015-0217 of anpcyt. [1] b a bernevig, t l hughes, s c zhang, quantum spin hall effect and topological phase transition in hgte quantum wells, science 314, 1757 (2006). [2] l fu, c l kane, topological insulators with inversion symmetry, phys. rev. b 76, 045302 (2007). [3] h zhang, c x liu, x l dai, z fang, s c zhang, topological insulators in bi2se3, bi2te3 and sb2te3 with a single dirac cone on the surface, nat. phys. 5, 438 (2009). [4] m könig, s wiedmann, c brüne, a roth, h buhmann, l w molenkamp, x l qi, s c zhang, quantum spin hall insulator state in hgte quantum wells, science 318, 766 (2007). [5] d hsieh, d qian, l wray, y xia, y s hor, r j cava, m z hasan, a topological dirac insulator in a quantum spin hall phase, nature 452, 970 (2008). [6] y xia, d qian, l hsieh, d wray, a pal, h lin, a bansil, d grauer, y s hor, r j cava, m z hasan, observation of a large-gap topologicalinsulator class with a single dirac cone on the surface, nat. phys. 5, 398 (2009). [7] a altland, m r zirnbauer, nonstandard symmetry classes in mesoscopic normalsuperconducting hybrid structures, phys. rev. b 55, 1142 (1997). [8] a kitaev, periodic table for topological insulators and superconductors, aip conf. proc. 1134, 22 (2009). [9] s ryu, a p schnyder, a furusaki, a w w ludwig, topological insulators and superconductors: tenfold way and dimensional hierarchy, new j. phys. 12, 065010 (2010). [10] m dzero, k sun, v galitski, p coleman, topological kondo insulators, phys. rev. lett. 104, 106408 (2010). [11] m dzero, k sun, p coleman, v galitski, theory of topological kondo insulators, phys. rev. b 85, 045130 (2012). [12] m dzero, j xia, v galitski, p coleman, topological kondo insulators, annu. rev. condens. matter phys. 7, 249 (2016). 
[13] n read, d m newns, on the solution of the coqblin-schreiffer hamiltonian by the largen expansion technique, j. phys. c 16, 3273 (1983). [14] p coleman, mixed valence as an almost broken symmetry, phys. rev. b 35, 5072 (1987). [15] d m newns, n read, mean-field theory of intermediate valence/heavy fermion systems, adv. phys. 36, 799 (1987). 080005-8 papers in physics, vol. 8, art. 080005 (2016) / f. t. lisandrini et al. [16] s wolgast, c kurdak, k sun, j w allen, d j kim, z fisk, low-temperature surface conduction in the kondo insulator smb6, phys. rev. b 88, 180405 (2013). [17] x zhang, n p butch, p syers, s ziemak, r l greene, j paglione. hybridization, inter-ion correlation, and surface states in the kondo insulator smb6, phys. rev. x 3, 011011 (2013). [18] n xu et al., direct observation of the spin texture in smb6 as evidence of the topological kondo insulator, nat. commun. 5, 4566 (2014). [19] d j kim, j xia, z fisk, topological surface state in the kondo insulator samarium hexaboride, nat. mater. 13, 466 (2014). [20] v alexandrov, p coleman, end states in a one-dimensional topological kondo insulator in large-n limit, phys. rev. b 90, 115147 (2014). [21] a m lobos, a o dobry, v galitski, magnetic end states in a strongly interacting one-dimensional topological kondo insulator, phys. rev. x 5, 021017 (2015). [22] i affleck, t kennedy, e h lieb, h tasaki, rigorous results on valence-bond ground states in antiferromagnets, phys. rev. lett. 59, 799 (1987). [23] i affleck, t kennedy, e h lieb, h tasaki, valence bond ground states in isotropic quantum antiferromagnets, commun. math. phys. 115, 477 (1988). [24] t kennedy, h tasaki, hidden z2 ×z2 symmetry breaking in haldane gap antiferromagnets, phys. rev. b 45, 304 (1992). [25] f pollmann, a m turner, e berg, m oshikawa, entanglement spectrum of a topological phase in one dimension, phys. rev. b 81, 064439 (2010). [26] f pollmann, e berg, a m turner, m oshikawa, symmetry protection of topological phases in one-dimensional quantum spin systems, phys. rev. b 85, 075125 (2012). [27] z c gu, x g wen, tensor-entanglementfiltering renormalization approach and symmetry-protected topological order, phys. rev. b 80, 155131 (2009). [28] x chen, z c gu, x g wen, classification of gapped symmetric phases in one-dimensional spin systems, phys. rev. b 83, 035107 (2011). [29] a mezio, a m lobos, a o dobry, c j gazza, haldane phase in one-dimensional topological kondo insulators, phys. rev. b 92, 205128 (2015). [30] i hagymási, ö legeza, characterization of a correlated topological kondo insulator in one dimension, phys. rev. b 93, 165104 (2016). [31] o zachar, s a kivelson, v j emery, exact results for a 1d kondo lattice from bosonization, phys. rev. lett. 77, 1342 (1996). [32] a e sikkema, i affleck, s r white, spin gap in a doped kondo chain, phys. rev. lett. 79, 929 (1997). [33] o zachar, a m tsvelik, one dimensional electron gas interacting with a heisenberg spin-1/2 chain. phys. rev. b 64, 033103 (2001). [34] o zachar, staggered liquid phases of the onedimensional kondo-heisenberg lattice model, phys. rev. b 63, 205104 (2001). [35] e berg, e fradkin, s a kivelson, pair-densitywave correlations in the kondo-heisenberg model, phys. rev. lett. 105, 146403 (2010). [36] a dobry, a jaefari, e fradkin, inhomogeneous superconducting phases in the frustrated kondo-heisenberg chain, phys. rev. b 87, 245102 (2013). [37] g y cho, r soto-garrido, e fradkin, topological pair-density-wave superconducting states, phys. rev. lett., 113, 256405 (2014). 
[38] d g shelton, a a nersesyan, a m tsvelik, antiferromagnetic spin ladders: crossover between spin s=1/2 and s=1 chains, phys. rev. b 53, 8521 (1996). 080005-9 papers in physics, vol. 8, art. 080005 (2016) / f. t. lisandrini et al. [39] a o gogolin, a a nersesyan, a m tsvelik, l yu, zero-modes and thermodynamics of disordered spin-1/2 ladders, nucl. phys. b 540, 705 (1999). [40] p lecheminant, e orignac, magnetization and dimerization profiles of the cut two-leg spin ladder and spin-1 chain, phys. rev. b 65, 174406 (2002). [41] n j robinson, f h l essler, e jeckelmann, a m tsvelik, finite wave vector pairing in doped two-leg ladders, phys. rev. b, 85, 195103 (2012). [42] v j emery, theory of the one-dimensional electron gas, in: highly conducting onedimensional solids, eds. j t devreese, r p evrard, v e van doren, pag. 247, plenum, new york (1979). [43] j sólyom, the fermi gas model of one– dimensional conductors, adv. in phys. 28, 201 (1979). [44] j von delft, h schoeller, bosonization for beginners refermionization for experts, ann. phys. (n. y.) 7, 225 (1998). [45] t giamarchi, quantum physics in one dimension, oxford university press, oxford (2004). [46] a o gogolin, a a nersesyan, a m tsvelik, bosonization and strongly correlated systems, cambridge university press, cambridge (1999). [47] i affleck, in e. brezin and j. zinn-justin, editors, fields, strings and critical phenomena, pag. 563, elsevier science, amsterdam (1988). [48] a a nersesyan, a o gogolin, fabian h l essler, incommensurate spin correlations in heisenberg spin-1/2 zig-zag ladders, phys. rev. lett. 81, 910 (1998). [49] u schollwöck, the density-matrix renormalization group, rev. mod. phys. 77, 259 (2005). [50] a kitaev, j preskill, topological entanglement entropy, phys. rev. lett. 96, 110404 (2006). [51] m levin, x g wen, detecting topological order in a ground state wave function, phys. rev. lett. 96, 110405 (2006). [52] t hirano, y hatsugai, entanglement entropy of one-dimensional gapped spin chains, j. phys. soc. japan 76, 074603 (2007). [53] a auerbach, interacting electrons and quantum magnetism, springer, berlin (1998). 080005-10 papers in physics, vol. 11, art. 110006 (2019) received: 11 december 2018, accepted: 26 may 2019 edited by: a. goñi, a. cantarero, j. s. reparaz licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.110006 www.papersinphysics.org issn 1852-4249 development of two-stage multi-anvil apparatus for low-temperature measurements k. ishigaki,1∗j. gouchi,1 s. nagasaki,1 j. g. cheng,2 y. uwatoko 1† the two-stage 6-8 multi-anvil (ma8) apparatus is an important large-volume, highpressure technique that has been widely used in the high pressure mineralogy and material synthesis, mainly at room temperature or above. recently, we have successfully developed a two-stage ma8 apparatus for low-temperature physical property measurements. the first-stage anvils at top and bottom sides are fabricated as a single piece in order to reduce the total size of the cylindrical module, which is put in a top-loading high pressure cryostat and compressed by a 1000 ton hydraulic press. a castable, split octahedral gasket with integrated fin was specifically designed in order to introduce the electrical leads from the inside sample container filled with a liquid pressure transmitting medium. 
by using tungsten carbide (wc) second-stage cubes with a truncated edge length of 3 mm and an octahedral gasket with an edge length of 6 mm, we have successfully generated pressure over 20 gpa at room temperature. since the high pressure limit can be pushed to nearly 100 gpa by using the sintered diamond second-stage cubes, our ma8 apparatus has a great potential to expand the current pressure capacity for precise low-temperature measurements with a large sample volume. i. introduction pressure is a fundamental parameter like temperature that governs the states of matter. the application of high pressure can induce structural or electronic phase transitions or precisely tune the structural and physical properties. in condensed matter physics, the combination of high-pressure and low-temperature environments provides a very fertile ground for exploring novel quantum states of matter and exotic phenomena. for example, pres∗e-mail: ishigaki@issp.u-tokyo.ac.jp †e-mail: uwatoko@issp.u-tokyo.ac.jp 1 institute for solid state physics, university of tokyo, 5-1-5 kashiwanoha, chiba 277-8581, japan. 2 institute of physics, chinese academy of sciences, beijing 100190, people’s republic of china. sure can induce a magnetic quantum critical point, near which the landau fermi-liquid behavior usually breaks down and unconventional superconductivity frequently takes place due to the presence of strong quantum fluctuations. therefore, it is important to develop a high pressure apparatus for low-temperature measurements. despite the sophisticated low-temperature technologies existent, the high-pressure devices used in low-temperature conditions remain to be further developed due to the space constrain and other specific requirements, such as pressure homogeneity, sample volume, etc. currently, piston-cylinder cell (pcc) [1,2] and diamond anvil cell (dac) [3,4] are two widely used commercial high-pressure devices for in-situ physical property measurements at low temperatures. pcc offers a large sample space and relatively good hydrostaticity by employing a liquid 110006-1 papers in physics, vol. 11, art. 110006 (2019) / k. ishigaki et al. pressure transmitting medium (ptm) [5], but the maximum pressure is usually limited to 4 gpa [1], which is insufficient for many studies in condensed matter. although the dac [3] can achieve ultrahigh pressures and allow easy access for the electromagnetic radiations, the tiny sample space makes it difficult for in-situ physical property measurements requiring electrical contacts, and the solid ptm usually employed renders severe non-hydrostatic pressure conditions. besides the ptm, the level of pressure hydrostaticity/homogeneity also depends on the compression geometry. in comparison with dac, multianvil-type (ma) apparatus can maintain better pressure homogeneity even if the ptm becomes solidified at low temperature and/or high pressure [6]. in addition, the ma apparatus can reach pressure above 10 gpa, much higher than pcc. the single-stage cubic anvil cell (cac) device developed in the institute for solid state physics, the university of tokyo (issp, ut) [7] is one typical ma apparatus that can generate hydrostatic pressures up to 15 gpa. the design of miniature “palm”-type cac also enabled integration with 3he or dilution refrigerator so as to reach temperatures as low as 10 mk [8, 9]. these developments of cubic-type apparatus were essential for us to discover novel quantum phenomena [10] and new superconducting materials [11] recent years. 
to pursue more exotic phenomena in an extended pressure range, there is always a demand for the development of devices reaching even higher pressures. in this regard, the two-stage 6-8 multi-anvil (ma8) apparatus originally developed in the 1970s by kawai and endo becomes an excellent option [12]. in this case, the first stage of six anvils surrounds a cubic cavity, in which the second stage, consisting of eight cubes with truncated corners forming an octahedron, is placed. after 40 years of development, the ma8 apparatus has gained great success and has been widely used in high-pressure mineralogy and the synthesis of materials. depending on the strength of the second-stage anvils, the maximum attainable pressure of the ma8 apparatus varies; to date, such devices have been used mainly for high-pressure studies at or above room temperature. in this paper, we report the development of a two-stage ma8 apparatus for precise low-temperature physical property measurements at issp, ut.

figure 1: schematic illustration of the first-stage and second-stage anvils.

ii. experimental setup and results

i. two-stage ma8 device

for the commonly used two-stage ma8 apparatus, the first-stage six anvils (three on the top and three on the bottom) made of hardened steel are usually built into a thick-wall steel ring (kawai type) or contained in a removable cylindrical module (walker type) [13]. such designs are not suitable for low-temperature applications because the whole ma8 device has to be inserted into a cryostat. to reduce the total size of the ma8 device, we designed the first-stage three anvils on the top and bottom sides as a whole piece, as shown in fig. 1. we have also used a nonmagnetic nicral alloy to fabricate the pair of cylindrical first-stage anvils in order to apply magnetic fields. the first-stage anvils have an outer diameter of 154 mm and form a cubic cavity with an edge length of 32.3 mm. the second-stage anvils, consisting of eight cubes with truncated corners, are similar to those of the commonly used ma8 apparatus. here, we employed nonmagnetic wc (tms05/mf10 grade from fujilloy) with an edge length of 18 mm and a truncated corner of 3 mm. as is common practice, these wc cubes are held together with six pieces of fiber-reinforced plastic (frp) pads, which are 0.5 mm in thickness and 36 × 36 mm in area. these frp pads also serve as insulation from the first-stage anvils. the inside surfaces of the second-stage cubes are pasted with three 1.0 mm cubic teflon spacers to prevent electrical contact with adjacent anvils.

figure 2: cross-sectional view of the internal configuration of the gasket with teflon cell.

ii. gasket design and sample assembly

the adoption of a liquid ptm is essential to maintain relatively good pressure homogeneity. however, the conventional design of the octahedral gasket and sample assembly used for the ma8 apparatus also needs to be modified in order to accommodate a sample container filled with liquid ptm. for this purpose, we adopt the castable, split octahedral gasket with integrated fin, which is made from ceramacast 584-p and ceramacast 584-l (100:28 weight-in-weight) potting compound from aremco products, inc. the half-octahedral gaskets with integrated fins are made in-house in our laboratory according to the procedures described in ref. [14]. the edge length of the octahedron is 6 mm and the thickness of the gasket fin is 1 mm.

figure 3: cross-sectional view of the top-loading cryostat.
figure 2 depicts the internal configuration of the gasket with the sample hanging inside the teflon capsule (i.d. 1.5 mm, o.d. 2.0 mm and length 2.5 mm), which is the same setup used in the cubic anvil cell [7]. the teflon cell can be filled with a liquid ptm such as daphne 7373 or glycerol, and the electrical leads are introduced via gold foil to the surfaces of the octahedral gasket, which in turn contact the wc cubes.

iii. top-loading high-pressure cryostat

figure 3 shows a schematic cross-sectional view of the top-loading high-pressure cryostat, in which the ma8 device is placed between the upper and lower pushing columns. details about the design of the high-pressure cryostat can be found in an earlier publication about the cubic anvil cell apparatus [7]. the low-temperature condition (down to 2 k) is realized by filling the cryostat with liquid nitrogen and then helium, with proper pumping. precise temperature control between 2 and 300 k was achieved by attaching a resistance heater onto the ma8 device. the pressure is generated by using a 1000-ton hydraulic press, which can maintain a constant loading force on the ma8 device over the whole temperature range. in addition, a 3.5 tesla helium-free superconducting magnet with a large bore size is also installed, and the center of the magnetic field is aligned with the sample in the ma8 device.

figure 5: pressure calibration line for the two-stage multi-anvil high pressure cell.

iv. pressure calibration

we have performed fixed-point pressure calibration at room temperature by detecting the characteristic phase transitions of bi, sn, pb, zns and gaas in the electrical resistance. a standard four-probe method was used to measure the resistance of each sample. table 1 summarizes the transition pressures of these materials from previous studies [15].

table 1: phase transitions used as pressure calibrants [15].
  sample   pressure (gpa)
  bi       2.55, 2.7, 7.7
  sn       9.4
  pb       13.4
  zns      15.6
  gaas     18.3

figure 4: electrical resistance of bi, sn, pb, zns and gaas as a function of loading force.

figure 4 shows the electrical resistance of bi, sn, pb, zns and gaas as a function of loading force at room temperature. as can be seen, the characteristic phase transitions of bi at 2.55, 2.7 and 7.7 gpa were clearly observed at loading forces of 12.2, 13.7 and 36.2 tons, respectively (we took the offset of each resistance anomaly as the transition point). similarly, the resistance anomalies of sn and pb at 9.4 and 13.4 gpa were observed at 51.7 and 67.9 tons, respectively. in addition, the metallization transitions of zns and gaas at 15.6 and 18.3 gpa were successfully observed at loading forces of 71.1 and 83.9 tons, respectively. although the employed daphne 7373 ptm becomes solid at about 2.3 gpa, these characteristic phase transitions remain very sharp, signaling an excellent pressure homogeneity up to at least 20 gpa due to the multi-anvil geometry.
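the calibration just described amounts to a linear fit of transition pressure versus loading force. the short python sketch below reproduces it from the (force, pressure) pairs quoted in the text; fitting a line through the origin is our assumption, chosen to match the form of the calibration line reported for fig. 5.

```python
import numpy as np

# (loading force in tons, transition pressure in GPa) from the calibration runs
force = np.array([12.2, 13.7, 36.2, 51.7, 67.9, 71.1, 83.9])
pressure = np.array([2.55, 2.7, 7.7, 9.4, 13.4, 15.6, 18.3])

# least-squares slope of a line through the origin, p = k * force
k = np.sum(force * pressure) / np.sum(force ** 2)
print(f"slope k = {k:.3f} GPa/ton")                      # ~0.209 GPa/ton
print(f"extrapolation to 120 tons: {k * 120:.1f} GPa")   # ~25 GPa
```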
based on these measurements, we have plotted in fig. 5 the pressure calibration curve for our two-stage ma8 apparatus installed with wc cubes having a truncated corner of 3 mm. as can be seen, all the calibration points fall nicely on a linear curve described by $p\,(\mathrm{gpa}) = 0.209 \times \mathrm{force}\,(\mathrm{ton})$. from the extrapolation, we can reach about 25 gpa at a loading force of 120 tons, which is a much lower force than those reported in the literature employing an mgo octahedron plus extra pyrophyllite gaskets. in the latter case, a large portion of the loading force is dissipated on the relatively soft pyrophyllite gasket, so that the calibration curve usually tends to saturate at higher loading forces. in contrast, the much improved pressure efficiency of our ma8 apparatus should be attributed to the octahedral gasket with integrated fin, which is much harder than pyrophyllite. as mentioned above, the maximum pressure to which the ma8 can be pushed is over 40 gpa by using tapered second-stage wc anvils [16], or nearly 100 gpa by employing much harder sintered diamond cubes [17]. it can thus be foreseen that the pressure capacity of our ma8 apparatus can be further improved.

iii. conclusions

we have successfully developed a two-stage 6-8 multi-anvil apparatus for accurate high-pressure and low-temperature measurements. by using tungsten carbide second-stage cubes with truncated corners of 3 mm and a castable octahedral gasket with an edge length of 6 mm, we can generate pressures over 20 gpa at a relatively low loading force of 100 tons. an excellent pressure homogeneity/hydrostaticity up to 20 gpa has been demonstrated in our ma8 apparatus, which is expected to reach even higher pressures by employing wc anvils with smaller truncation sizes or sintered diamond anvils.

acknowledgements

this work was supported by a grant-in-aid for challenging exploratory research (no. 16k13830). jgc acknowledges the support from most, nsfc, and cas through projects (grant nos. 2018yfa0305702, 1154377, 11874400, and qyzdb-ssw-slh013).

[1] y uwatoko, s todo, k ueda, a uchida, m kosaka, n mori, t matsumoto, material properties of ni–cr–al alloy and design of a 4 gpa class non-magnetic high-pressure cell, j. phys.: cond. matter 14, 11291 (2002).

[2] n fujiwara, t matsumoto, k k nakazawa, a hisada, y uwatoko, fabrication and efficiency evaluation of a hybrid nicral pressure cell up to 4 gpa, rev. sci. instr. 78, 073905 (2007).

[3] m kano, h mori, y uwatoko, s tozer, anisotropy of the upper critical field in the ultrahigh-pressure-induced superconductor (tmttf)$_2$pf$_6$, physica b: cond. matter 404, 3246 (2009).

[4] k shimizu, h ishikawa, d takao, t yagi, k amaya, superconductivity in compressed lithium at 20 k, nature 419, 597 (2002).

[5] k murata, s aoki, development of high pressure medium achieving high quality hydrostatic pressure, rev. high pressure sci. technol. 26, 3 (2016).

[6] y nakamura, a takimoto, m matsui, rheology and nonhydrostatic pressure evaluation of solidified oils including daphne oils by observing microsphere deformation, j. phys.: conf. series 215, 012176 (2010).

[7] n mori, h takahashi, n takeshita, low-temperature and high-pressure apparatus developed at issp, university of tokyo, high pressure res. 24, 225 (2004).

[8] y uwatoko, k matsubayashi, t matsumoto, n aso, m nishi, t fujiwara, m hedo, s tabata, k takagi, m tado, h kagi, development of palm cubic anvil apparatus for low temperature physics, rev. high pressure sci. technol. 18, 230 (2008).

[9] k matsubayashi, a hisada, t kawae, y uwatoko, recent progress in multi-extreme condition by miniature high-pressure, rev. high pressure sci. technol. 22, 206 (2012).

[10] k matsubayashi, t tanaka, a sakai, s nakatsuji, y kubo, y uwatoko, pressure-induced heavy fermion superconductivity in the nonmagnetic quadrupolar system prti$_2$al$_{20}$, phys. rev. lett. 109, 187004 (2012).
[11] j g chen, k matsubayashi, w wu, j p sun, m nishi, f k lin, j l luo, y uwatoko, pressure induced superconductivity on the border of magnetic order in mnp, phys. rev. lett. 114, 117001 (2015). [12] n kawai, s endo, the generation of ultrahigh hydrostatic pressures by a split sphere apparatus, rev. sci. instr. 41, 1178 (1970). [13] d walker, m a carpenter, c m hitch, some simplifications to multianvil devices for high pressure experiments, am. mineralog. 75, 1020 (1990). [14] j g cheng, high-pressure synthesis of the 4d and 5d transition-metal oxides with the perovskite and the perovskite-related structure and their physical properties, phd. thesis, university of texas, texas, austin (2010). [15] e ito, pressure calibration for multi-anvil apparatuses in high-pressure eatrth science, rev. high pressure sci. technol. 13, 265 (2003). [16] t ishii, l shi, r huang, n tsujino, d druzhbin, r myhill, y li, l wang, t yamamoto, n miyajima, t kawazoe, n nishiyama, y higo, y tange, t katsura, generation of pressures over 40 gpa using kawai-type multianvil press with tungsten carbide anvils, rev. sci. instr. 87, 024501 (2016). [17] e ito, d yamazaki, t yoshino, h fukui, t katsura, y tange, k funakoshi, pressure generation and investigation of the post-perovskite transformation in mggeo3 by squeezing the kawai-cell equipped with sintered diamond anvils, eath planetary sci. lett. 293, 84 (2010). 110006-6 papers in physics, vol. 11, art. 110001 (2019) received: 29 october 2018, accepted: 18 january 2019 edited by: a. goñi, a. cantarero, j. s. reparaz licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.110001 www.papersinphysics.org issn 1852-4249 fluorine chemistry at extreme conditions: possible synthesis of hgf4 michael pravica,1∗ sarah schyck,1 blake harris,1 petrika cifligu,1 eunja kim,1 brant billinghurst2 by irradiating a pressurized mixture of a fluorine-bearing compound (xef2) and hgf2 with synchrotron hard x-rays (> 7 kev) inside a diamond anvil cell, we have observed dramatic changes in the far-infrared spectrum within the 30–35 gpa pressure range which suggest that we may have formed hgf4 in the following way: xef2 hv−→ xe + f2 (photochemically) and hgf2 + f2 → hgf4 (30 gpa < p < 35 gpa). this lends credence to recent theoretical calculations by botana et al. that suggest that hg may behave as a transition metal at high pressure in an environment with an excess of molecular fluorine. the spectral changes were observed to be reversible during pressure cycling above and below the above mentioned pressure range until a certain point when we suspect that molecular fluorine diffused out of the sample at lower pressure. upon pressure release, hgf2 and trace xef2 were observed to be remaining in the sample chamber suggesting that much of the xe and f2 diffused and leaked out from the sample chamber. i. introduction mercury and cesium have been predicted to behave as a transition metal [1, 2] and p-block element respectively [2,3] at high pressure (within the 1 mbar range) in the presence of fluorine and thus have higher oxidation states enabling sharing/transfer of electrons from the inner shells (i.e. below the valence levels) of the elements as fluorine atoms are brought closer to the metals via high pressure. 
fluorine is the most electronegative element, and there are a number of challenges associated with loading this highly reactive and toxic molecule into a diamond anvil cell, which is likely the primary reason why there was, to the best of our knowledge, only one published study of fluorine at high pressure (> 1 gpa) [4].

∗e-mail: pravica@physics.unlv.edu
1 high pressure science and engineering center (hipsec) and department of physics, university of nevada las vegas (unlv), 89154-4002 las vegas, nevada, usa.
2 far-ir beamline, canadian light source, 44 innovation blvd, saskatoon, sk s7n 2v3, canada.

in an effort to develop fluorine chemistry at extreme conditions, we have utilized hard x-ray induced photochemistry [5] to release molecular fluorine in situ inside a sealed and pressurized diamond anvil cell by irradiating a relatively inert and easy-to-handle, powdered or liquid (and thus easy to load) fluorine-bearing compound such as perfluorohexane (c6f14) [6], potassium tetrafluoroborate (kbf4) [7] or xef2. the fluorine-bearing compound is then irradiated with x-rays of sufficient energy to penetrate the confining diamonds (or surrounding gasket) [7], typically in the hard x-ray range (> 7 kev). as long as we are at a pressure above the solidification pressure of fluorine (2 gpa), the atomic or molecular fluorine released by irradiation remains confined in the sample hole and is thus available for chemical reaction.

in the present study, we sought to verify the predictions of transition-metal behavior of hg by mixing a fluorine-bearing compound (xef2) with hgf2. fluorine would be produced via x-ray irradiation of xef2 through the following photochemical reaction:

$\mathrm{XeF_2} \xrightarrow{\,h\nu\,} \mathrm{Xe} + \mathrm{F_2}.$  (1)

the molecular fluorine would then be available to react with hgf2 in the following way:

$\mathrm{HgF_2} + \mathrm{F_2} \rightarrow \mathrm{HgF_4},$  (2)

under high pressure. the goal of this effort, then, was to ascertain whether any molecular changes occurred after irradiation and then after further pressurization. as our samples are typically very fluorescent after irradiation, we chose infrared spectroscopy as the means to interrogate bonding changes within our sample. as the confined sample was ∼ 3 nanoliters, we used a bright synchrotron hard x-ray source and a synchrotron infrared source to produce fluorine in situ and to spectroscopically investigate our post-irradiated sample, respectively.

ii. experimental

due to the high reactivity of both hgf2 and xef2 with air and water, loading of the sample was performed inside an ar-backfilled glovebox located at the high pressure collaborative access team’s sample preparation facility at the advanced photon source of argonne national laboratory. a rhenium gasket was preindented to 20 µm thickness (from 250 µm initial thickness) using a symmetric-style diamond anvil cell (dac) with diamonds that each had a culet diameter of ∼ 300 µm and were ir-transmitting type i quality. a sample hole of diameter ∼ 80 µm was laser drilled in the gasket [8]. powdered xenon difluoride (sigma aldrich > 99%) was pulverized with hgf2 (sigma aldrich > 99%) in a 50/50 mixture by volume and was loaded via spatula into the gasket hole. one thermally-relieved ruby (for pressure measurement) was introduced into the sample, which was pressurized to 10 gpa. no pressure-transmitting medium was used in our experiments and all were performed at room temperature.
raman spectroscopy was performed on the sample to verify that xef2 was present in the loaded and pressurized sample. the loaded sample was then irradiated with “white” x-rays produced at the 16 bm-b beamline at the advanced photon source (aps). the beam was ∼ 30 microns in diameter. the hgf2 and xef2 mixture was irradiated for more than five hours at pressures above 10 gpa to avoid any material losses triggered by the x-ray induced decomposition of xef2. xrd patterns of the sample were taken at the 16 id-b using monochromatic x-rays that were collected by a mar345r image plate detector. we also note that no irradiation-induced changes in pure hgf2 were observed at any pressure in separate experiments. thus, only xef2 is photochemically-affected by x-rays. the irradiated sample was then transported to the 02b1-1 far-infrared (far-ir) beamline of the canadian light source (cls) where ir spectroscopy measurements at various pressures were carried out in situ inside the dac. pressure was measured using a homemade ruby-fluorimeter constructed by our group located on site at the cls. the ir collection system consisted of a plexiglass enclosure housing the dac and collection optics which was in front of the fourier transform-ir system and was continuously purged from water vapor (measured by a humidity sensor) using positive pressure nitrogen gas blowoff from a liquid nitrogen dewar. a horizontal microscope system collected far-ir spectra. the ir beam was redirected from the sample compartment of a bruker ifs 125 hrr spectrometer to within the working distance of a schwarzchild objective which focused ir light onto the sample. a similar light focusing objective placed behind the sample was used to collect the transmitted light, directing it onto an offaxis parabolic mirror which refocused the ir light into an infrared laboratoriesr si bolometer. the spectrometer was equipped with a 6-micron mylar beamsplitter. the data was collected using a scanner velocity of 40 khz, 12.5-mm entrance aperture, with a 1 cm−1 resolution. the si bolometer was set for a gain of 16×. interferograms were transformed using a zero filling factor of 8 and a 3-term blackman harris apodization function. ft-ir spectral scans typically required 15 minutes to acquire and all measurements were performed at room temperature. 110001-2 papers in physics, vol. 11, art. 110001 (2019) / m. pravica et al. figure 1: transmission far-ir spectra of hgf2 and xef2 mixture pressurized up to 30 gpa and held at 30 gpa for 6 hours then pressurized to 40 gpa. as the pressure is increased beyond 19.5 gpa, a broad multiplet of spectral lines appear near 474 cm−1 and one smaller mode appears near 234 cm−1. the patterns disappear in the 40 gpa spectra. iii. results after initial loading at the aps, the sample possessed a greenish yellow tint demonstrating the presence of hgf2. after further pressurization at the cls, the sample significantly darkened. we present our ir spectral data in the 35–650 cm−1 range in fig. 1. we first compressed the sample from 10 gpa up to 40 gpa recording spectral patterns along the way. as is evident from the figure, a peak near 235 cm−1 and a multiplet of peaks centered near 474 cm−1 appear around 30 gpa. we allowed the sample to remain at 30 gpa for 6 hours and then took another ir spectrum to examine stability of the new peaks with time. 
the 235 cm−1 peak vanished or was severely diminished within the signal-to-noise of our system, and the multiplet centered around 474 cm−1 largely disappeared or was severely diminished, with the exception of the peak itself. upon further pressurization to 40 gpa, the highest pressure we subjected the sample to, the patterns completely disappeared. the sample pressure was then reduced to 35 gpa (fig. 2, curve a) to ascertain whether the observed peaks returned, which they did, as evidenced in the 32 gpa pattern in fig. 2, curve b. pressure was again increased to 35 gpa and the pattern again disappeared (fig. 2, curve c).

figure 2: transmission far-ir spectra of the irradiated xef2 and hgf2 mixture as the sample was decompressed from 35 gpa (trace a) to 32 gpa (trace b) and then recompressed to 35 gpa (trace c), demonstrating the reversibility of the peak structure at 32 gpa. this pressure-cycled sequence occurred after the first viewing of the feature around 30 gpa present in fig. 1.

pressure was reduced to just above ambient (∼ 1 gpa) and the sample returned to its original white/yellow appearance from before irradiation. figure 3 displays photos of the sample at various stages. raman spectroscopy was performed upon returning the sample to the pravica raman facility at unlv, indicating that only hgf2 and a residual amount of xef2 remained in the sample chamber. x-ray diffraction (xrd) patterns taken of the sample before irradiation and after irradiation, compression and decompression to ambient conditions (see fig. 4) further verify the claim that the xe and f2 (produced via irradiation of xef2) leaked out from the gasket once the pressure was reduced to near ambient conditions. there is no indication that the rhenium gasket suffered any significant chemical reaction from the f2 (see fig. 4). we have observed this behavior of little or no diffusion of f2 in our samples in prior experiments that produced f2 from kbf4, leading to little or no gasket damage [10] and no discernible reaction with the diamonds [4].

figure 3: progression sequence of the sample. the first photo on the left represents the mixed xef2 + hgf2 sample near 10 gpa after sample loading. a yellowish hue is evident due to the presence of hgf2. the second (middle) photo illustrates the darkening of the sample after irradiation and pressurization to 25 gpa; the sample persisted in this visual state up to 40 gpa, the highest pressure in this study. the final photo on the right demonstrates that the sample has returned to its original appearance after reducing pressure to ambient conditions.

figure 4: xrd patterns of the hgf2/xef2 mixture (a) before x-ray irradiation and (b) after irradiation, pressurization, and decompression, indicating that the xef2 leaked out from the gasket in the form of xe and f2 (produced from the initial x-ray irradiation), leaving only hgf2 in the sample chamber in the fm-3m crystalline structure. the vertical olive green bars in (a) represent the tetragonal crystal structure of xef2 with the i4/mmm space group [11].

iv. discussion

hgf4 in the gaseous state has a predicted ir mode (a2u) near 233 cm−1 [9], which agrees well with the mode we observed near 235 cm−1, though we recognize that our mode was observed in the solid state (not the gaseous state) and at very high pressure. we suspect that the feature near 474 cm−1 is an overtone of the mode near 235 cm−1. botana et al.
have calculated stability of hgf4 in the 38–73 gpa pressure range; that hgf3 and hgf4 are both stable from 73–200 gpa; and that from 200–500 gpa, only hgf3 is the stable compound with hg in the +3 oxidation state [1]. seeking to confirm this prediction, we pressurized the dac into the 30 gpa and higher pressure range. as is apparent from our data, a new compound with mercury appears to form near 30 gpa and then disappears around 35 gpa. the compound forms reversibly with pressure cycling. upon further reduction of pressure to ambient conditions, the sample turned white (as it was originally before being irradiated). raman spectroscopy confirmed only the presence of hgf2 indicating that the xe and f2 leaked out from the gasket. the process (irradiation, pressurization and return to ambient) is visually described in fig. 3. we note in passing that we performed a purely high pressure mid-ir study of just the xef2 + hgf2 mixture (see fig. 5) and found no evidence of any significant spectral changes (with the exception of a phase transition near 5 gpa from hgf2) demonstrating that x-ray irradiation in combination with high pressure is necessary to produce the interesting features observed in fig. 2. v. conclusions we have performed a synchrotron far-ir experiment on an irradiated mixture of xef2 and hgf2 pressurized in a dac. the irradiation was performed to release molecular fluorine inside the sample chamber at high pressure in situ thereby obviating the need to load toxic and reactive molecular fluorine inside the diamond cell. upon further pressurization just above 30 gpa, we observed the dramatic appearance of a peak or peaks centered near 234 cm−1 and likely an overtone near 474 cm−1 in a narrow pressure range somewhere between 30–35 gpa which appears to be reversible and which appears to correlate with the calculated a2u mode of hgf4. our observation differs somewhat from the predictions of botana et al. of a 38–73 gpa pressure range of stability [1] but given the challenges associated with connecting theory and experiment at high pressure and given the complex chemistry occurring during and after hard x-ray irradiation and at high pressures, our results are nevertheless encouraging. 110001-4 papers in physics, vol. 11, art. 110001 (2019) / m. pravica et al. figure 5: transmission mid-ir spectra of hgf2 and xef2 non-irradiated mixture pressurized up to 36 gpa. upon release of pressure to ambient, the fluorine and xe produced by the irradiation of xef2 likely leaked out and hgf2 remained inside along with residual xef2. though far-ir experiments do not by themselves prove the formation of hgf4, we are nevertheless encouraged by our results. further experiments are planned to confirm and further verify our results. we anticipate that this seminal experiment will further encourage development of fluorine chemistry at extreme conditions. acknowledgements we thank tim may and zhenxian liu for help in the far-ir and midir measurements, respectively. we gratefully acknowledge support from the department of energy national nuclear security administration (doennsa) under award number de-na0002912. we also acknowledge support from the doe cooperative agreement no. de-fc08-01nv14049 with the university of nevada, las vegas. portions of this work were performed at hpcat (sector 16), advanced photon source (aps), argonne national laboratory. hpcat operations are supported by doe-nnsa under award no. de-na0001974 and doe-bes under award no. 
de-fg0299er45775, with partial instrumentation funding by nsf. aps is supported by doe-bes, under contract no. de-ac02-06ch11357. a portion of the research described in this paper was performed at the far-ir beamline of the canadian light source, which is supported by the natural sciences and engineering research council of canada, the national research council canada, the canadian institutes of health research, the province of saskatchewan, western economic diversification canada, and the university of saskatchewan. [1] j botana, x wang, c hou, d yan, h lin, y ma, ms miao, mercury under pressure acts as a transition metal: calculated from first principles, angew. chem. int. ed. engl. 54, 9280 (2015). [2] ms miao, j botana, m pravica, d sneed, c park, inner-shell chemistry under high pressure, jap. j. appl. phys. 56, 05fa10 (2017). [3] ms miao, caesium in high oxidation states and as a p-block element, nature chem. 5, 846 (2013). [4] d schiferl, s kinkead, r hanson, d pinnick, raman spectra and phase diagram of fluorine at pressures up to 6 gpa and temperatures between 10 and 320 k, j. chem. phys. 87, 3016 (1987). [5] m pravica, l bai, c park, y liu, m galley, j robinson, n bhattacharya, note: a novel method for in situ loading of gases via x-ray induced chemistry, rev. sci. instrum. 82, 106102 (2011). [6] m pravica, d sneed, m white, y wang, note: loading method of molecular fluorine using xray induced chemistry, rev. sci. instrum. 85, 086110 (2014). [7] m pravica, m white, y wang, y xiao, p chow, hard x-ray induced synthesis of of2, chim. oggi 36, 50 (2018). [8] r hrubiak, s sinogeikin, e rod, g shen, the laser micro-machining system for diamond anvil cell experiments and general precision machining applications at the high pressure collaborative access team, rev. sci. instrum. 86, 072202 (2015). [9] m kaupp, hg von schnering, gaseous mercury (iv) fluoride, hgf4: an ab initio study, angew. chem. int. ed. engl. 32, 861 (1993). 110001-5 papers in physics, vol. 11, art. 110001 (2019) / m. pravica et al. [10] m pravica, m white, y wang, a novel method for generating molecular mixtures at extreme conditions: the case of fluorine and oxygen, aip conf. proc. 1793, 060030 (2017). [11] g wu, x huang, y huang, l pan, f li, x li, m liu, b liu, t cui, confirmation of the structural phase transitions in xef2 under high pressure, j. phys. chem. c. 121, 6264 (2017). 110001-6 papers in physics, vol. 2, art. 020008 (2010) received: 20 october 2010, accepted: 1 december 2010 edited by: a. vindigni reviewed by: a. a. fedorenko, cnrs-lab. de physique, ens de lyon, france. licence: creative commons attribution 3.0 doi: 10.4279/pip.020008 www.papersinphysics.org issn 1852-4249 anisotropic finite-size scaling of an elastic string at the depinning threshold in a random-periodic medium s. bustingorry,1∗a. b. kolton1† we numerically study the geometry of a driven elastic string at its sample-dependent depinning threshold in random-periodic media. we find that the anisotropic finite-size scaling of the average square width w2 and of its associated probability distribution are both controlled by the ratio k = m/lζdep , where ζdep is the random-manifold depinning roughness exponent, l is the longitudinal size of the string and m the transverse periodicity of the random medium. the rescaled average square width w2/l2ζdep displays a non-trivial single minimum for a finite value of k. we show that the initial decrease for small k reflects the crossover at k ∼ 1 from the random-periodic to the random-manifold roughness. 
the increase for very large k implies that the increasingly rare critical configurations, accompanying the crossover to gumbel critical-force statistics, display anomalous roughness properties: a transverse-periodicity scaling in spite that w2 � m , and subleading corrections to the standard random-manifold longitudinal-size scaling. our results are relevant to understanding the dimensional crossover from interface to particle depinning. i. introduction the study of the static and dynamic properties of d-dimensional elastic interfaces in d+1-dimensional random media is of interest in a wide range of physical systems. some concrete experimental examples are magnetic [1–4] or ferroelectric [5,6] domain walls, contact lines of liquids [7], fluid invasion in porous media [8, 9], and fractures [10, 11]. in all these systems, the basic physics is controlled by the competition between quenched disorder (induced by the presence of impurities in the host materials) which promotes the wandering of the elastic object, against the elastic forces which tend to make the elastic object flat. one of the most dramatic and ∗e-mail: sbusting@cab.cnea.gov.ar †e-mail: koltona@cab.cnea.gov.ar 1 conicet, centro atómico bariloche, 8400 san carlos de bariloche, ŕıo negro, argentina. worth understanding manifestations of this competition is the response of these systems to an external drive. the mean square width or roughness of the interface is one of the most basic quantities in the study of pinned interfaces. in the absence of an external drive, the ground state of the system is disordered but well characterized by a self-affine rough geometry with a diverging typical width w ∼ lζeq , where l is the linear size of the elastic object and ζeq is the equilibrium roughness exponent. when the external force is increased from zero, the ground state becomes unstable and the interface is locked in metastable states. to overcome the barriers separating them and reach a finite steady-state velocity v it is necessary to exceed a finite critical force, above which barriers disappear and no metastable states exist. for directed d-dimensional elastic interfaces with convex elastic energies in a d = d + 1 dimensional space with disorder, the critical point 020008-1 papers in physics, vol. 2, art. 020008 (2010) / s. bustingorry et al. is unique, characterized by the critical force f = fc and its associated critical configuration [12]. this critical configuration is also rough and self-affine such that w ∼ lζdep with ζdep the depinning roughness exponent. when approaching the threshold from above, the steady-state average velocity vanishes like v ∼ (f −fc)β and the correlation length characterizing the cooperative avalanche-like motion diverges as ξ ∼ (f −fc)−ν for f > fc, where β is the velocity exponent and ν is the depinning correlation length exponent [13–16]. at finite temperature and for f � fc, the system presents an ultra-slow steady-state creep motion with universal features [17, 18] directly correlated with its multiaffine geometry [19,20]. at very small temperatures the absence of a divergent correlation length below fc shows that depinning must be regarded as a nonstandard phase transition [20, 21] while exactly at f = fc, the transition is smeared-out with the velocity vanishing as v ∼ tψ, with ψ, the so-called thermal rounding exponent [22–27]. 
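for ease of reference, the scaling relations quoted in this introduction can be collected in one place; the display below simply restates the exponents defined above and introduces nothing new:

```latex
w \sim L^{\zeta}, \qquad
v \sim (f - f_c)^{\beta} \quad (f > f_c), \qquad
\xi \sim (f - f_c)^{-\nu}, \qquad
v\big|_{f = f_c} \sim T^{\psi},
```

with $\zeta = \zeta_{eq}$ for the equilibrium geometry and $\zeta = \zeta_{dep}$ for the critical configuration at depinning.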
during the last years, numerical simulations have played an important role to understand the physics behind the depinning transition thanks to the development of powerful exact algorithms. in particular, the development of an exact algorithm able to target efficiently the critical configuration and critical force for a given sample [28, 29] has allowed to study, precisely, the self-affine rough geometry at depinning [7, 28–31], the sample-to-sample critical force distribution [32], the critical exponents of the depinning transition [26, 27, 33], the renormalized disorder correlator [34], and the avalanche-size distribution in quasistatic motion [35]. moreover, the same algorithm has allowed to study, precisely, the transient universal dynamics at depinning [36, 37], and an extension of it has allowed to study lowtemperature creep dynamics [20, 21]. in practice, the algorithm for targeting the critical configuration [28, 29] has been numerically applied to directed interfaces of linear size l displacing in a disordered potential of transverse dimension m, applying periodic boundary conditions in both directions in order to avoid border effects. this is thus equivalent to an elastic string displacing in a disordered cylinder. the aspect ratio between longitudinal l and transverse m periodicities must be carefully chosen, in order to have the desired thermodynamic limit corresponding to a given experimental realization. in ref. [32] it was indeed shown that the critical force distribution p(fc) displays three regimes associated with m: (i) at very small m compared with the typical width lζdep of the interface, the interface wraps the computational box several times in the transverse direction, as shown schematically in fig. 1(b), and therefore the periodicity of the random medium is relevant and p(fc) is gaussian; (ii) at very large m compared with lζdep , as shown schematically in fig. 1(c), periodicity effects are absent but then the critical force, being the maximum among many independent sub-critical forces, obeys extreme value statistics and p(fc) becomes a gumbel distribution; (iii) in the intermediate regime, where m ≈ lζdep and periodicity effects are still irrelevant, as shown schematically in fig. 1(a), the distribution function is in between the gaussian and the gumbel distribution. it has been argued that only the last case, where m ≈ lζdep , corresponds to the random-manifold depinning universality class (periodicity effects absent) with a finite critical force in the thermodynamic limit l,m → ∞. this criterion does not give, however, the optimal value of the proportionality factor between m and lζdep , and must be modified at finite velocity since the crossover to the random-periodic universality class at large length-scales depends also on the velocity [38]. to avoid this problem, it has been therefore proposed to define the critical scaling in the fixed center of mass ensemble [39]. the crossover from the random-manifold to the random-periodic universality class is, however, physically interesting, as it can occur in periodic elastic systems such as elastic chains. remarkably, although the mapping from a periodic elastic system (with given lattice parameter) in a random potential to a nonperiodic elastic system (such as an interface) in a random potential with periodic boundary conditions is not exact, it was recently shown that the lattice parameter does play the role of m for elastic interfaces with regard to the geometrical or roughness properties [38]. 
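the extreme-value regime (ii) described above can be made concrete with a toy calculation. the sketch below is not the exact algorithm of refs. [28, 29]; it only assumes, purely for illustration, that a very elongated sample contributes $n \approx M/L^{\zeta_{dep}}$ independent sub-critical forces with an exponential tail, so that the sample critical force (their maximum) follows gumbel statistics and its mean grows only logarithmically with $n$:

```python
import numpy as np

rng = np.random.default_rng(0)
euler_gamma = 0.5772156649   # Euler-Mascheroni constant

# toy picture: the sample critical force is the maximum of n independent
# sub-critical forces (exponentially distributed here, an illustrative choice),
# with n playing the role of M / L^zeta_dep
for n in (10, 100, 1000, 10000):
    fc = rng.exponential(size=(1000, n)).max(axis=1)     # 1000 toy "samples"
    # for an exponential parent the mean of the maximum is H_n ~ ln(n) + gamma,
    # and the shifted maxima approach a Gumbel distribution for large n
    print(f"n = {n:5d}   <fc> = {fc.mean():5.2f}   ln(n) + gamma = {np.log(n) + euler_gamma:5.2f}")
```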
since the periodicity can often be experimentally tuned in such periodic systems, it is thus worth studying in detail the geometry of critical interfaces of size $L$ as a function of $M$ with periodic boundary conditions, and thus complementing the study of the critical force in such systems [32]. in this paper we study in detail, using numerical simulations, the geometrical properties of the one-dimensional interface or elastic string critical configuration in a random-periodic pinning potential as a function of the aspect-ratio parameter $k$, conveniently defined as $k = M/L^{\zeta_{dep}}$. we show that $k$ is indeed the only parameter controlling the finite-size scaling (i.e., the dependence of observables on the dimensions $L$ and $M$) of the average square width and its sample-to-sample probability distribution. the scaled average square width $\overline{w^2}\,L^{-2\zeta_{dep}}$ is described by a universal function of $k$ displaying a non-trivial single minimum at a finite value of $k$. we show that while for small $k$ this reflects the crossover at $k \sim 1$ from the random-periodic to the random-manifold depinning universality class, for large $k$ it implies that, in the regime where the depinning threshold is controlled by extreme value (gumbel) statistics, critical configurations also become rougher and display an anomalous roughness scaling.

figure 1: (a) elastic string driven by a force $f$ in a random-periodic medium with periodic boundary conditions. it is described by a displacement field $u(z)$ and has a mean width $w$. the anisotropic finite-size scaling of width fluctuations is controlled by the aspect-ratio parameter $k = M/L^{\zeta_{dep}}$, with $\zeta_{dep}$ the random-manifold roughness exponent at depinning. in the case $k \ll 1$ (b), periodicity effects are important, while when $k \gg 1$ (c) they are not important, but the roughness scaling of the critical configuration is anomalous.

ii. method

the model we consider here is an elastic string in (1+1) dimensions described by a single-valued function $u(z,t)$, which gives the transverse displacement $u$ as a function of the longitudinal direction $z$ and the time $t$ [see fig. 1(a)]. the zero-temperature dynamics of the model is given by

$\gamma\,\partial_t u(z,t) = c\,\partial_z^2 u(z,t) + f_p(u,z) + f,$  (1)

where $\gamma$ is the friction coefficient and $c$ the elastic constant. the first term on the right-hand side derives from a harmonic elastic energy. the effect of a random-bond type disorder is given by the pinning force $f_p(u,z) = -\partial_u U(u,z)$. the disorder potential $U(u,z)$ has zero average and sample-to-sample fluctuations given by

$\overline{\left[ U(u,z) - U(u',z') \right]^2} = \delta(z-z')\, R_2(u-u'),$  (2)

where the overline indicates an average over disorder realizations and $R(u)$ stands for a correlator of finite range $r_f$ [18]. finally, $f$ represents the uniform external drive acting on the string. physically, this model can phenomenologically describe, for instance, a magnetic domain wall in a thin-film ferromagnetic material with weak and randomly located imperfections [1], with $f$ proportional to an applied external magnetic field pushing the wall in the energetically favorable direction. in order to numerically solve eq. (1), the system is discretized in the $z$-direction in $L$ segments of size $\delta z = 1$, i.e., $z \to j = 0,\ldots,L-1$, while keeping $u_j(t)$ as a continuous variable. to model the continuous random potential, a cubic spline is used, which passes through $M$ regularly spaced uncorrelated gaussian points [30].
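the discretization just described can be made explicit with a short sketch. this is not the exact critical-configuration algorithm of refs. [28–30] used below; it is only a naive direct integration of eq. (1) at zero temperature, with a periodic cubic-spline disorder and illustrative parameter values, meant to show how the discretized string and its pinning force can be set up:

```python
import numpy as np
from scipy.interpolate import CubicSpline

L, M = 128, 128                    # longitudinal size and transverse periodicity (illustrative)
c, gamma, f, dt = 1.0, 1.0, 0.5, 0.05
rng = np.random.default_rng(1)

# one random potential per site j: a periodic cubic spline through M regularly
# spaced uncorrelated gaussian points (period M in the u direction)
knots = np.arange(M + 1)
splines = [CubicSpline(knots, np.append(g, g[0]), bc_type="periodic")
           for g in rng.normal(size=(L, M))]

def pinning_force(u):
    # f_p(u, z) = -dU/du, with u folded into [0, M) by periodicity
    return np.array([-s(uj % M, 1) for s, uj in zip(splines, u)])

u = np.zeros(L)
for _ in range(2000):              # overdamped zero-temperature dynamics, eq. (1)
    laplacian = np.roll(u, -1) - 2.0 * u + np.roll(u, 1)   # periodic elastic term
    u += (dt / gamma) * (c * laplacian + pinning_force(u) + f)

w2 = np.mean((u - u.mean()) ** 2)  # square width of the resulting configuration
print(f"w^2 = {w2:.3f} after relaxation at f = {f}")
```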
for the numerical simulations performed here we have used, without loss of generality, $\gamma = 1$, $c = 1$, $r_f = 1$, and a disorder intensity $R(0) = 1$. in both spatial dimensions we have used periodic boundary conditions, thus defining an $L \times M$ system. the critical configuration $u_c(z)$ and force $f_c$ are defined from the pinned (zero-velocity) configuration with the largest driving force $f$ in the long-time limit of the dynamics. they are thus the real solutions of

$c\,\partial_z^2 u(z) + f_p(u,z) + f = 0,$  (3)

such that for $f > f_c$ there are no further real solutions (pinned configurations). middleton's theorems [12] assure that the solution of eqs. (3) exists and is unique for both $u_c(z)$ and $f_c$, and that above $f_c$ the string trajectory in an $L$-dimensional phase space is trapped in a periodic attractor (for a system with periodic boundary conditions such as the one we consider). in other words, the critical configuration is the marginal fixed-point solution or critical state of the dynamics, with $f_c$ being the critical-point control parameter of a hopf bifurcation. solving the $L$-dimensional system of eqs. (3) for large $L$ directly is a formidable task, due to the non-linearity of the pinning force $f_p$. on the other hand, solving the long-time dynamics at different driving forces $f$ to localize $f_c$ and $u_c$ is very inefficient due to the critical slowing down. fortunately, middleton's theorems, and in particular the “non-passing rule”, can be used again to devise a precise and very efficient algorithm which allows one to obtain the critical force $f_c$ and the critical configuration $u^c_j$ for each independent disorder realization iteratively, without solving the actual dynamics or directly inverting the system of eqs. (3) [30]. once the critical force and the critical configuration are determined with this algorithm, we can compute the different observables. in particular, the square width or roughness of the string at the critical point for a given disorder realization is defined as

$w^2 = \frac{1}{L} \sum_{j=0}^{L-1} \left[ u^c_j - \frac{1}{L} \sum_{k=0}^{L-1} u^c_k \right]^2.$  (4)

computing $w^2$ for different disorder realizations allows us to compute its disorder average $\overline{w^2}$ and the sample-to-sample probability distribution $P(w^2)$. in addition, the average structure factor associated with the critical configuration is

$S_q = \frac{1}{L}\, \overline{\left| \sum_{j=0}^{L-1} u^c_j\, e^{-iqj} \right|^2},$  (5)

where $q = 2\pi n/L$, with $n = 1,\ldots,L-1$. one can show, using a simple dimensional analysis, that given a roughness exponent $\zeta$, such that $w^2 \sim L^{2\zeta}$, the structure factor behaves as $S(q) \sim q^{-(1+2\zeta)}$ for small $q$, thus yielding an estimate of $\zeta$ without changing $L$. to compute averages over disorder and sample-to-sample fluctuations, we consider between $10^3$ and $10^4$ independent disorder realizations, depending on the size of the system.

figure 2: the scaling of $\overline{w^2}$ for the critical configuration at different $M$ values, as indicated. the curves for $M = 64$ and 16384 are shifted upwards for clarity. the dashed and dotted lines are guides to the eye showing the expected slopes corresponding to the different roughness exponents.

iii. results

i. roughness at the critical point

figure 2 shows the scaling of the square width of the critical configuration $\overline{w^2}$ with the longitudinal size of the system $L$ for $L = 32$, 64, 128, 256, 512 and different values of $M$. when $M$ is small, $M = 8$, for all the $L$ values shown we observe $\overline{w^2} \sim L^{2\zeta_L}$ with $\zeta_L = 1.5$, corresponding to the larkin exponent in (1+1) dimensions.
this value is different from the value $\zeta_{dep} = 1.25$ [33, 40] expected for the random-manifold universality class, thus indicating that periodicity effects are important for these joint values of $M$ and $L$. this situation is schematically represented in fig. 1(b). this result is a numerical confirmation of the two-loop functional renormalization group result of ref. [16], which shows that the $\zeta = 0$ fixed point, leading to a universal logarithmic growth of displacements at equilibrium, is unstable. the fluctuations are governed, instead, by a coarse-grained generated random force as in the larkin model, yielding a roughness exponent $\zeta_L = (4-d)/2$ in $d$ dimensions [16], which agrees with our result for $d = 1$. we can thus say that for small enough $M$ (compared to $L$) the system belongs to the same random-periodic depinning universality class as charge density wave systems [14, 41], which strictly correspond to $M = 1$. when $M$ is large, on the other hand, $M = 16384$ in fig. 2, for all the $L$ values considered the exponent is consistent with $\zeta_{dep}$ of the random-manifold universality class. this situation is schematically represented in fig. 1(c), and we will show later that, for these elongated samples, the effects of extreme value statistics are already visible. for intermediate values of $M$, such as $M = 64$ in fig. 2, we can observe the crossover in the scale-dependent roughness exponent $\zeta(L) \sim \frac{1}{2}\, \frac{d \log \overline{w^2}}{d \log L}$ changing from $\zeta_{dep}$ to $\zeta_L$ as $L$ increases, as indicated by the dashed and dotted lines. this crossover, from the random-manifold to the random-periodic depinning geometry, occurs at a characteristic distance $L^* \sim M^{1/\zeta_{dep}}$, when the width in the random-manifold regime reaches the transverse dimension or periodicity $M$. at finite velocity, this crossover length remains constant up to a non-trivial characteristic velocity and then decreases with increasing velocity [38].

figure 3: structure factor of the critical configuration for $L = 256$ and different $M$ values, as indicated. the curves for $M = 64$ and 16384 are shifted upwards for clarity. the dashed and dotted lines are guides to the eye showing the expected slopes corresponding to the different roughness exponents.

the above-mentioned geometrical crossover can be studied in more detail through the analysis of the structure factor $S(q)$, for a line of fixed size $L$. in fig. 3 we show $S(q)$ for $L = 256$ and $M = 8$, 64, 16384. for the intermediate value $M = 64$ a crossover between the two regimes is visible, and can be described by

$S_q \sim \begin{cases} q^{-(1+2\zeta_L)}, & q \ll q^*, \\ q^{-(1+2\zeta_{dep})}, & q \gg q^*, \end{cases}$  (6)

with $q^*$ expected to scale as $q^* \sim L^{*-1} \sim M^{-1/\zeta_{dep}}$. therefore, the structure factor should scale as $S_q\, M^{-(2+1/\zeta_{dep})} = h(x)$, where the scaled variable is $x = q\, M^{1/\zeta_{dep}} \sim q/q^*$ and the scaling function behaves as

$h(x) \sim \begin{cases} x^{-(1+2\zeta_L)}, & x \ll 1, \\ x^{-(1+2\zeta_{dep})}, & x \gg 1. \end{cases}$  (7)

figure 4: scaling of the structure factor of the critical configuration for $L = 256$ and different values of the transverse size $M = 2^p$ with $p = 3, 4, \ldots, 14$. although the values of the two exponents are very close, the change in the slope of the scaling function against the scaling variable $x = q\, M^{1/\zeta_{dep}}$ is clearly observed.

the collapse of fig. 4 for $L = 256$ and different values of $M = 2^p$ with $p = 3, 4, \ldots, 14$ shows that this scaling form is a very good approximation. however, as we show below, small corrections can be expected fully in the random-manifold regime, in the large $M L^{-\zeta_{dep}}$ limit of very elongated samples.
in fig. 5(a), we show $\overline{w^2}$ as a function of the transverse periodicity $M$ for different values of the longitudinal periodicity $L$. remarkably, $\overline{w^2}$ is a non-monotonic function of $M$. for small $M$ it decreases towards an $L$-dependent minimum $M^*$, and then increases with increasing $M$, in the regime where the extreme value statistics starts to affect the distribution of the critical force [32].

figure 5: (a) squared width of the critical configuration as a function of $M$ for different system sizes $L$, as indicated. (b) scaling of the width in (a), showing that the relevant control parameter is $M/L^{\zeta_{dep}}$. the dashed line in (a) and (b) corresponds to $\overline{w^2} = M^2$, which is always to the left of the minimum of $\overline{w^2}$ occurring at $k^* = M^* L^{-\zeta_{dep}}$. the solid line indicates $k^{2(1-\zeta_L/\zeta_{dep})}$, which is the behavior expected purely from the random-periodic to random-manifold crossover at the characteristic distance $L^* \sim M^{1/\zeta_{dep}}$.

since the only typical transverse scale in fig. 5(a) is set by the minimum $M^*$, we can expect $\overline{w^2} \sim M^{*2}\, g(M/M^*)$, with $g(x)$ some universal function. on the other hand, since the only relevant characteristic length-scale of the problem is set by the crossover between the random-periodic regime and the random-manifold regime, we can simply write $M^* \sim L^{\zeta_{dep}}$ and therefore

$\overline{w^2}\, L^{-2\zeta_{dep}} \sim g\!\left( M L^{-\zeta_{dep}} \right).$  (8)

this scaling form is confirmed in fig. 5(b) and shows that the aspect-ratio parameter $k = M L^{-\zeta_{dep}}$ fully controls the anisotropic finite-size scaling of the problem. it is worth, however, noting some interesting consequences of the result of fig. 5(b), as we describe below. since at very small $k$ the interface is in the random-periodic regime, eq. (8) should lead to $\overline{w^2} \sim L^{2\zeta_L}$, and therefore one deduces that

$g(k) \sim k^{2(1-\zeta_L/\zeta_{dep})}, \qquad k \ll k^*,$  (9)

where $k^* = M^* L^{-\zeta_{dep}}$. the fact that the random-periodic roughness exponent $\zeta_L = 3/2$ is larger than the random-manifold one, $\zeta_{dep} \approx 5/4$, consequently implies an initial decrease of $g(k)$ as $g(k) \sim k^{-2/5}$, as shown in fig. 5(b) by the solid line. periodicity effects, or the crossover from random-periodic to random-manifold, thus explain the initial decrease of $g(k)$ seen in fig. 5(b), or the initial decrease of $\overline{w^2}$ against $M$ for fixed $L$, seen in fig. 5(a). in this respect, it is then worth noting that the line $\overline{w^2} = M^2$, shown by a dashed line, lies completely in the regime $k < k^*$, implying that the naive criterion $\overline{w^2} < M^2$ is not enough to avoid periodicity effects and to have the system fully in the random-manifold regime. as we show later, this is related to the shape of the probability distribution $P(w^2)$, which displays sample-to-sample fluctuations of the order of the average $\overline{w^2}$. the presence of a minimum at $k^*$ in the function $g(k)$, and in particular its slower-than-power-law increase for $k > k^*$, is non-trivial and constitutes one of the main results of the present work. this result shows that corrections to the standard scaling $\overline{w^2} \sim L^{2\zeta_{dep}}$ may arise from the aspect-ratio dependence of the prefactor $g(k)$. on the one hand, $\overline{w^2}$ now grows with $M$ for $L$ fixed, in spite of the fact that $\overline{w^2} \ll M^2$, i.e., transverse-size/periodicity scaling is present. on the other hand, the scaling of $\overline{w^2}$ with $L$ is slower in this regime, due to subleading scaling corrections coming from $g(k)$. the precise origin of these interesting leading and subleading corrections in the finite-size anisotropic scaling is highly non-trivial.
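the collapse implied by eqs. (8) and (9) amounts to a simple change of variables; the sketch below only maps a measured point $(L, M, \overline{w^2})$ onto the scaling variables and checks the small-$k$ exponent quoted above, using the exponent values given in the text (it contains no data):

```python
zeta_dep, zeta_L = 1.25, 1.5       # roughness exponents quoted in the text

def collapse_point(L, M, w2_mean):
    """map a measured (L, M, disorder-averaged w^2) point onto the universal
    curve of eq. (8): returns k = M L^-zeta_dep and g = w2_mean L^-2 zeta_dep."""
    k = M * L ** (-zeta_dep)
    return k, w2_mean * L ** (-2.0 * zeta_dep)

# small-k behaviour expected from eq. (9): g(k) ~ k^{2 (1 - zeta_L / zeta_dep)}
print(2 * (1 - zeta_L / zeta_dep))  # approximately -0.4, i.e. the k^(-2/5) decrease
```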
since the critical configurations in this regime have the constant roughness exponent $\zeta_{dep}$ of the random-manifold universality class, the slow increase of $g(k)$ cannot be attributed to a geometrical crossover effect, as for the case $k < k^*$. however, we might relate this effect to the crossover in the critical force statistics, from gaussian to gumbel, in the $k \gg k^*$ limit [32]. in the gumbel regime, the average critical force is expected to increase as $f_c \sim \log(M/L^{\zeta_{dep}}) \equiv \log k$ [39], since the sample critical force can be roughly regarded as the maximum among $M/L^{\zeta_{dep}}$ independent subcritical forces and configurations [32]. the increase in the critical force might therefore be correlated with the slow increase of roughness. the physical connection between the two is subtle though, since a large critical force in a very elongated sample could be achieved either by profiting from very rare correlated pinning forces, such as accidental columnar defects, or by profiting from very rare uncorrelated strong pinning forces. since in the first case the critical configuration would be more correlated, and in general less rough than for less elongated samples (smaller $k$), contrary to our numerical data of fig. 5(b), we think that the second cause is more plausible. we can thus think that in the $k \gg k^*$ limit of extreme value statistics of $f_c$, the effective disorder strength on the critical configuration increases with $k$. this might be translated into the universal function $g(k)$, such that $\overline{w^2} \approx L^{2\zeta_{dep}}\, g(k)$ can increase for increasing values of $k$ at fixed $L$ in such a regime. a quantitative description of these scaling corrections remains an open challenge.

ii. distribution function

we now analyze sample-to-sample fluctuations of the square width $w^2$ by computing its probability distribution $P(w^2)$. this property is relevant as $w^2$ fluctuates even in the thermodynamic limit for critical interfaces with a positive roughness exponent [42]. it has been computed for models with dynamical disorder such as random-walk [43] or edwards–wilkinson interfaces [44, 45], the mullins–herring model [46], and for non-markovian gaussian signals in general [47, 48]. it has also been calculated for non-linear models such as the one-dimensional kardar–parisi–zhang model [49, 50] and for the quenched edwards–wilkinson model at equilibrium [51]. in particular, the probability distribution $P(w^2)$ of critical interfaces at the depinning transition was studied analytically [52], numerically [31] and also experimentally for contact lines in partial wetting [7]. remarkably, non-gaussian effects in depinning models are found to be smaller than 0.1% [31, 52], thus showing that $P(w^2)$ is strongly determined by the self-affine (critical) geometry itself, rather than by the particular mechanism producing it.

figure 6: scaling function $\Phi(x)$ for $L = 256$ and different values of $M = 8$, 128, 2048, 16384, which shows the change with the transverse size $M$.

as in all the above-mentioned systems, the width distribution $P(w^2)$ at the different universality classes of the depinning transition was found to scale as

$\overline{w^2}\, P(w^2) \approx \Phi_{\zeta}\!\left( \frac{w^2}{\overline{w^2}} \right),$  (10)

with $\Phi_{\zeta}$ a universal function, which depends only on the roughness exponent $\zeta$ and on boundary conditions when the global width is considered [47, 48]. in this way, $\overline{w^2}$ is the only characteristic length scale of the system, absorbing the system longitudinal size $L$ and all the non-universal parameters of the model, such as the elastic constant of the interface, the strength of the disorder and/or the temperature.
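eq. (10) translates directly into a recipe for estimating the scaling function from simulation or experimental data: with $x = w^2/\overline{w^2}$, the normalized histogram of $x$ is itself $\Phi(x)$. the sketch below implements that recipe; the gamma-distributed numbers at the end are only a synthetic stand-in to make it runnable, not output of the model:

```python
import numpy as np

def scaled_width_distribution(w2_samples, bins=60):
    """estimate the scaling function Phi(x) = <w2> P(x <w2>) of eq. (10) from
    square widths w2 measured over many disorder realizations: the density
    histogram of x = w2 / <w2> is the scaling function itself."""
    w2 = np.asarray(w2_samples, dtype=float)
    x = w2 / w2.mean()
    phi, edges = np.histogram(x, bins=bins, density=True)
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, phi

# synthetic stand-in only (replace with w^2 values from critical configurations)
rng = np.random.default_rng(2)
x, phi = scaled_width_distribution(rng.gamma(shape=2.0, scale=1.0, size=50000))
print(x[:3], phi[:3])
```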
since $\Phi_{\zeta}$ can be easily generated using non-markovian gaussian signals [53], the quantity $\overline{w^2}\,P(w^2)$ is a good observable to extract the roughness exponent of a critical interface from experimental data. in fig. 6, we show what the scaled distribution function $\Phi(x) \equiv \overline{w^2}\, P(x\, \overline{w^2})$ looks like for the depinning transition in a random-periodic medium for a fixed value $L = 256$ and different values of $M$. we see that $\Phi(x)$ depends on $M$ for small $M$ but converges to a fixed shape for large $M$. we also note that for all $M$, $\Phi(x)$ extends appreciably beyond $x = 1$, explaining why the criterion $\overline{w^2} \lesssim M^2$ is not enough to be fully in the random-manifold regime, as noted in fig. 5.

figure 7: scaling function $\Phi(x)$ for different values of $L = 32$, 64, 128, 256 while keeping (a) $k = M/L^{\zeta_{dep}} \approx 1$ and (b) $k = M/L^{\zeta_{dep}} \approx 0.025$. the dotted line corresponds to the scaling function of the non-disordered edwards–wilkinson equation [43], while the continuous and dashed lines correspond to the scaling functions of gaussian signals with $\zeta = 1.25$ and $\zeta = 1.5$, respectively [31, 53].

in fig. 7, we show the scaling function $\Phi(x)$ for different values of $L$ and $M$ but fixing the aspect-ratio parameter $k = M/L^{\zeta_{dep}}$, with $k \approx 1 > k^*$ in fig. 7(a) and $k \approx 0.025 \ll k^*$ in fig. 7(b), $k^*$ being the location of the minimum of $\overline{w^2}$. since data for the same $k$ practically collapse onto the same curve, we can write for our case:

$\overline{w^2}\, P(w^2) = \Phi\!\left( \frac{w^2}{\overline{w^2}},\, k \right).$  (11)

therefore, the anisotropic scaling of the probability distribution is fully controlled by $k$, as was found for $\overline{w^2}$. in figs. 7(a) and (b), we also show the universal functions $\Phi_{\zeta_L}$ and $\Phi_{\zeta_{dep}}$ generated using non-markovian gaussian signals [31, 53], and for comparison we also show $\Phi_{1/2}$, corresponding to the markovian periodic gaussian signal or the edwards–wilkinson equation [43]. comparing this with the collapsed data for depinning, we see that the function $\Phi\!\left( \frac{w^2}{\overline{w^2}}, k \right)$ respects the limits

$\Phi(x, k \to 0) = \Phi_{\zeta_L}(x), \qquad \Phi(x, k \gtrsim k^*) \approx \Phi_{\zeta_{dep}}(x),$  (12)

as expected from the existence of the geometric crossover between the roughness exponents $\zeta_L$ for $k \to 0$ and $\zeta_{dep}$ for $k > k^*$. for intermediate values $k < k^*$, however, $\Phi\!\left( \frac{w^2}{\overline{w^2}}, k \right)$ does not necessarily coincide with that of a gaussian signal function $\Phi_{\zeta}$ for a given $\zeta$, since the critical configuration includes a crossover length $L^* \lesssim L$. whether multi-affine or effective-exponent self-affine non-markovian gaussian signals can be used to describe these intermediate cases satisfactorily is an interesting open issue.

iv. conclusions

we have numerically studied the anisotropic finite-size scaling of the roughness of a driven elastic string at its sample-dependent depinning threshold in a random medium with periodic boundary conditions in both the longitudinal and transverse directions. the average square width $\overline{w^2}$ and its probability distribution are both controlled by the parameter $k = M/L^{\zeta_{dep}}$. a non-trivial single minimum at a finite value of $k$ was found in $\overline{w^2}/L^{2\zeta_{dep}}$. for small $k$, the initial decrease of $\overline{w^2}$ reflects the crossover from the random-periodic to the random-manifold roughness. for very large $k$, the growth with $k$ implies that the crossover to gumbel statistics in the critical forces induces corrections to $g(k)$, which grow with $k$, to the string roughness scaling $\overline{w^2} \approx g(k)\, L^{2\zeta_{dep}}$.
these increasingly rare critical configurations thus have an anomalous roughness scaling: they have a transversesize/periodicity scaling in spite that its width is w2 � m2, and subleading (negative) corrections to the standard random-manifold longitudinal-size scaling. our results could be useful for understanding roughness fluctuations and scaling in finite experimental systems. the crossover from randomperiodic to random-manifold roughness could be studied in periodic elastic systems with variable periodicity, such as confined vortex rows [54] and single-files of macroscopically charged particles [55] or colloids [56], with additional quenched disorder. the rare-event dominated scaling corrections to the interface roughness scaling could be studied in systems with a large transverse dimension, such as domain walls in ferromagnetic nanowires [57]. for the later case, it would be interesting to have a quantitative theory, making the connection between the extreme value statistics of the depinning threshold and the anomalous scaling corrections to the roughness of such rare critical configurations. this would allow to understand the dimensional crossover, from interface to particle depinning. acknowledgements this work was supported by cnea, conicet under grant no. pip11220090100051, and anpcyt under grant no. pict2007886. a. b. k. acknowledges universidad de barcelona, ministerio de ciencia e innovación (spain) and generalitat de catalunya for partial funding through i3 program. [1] s lemerle, j ferré, c chappert, v mathet, t giamarchi, p le doussal, domain wall creep in an ising ultrathin magnetic film, phys. rev. lett. 80, 849 (1998). [2] m bauer, a mougin, j p jamet, v repain, j ferré, s l stamps, h bernas, c chappert, deroughening of domain wall pairs by dipolar repulsion, phys. rev. lett. 94, 207211 (2005). [3] m yamanouchi, d chiba, f matsukura, t dietl, h ohno, velocity of domain-wall motion induced by electrical current in the ferromagnetic semiconductor (ga,mn)as, phys. rev. lett. 96, 096601 (2006). [4] p j metaxas, j p jamet, a mougin, m cormier, j ferré, v baltz, b rodmacq, b dieny, r l stamps, creep and flow regimes of magnetic domain-wall motion in ultrathin pt/co/pt films with perpendicular anisotropy, phys. rev. lett. 99, 217208 (2007). [5] p paruch, t giamarchi, j m triscone, domain wall roughness in epitaxial ferroelectric pbzr0.2ti0.8o3 thin films, phys. rev. lett. 94, 197601 (2005). [6] p paruch, j m triscone, high-temperature ferroelectric domain stability in epitaxial pbzr0.2ti0.8o3 thin films, appl. phys. lett. 88, 162907 (2006). [7] s moulinet, a rosso, w krauth, e rolley, width distribution of contact lines on a disordered substrate, phys. rev. e 69, 035103(r) (2004). [8] n martys, m cieplak, m o robbins, critical phenomena in fluid invasion of porous media, phys. rev. lett. 66, 1058 (1991). [9] i hecht, h taitelbaum, roughness and growth in a continuous fluid invasion model, phys. rev. e 70, 046307 (2004). [10] e bouchaud, j p bouchaud, d s fisher, s ramanathan, j r rice, can crack front waves explain the roughness of cracks?, j. mech. phys. solids 50, 1703 (2002). [11] m alava, p k v v nukalaz, s zapperi, statistical models of fracture, adv. phys. 55, 349 (2006). [12] a a middleton, asymptotic uniqueness of the sliding state for charge-density waves, phys. rev. lett. 68, 670 (1992). [13] d s fisher, sliding charge-density waves as a dynamic critical phenomenon, phys. rev. b 31, 1396 (1985). 020008-9 papers in physics, vol. 2, art. 020008 (2010) / s. 
bustingorry et al. [14] o narayan, d s fisher, critical behavior of sliding charge-density waves in 4 − ε dimensions, phys. rev. b 46, 11520 (1992). [15] t nattermann, s stepanow, l h tang, h leschhorn, dynamics of interface depinning in a disordered medium, j. phys. ii 2, 1483 (1992). [16] p le doussal, k j wiese, p chauve, two-loop functional renormalization group theory of the depinning transition, phys. rev. b 66, 174201 (2002). [17] l b ioffe, v m vinokur, dynamics of interfaces and dislocations in disordered media, j. phys. c: solid state phys. 20, 6149 (1987). [18] p chauve, t giamarchi, p le doussal, creep and depinning in disordered media, phys. rev. b 62, 6241 (2000). [19] a b kolton, a rosso, t giamarchi, creep motion of an elastic string in a random potential, phys. rev. lett. 94, 047002 (2005). [20] a b kolton, a rosso, t giamarchi, w krauth, creep dynamics of elastic manifolds via exact transition pathways, phys. rev. b 79, 184207 (2009). [21] a b kolton, a rosso, t giamarchi, w krauth, dynamics below the depinning threshold in disordered elastic systems, phys. rev. lett. 97, 057001 (2006). [22] l w chen, m c marchetti, interface motion in random media at finite temperature, phys. rev. b 51, 6296 (1995). [23] d vandembroucq, r skoe, s roux, universal depinning force fluctuations of an elastic line: application to finite temperature behavior, phys. rev. e 70, 051101 (2004). [24] u nowak, k d usadel, influence of temperature on the depinning transition of driven interfaces, europhys. lett. 44, 634 (1998). [25] l roters, a hucht, s lübeck, u nowak, k d usadel, depinning transition and thermal fluctuations in the random-field ising model, phys. rev. e 60, 5202 (1999). [26] s bustingorry, a b kolton, t giamarchi, thermal rounding of the depinning transition, europhys. lett. 81, 26005 (2008). [27] s bustingorry, a b kolton, t giamarchi, (unpublished). [28] a rosso, w krauth, monte carlo dynamics of driven elastic strings in disordered media, phys. rev. b 65, 012202 (2001). [29] a rosso, w krauth, origin of the roughness exponent in elastic strings at the depinning threshold, phys. rev. lett. 87, 187002 (2001). [30] a rosso, w krauth, roughness at the depinning threshold for a long-range elastic string, phys. rev. e 65, 025101(r) (2002). [31] a rosso, w krauth, p le doussal, j vannimenus, k j wiese, universal interface width distributions at the depinning threshold, phys. rev. e 68, 036128 (2003). [32] c bolech, a rosso, universal statistics of the critical depinning force of elastic systems in random media, phys. rev. lett. 93, 125701 (2004). [33] o duemmer, w krauth, critical exponents of the driven elastic string in a disordered medium, phys. rev. e 71, 061601 (2005). [34] a rosso, p l doussal, k j wiese, numerical calculation of the functional renormalization group fixed-point functions at the depinning transition, phys. rev. b 75, 220201 (2007). [35] a rosso, p le doussal, k j wiese, avalanchesize distribution at the depinning transition: a numerical test of the theory, phys. rev. b 80, 144204 (2009). [36] a b kolton, a rosso, e v albano, t giamarchi, short-time relaxation of a driven elastic string in a random medium, phys. rev. b 74, 140201 (2006). [37] a b kolton, g schehr, p le doussal, universal nonstationary dynamics at the depinning transition, phys. rev. lett. 103, 160602 (2009). 020008-10 papers in physics, vol. 2, art. 020008 (2010) / s. bustingorry et al. 
[38] s bustingorry, a b kolton, t giamarchi, random-manifold to random-periodic depinning of an elastic interface, phys. rev. b 82, 094202 (2010). [39] a a fedorenko, p le doussal, k j wiese, universal distribution of threshold forces at the depinning transition, phys. rev. e 74, 041110 (2006). [40] a rosso, a k hartmann, w krauth, depinning of elastic manifolds, phys. rev. e 67, 021602 (2003). [41] o narayan, d fisher, threshold critical dynamics of driven interfaces in random media, phys. rev. b 48, 7030 (1993). [42] z rácz, scaling functions for nonequilibrium fluctuations: a picture gallery, spie proc. 5112, 248 (2003). [43] g foltin, k oerding, z rácz, r l workman, r k p zia, width distribution for random-walk interfaces, phys. rev. e 50, r639 (1994). [44] t antal, z rácz, dynamic scaling of the width distribution in edwards–wilkinson type models of interface dynamics, phys. rev. e 54, 2256 (1996). [45] s bustingorry, l f cugliandolo, j l iguain, out-of-equilibrium relaxation of the edwards– wilkinson elastic line, j. stat. mech.: theor. exp. p09008 (2007). [46] m plischke, z rácz, r k p zia, width distribution of curvature-driven interfaces: a study of universality, phys. rev. e 50, 3589 (1994). [47] a rosso, r santachiara, w krauth, geometry of gaussian signals, j. stat. mech.: theor. exp. l08001 (2005). [48] r santachiara, a rosso, w krauth, universal width distributions in non-markovian gaussian processes, j. stat. mech.: theor. exp. p02009 (2007). [49] e marinari, a pagnani, g parisi, z rácz, width distributions and the upper critical dimension of kardar–parisi–zhang interfaces, phys. rev. e 65, 026136 (2002). [50] s bustingorry, aging dynamics of non-linear elastic interfaces: the kardar–parisi–zhang equation, j. stat. mech.: theor. exp. p10002 (2007). [51] s bustingorry, j l iguain, s chamon, l f cugliandolo, d domı́nguez, dynamic fluctuations of elastic lines in random environments, europhys. lett. 76, 856 (2006). [52] p le doussal, k j wiese, higher correlations, universal distributions, and finite size scaling in the field theory of depinning, phys. rev. e 68, 046118 (2003). [53] w krauth, statistical mechanics: algorithms and computations, oxford university press, new york (2006). [54] n kokubo, r besseling, p kes, dynamic ordering and frustration of confined vortex rows studied by mode-locking experiments, phys. rev. b 69, 064504 (2004). [55] c coste, j b delfau, c even, m s jean, singlefile diffusion of macroscopic charged particles, phys. rev. e 81, 051201 (2010). [56] s herrera-velarde, a zamudio-ojeda, r castañeda-priego, ordering and single-file diffusion in colloidal systems, j. chem. phys. 133, 114902 (2010). [57] k j kim, j c lee, s m ahn, k s lee, c w lee, y j cho, s seo, k h shin, s b choe, h w lee, interdimensional universality of dynamic interfaces, nature (london) 458, 740 (2009). 020008-11 papers in physics, vol. 10, art. 100007 (2018) received: 7 july 2018, accepted: 27 september 2018 edited by: a. mart́ı, m. monteiro reviewed by: r. marotti, instituto de f́ısica, facultad de ingenieŕıa universidad de la república, uruguay. licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.100007 www.papersinphysics.org issn 1852-4249 temperature-dependent transport measurements with arduino a. hilberer,1 g. laurent,1 a. lorin,1 a. partier,1 j. bobroff,2 f. bouquet,2∗ c. even,2 j. m. fischbach,1 c. a. marrache-kikuchi,3† m. monteverde,2 b. pilette,1 q. 
quay2 the current performances of single-board microcontrollers render them attractive, not only for basic applications, but also for more elaborate projects, amongst which are physics teaching or research. in this article, we show how temperature-dependent transport measurements can be performed by using an arduino board, from cryogenic temperatures up to room temperature or above. we focus on two of the main issues for this type of experiments: the determination of the sample temperature and the measurement of its resistance. we also detail two student-led experiments: evidencing the magnetocaloric effect in gadolinium and measuring the resistive transition of a high critical temperature superconductor. i. introduction the development of single-board microcontrollers and single-board computers has given physicists access to a large variety of inexpensive experimentation that can be used either to design simple test benches, to put together set-ups for class demonstration or to devise student practical work. moreover, specifications of single-board components are now such that, although they cannot rival with state-of-the-art scientific equipment, one can nonetheless derive valuable physical results from them. ∗e-mail: frederic.bouquet@u-psud.fr †e-mail: claire.marrache@u-psud.fr 1 magistère de physique fondamentale, département de physique, univ. paris-sud, université paris-saclay, 91405 orsay campus, france. 2 laboratoire de physique des solides, cnrs, univ. parissud, université paris-saclay, 91405 orsay campus, france. 3 csnsm, univ. paris-sud, cnrs/in2p3, université paris-saclay, 91405 orsay, france. we will here focus on the arduino microcontroller board [1]. let us note that other boards, such as mbed, hawkboard, rasberry pi, or odroid to cite but a few, exist which may be cheaper and/or have better characteristics than arduino. in our case, we have employed arduino boards to take advantage of the important user’s community. this has been an important selling point for the students with whom we are working. indeed, our experience with arduino is primarily based on undergraduate project-based physics labs [2] we have initiated within the fundamental physics department of université paris sud for students to gain a first hands-on practice of experimental physics. in these practicals, students are asked to choose a subject they want to study during a week-long project. they then have to design and build the experiment with the equipment available in the lab. the aim is not only to study a physical phenomenon, but to do so by using inexpensive materials and low-cost boards. in this article, we will describe two projects that have been developed by third year students. the first one aimed at quantifying the magnetocaloric 100007-1 papers in physics, vol. 10, art. 100007 (2018) / c. a. marrache-kikuchi et al. effect in gadolinium and the second one, which has been popular amongst students, consisted in measuring the resistive transition of a high critical temperature superconductor (htcs). however, the techniques to do so can more generally be used for any experiment involving the measurement of a low voltage while varying the set-up temperature. in particular, they could be applied to simple transport characterization of samples in research laboratories. in the following, we will focus on two important issues for this kind of measurements: thermometry and thermal anchoring on the one hand, and measuring resistances on the other. ii. 
determining the temperature of an object using microcontrollers one of the experimental control parameters that is most commonly used to make a physical system properties vary is temperature. there is a wide variety of temperature sensors, depending on the temperature range of interest. the aim of this paper is not to list those, but rather, to focus on the most ordinarily found sensors compatible with an arduino read out. we will also review some basic techniques to ensure a proper thermal contact between the sample and the thermometer. i. sensors types a. built-in arduino sensors there are a number of temperature sensors that are generally sold with standard arduino kits. the arduino starter kit, for instance, comes with a tmp36 low voltage temperature sensor [3] (available for about $1 if purchased separately) whose operating principle is based on the temperaturedependence of the voltage drop across a diode. the advantage of this type of thermometer is that it can be directly plugged into arduino without any additional electrical circuit. furthermore, provided that the corresponding library is downloaded, the temperature is straightforwardly read via the computer interface in ◦c, so that no calibration is needed. however, these sensors are limited in accuracy and operation: the tmp36 sensor for example has a ±2◦c precision over the −40◦c to +125◦c range where it can operate. if they are extremely convenient for non-demanding temperature read-outs, such as students atmospheric probes for example [4], they are not adapted to the precision needed for most research lab experiments. b. thermocouple thermocouples are cheap and robust thermal sensors that are industrially available for about $15, and which cover a wide range of temperatures (for example from −200◦c to +1250◦c for a type k thermocouple [5]). they are one of the few thermometers that are reliable at temperatures much higher than room temperature. thermocouples are also extremely convenient to measure the temperature of small-sized samples. indeed, only the hot junction between the two metals needs to be in contact with the region where the temperature is to be monitored. on the other hand, the quality of the readings will strongly depend on how thermally stable the cold junction is, and the temperature measurement is less precise than using a thermistor. indeed, the voltage to be measured is small: the sensitivity of a thermocouple is of the order of tens of microvolts per kelvin, and it decreases when the temperature decreases (a typek thermocouple has a sensitivity of 40 µv.k−1 at room temperature but a sensitivity of 10 µv.k−1 at liquid nitrogen temperature). let us note that an amplification of the voltage signal is then needed to read the temperature with arduino. some chips provide a ready-to-use thermocouple amplifier for microcontrollers (such as the max31856 breakout with a resolution of a quarter of a kelvin when using the adafruit library [6] and an accuracy of a few kelvins). better sensitivity could be achieved with a home-made amplifier (see below) and some care. c. platinum thin resistive films platinum thin films are practical and very reliable resistive thermometers typically working from 20 k to 700 k [7]. they are therefore suited for cryogenic applications – at least down to liquid nitrogen temperatures – as well as for moderate heating. the advantage of this sensor is that its response is entirely determined by the value of its resistance at 0◦c [8]. 
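as a simple illustration of this last point, and using the standard linear temperature coefficient of platinum, α ≈ 3.85 × 10⁻³ k⁻¹ (the din/iec 60751 value, quoted here as a textbook figure rather than a calibration performed in this work), the conversion from a measured resistance to a temperature near ambient can be sketched as follows:

```cpp
// illustrative helper (not from the paper): linear platinum-film conversion,
// valid only near room temperature. r0_ohm is the resistance at 0 degrees C
// (100 ohm for a pt100); alpha is the standard din/iec 60751 coefficient.
float platinumToCelsius(float r_ohm, float r0_ohm = 100.0f) {
  const float alpha = 3.85e-3f;             // per degree C
  return (r_ohm / r0_ohm - 1.0f) / alpha;   // temperature in degrees C
}
```

over a wider temperature range, the tabulated response of ref. [8], or a dedicated fit such as the one used in section iv, should be preferred.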
the most commonly used platinum resistance is the so-called pt100, which has a resistance of 100 ω at 0°c and costs approximately $3 to $5. these thermal sensors have a typical precision of about 20 mk up to 300 k and about 200 mk above room temperature. moreover, their magnetic field-dependent temperature errors are well known [7]. it is possible to mount these resistances on a dedicated arduino resistance-to-temperature converter such as the max31865 [9], but it is often simpler to plainly measure the resistance with a dedicated electrical circuit, as will be explained in section iii. this is particularly convenient for low or high temperature measurements, for which the arduino board cannot be at the same temperature as the thermometer and the sample.

ii. thermal anchoring

for the temperature measurement to be relevant, the thermometer must be in good thermal contact with the sample. how to achieve good thermal anchoring is a subject of investigation in itself, but in this section we will outline a few standard techniques, focusing on the low temperature case. to cool a sample down to low temperatures, one could use a peltier module, but the simplest – and not so expensive – way is to use liquid nitrogen. some basic safety measures have to be taken when manipulating this cryogenic fluid: use protection glasses and gloves, work in a well-ventilated room and, above all, ensure that the liquid is poured into a vessel that is not leak-tight, to allow natural evaporation of the nitrogen and avoid pressure build-up in the vessel. once these precautions are observed, the manipulation is relatively safe. to ensure that the thermometer indeed probes the sample temperature, the most obvious technique is to solidly attach the thermometer to the sample using good thermal conductors. the thermal sensor can, for instance, be glued onto a copper sample holder. the glue then has to retain its properties over the probed temperature range. in the low temperature case, one frequently uses ge 7031 varnish, which withstands very low temperatures and can easily be removed with a solvent. alternatively, the thermometer can be mechanically fixed with a spring-shaped material whose elasticity is maintained at low temperature, such as cuni sheets. upon cooling, the spring-shaped material continues to apply pressure onto the thermometer, thus ensuring good mechanical and thermal contact with the sample holder. another method is to thermally insulate the thermometer and the sample from the outside world, while putting them in contact with a common thermal bath. this can be done by inserting them into a container filled with glass beads of a few millimeters in diameter [10] (inset of fig. 7), or alternatively, sand. these materials provide good thermal insulation of the {sample + thermometer} system from the outside world while allowing for an important thermal inertia. moreover, when working at temperatures close to 77 k, they limit the liquid nitrogen evaporation, so that the temperature increases back to room temperature only very slowly: typically, for a volume of 1 l of beads that is initially immersed in liquid nitrogen, the temperature reaches 300 k again in 3 to 4 hours. the heat exchange between the sensor and the sample is then guaranteed through the evaporated n2 gas, thus ensuring the temperature is homogeneous within the entire volume.
an alternative method for achieving good thermal contact between the sensor and the sample through gas exchange is explained in ref. [11].

iii. measuring resistances with microcontrollers

microcontroller inputs give a reading of electric potentials. measuring resistances is then slightly more complicated than plugging a resistance into an ohmmeter. for educational purposes, this is actually rather valuable, since it gives students the opportunity to experiment with the notion of resistance and to realize that even the simplest measurement may present some challenge. in the following, we will present standard methods to measure resistances, with a particular focus on the low-resistance case.

i. current-voltage measurement

the simplest set-up for measuring a standard resistance is the voltage divider represented in fig. 1: the resistance of interest r0 is put in series with a reference resistance rref. the voltage drop across both resistances is controlled by the board's 5 v output. the potential v1 can be read by one of the microcontroller's inputs and should be close to 5 v. the potential v2, read by a second input, corresponds to the voltage drop across the unknown resistance.

figure 1: schematic representation of the current-voltage set-up.

r0 can then be determined through the simple relation:

r0 = rref v2 / (v1 − v2) (1)

the monitoring of v1 allows for better precision through a direct monitoring of the current. arduino's 5 v output sometimes varies in time. to better stabilize this voltage, it may be useful to power the microcontroller from an external source rather than from the computer's usb output. let us note that, if rref ≫ r0, the current through the circuit can be considered constant, which is often very convenient when the resistance measurement does not require great precision. this method presents several drawbacks when dealing with small values of r0: since the ultimate resolution of an arduino uno board is about 1 mv with vref = 1.1 v, one cannot measure r0 smaller than about 2 × 10⁻⁴ rref. in the case of a standard commercial htcs sample, for instance, the normal state resistance is often of the order of a few tens of mω. to observe the resistance drop across the critical temperature tc of a superconductor, rref should then be of the order of a few ω. such resistances are commercially available or, alternatively, can be custom-made with relatively good precision (of the order of a few mω) by using a long length of copper wire (commercially available cu wires of 0.2 mm in diameter have a resistance of about 0.5 ω/m, for example). however, unless v2 is amplified, the precision of the measurement is not optimal. moreover, using this method to measure small resistances causes the circuit current to exceed the maximum current allowed at the microcontroller's output. in the following, we present another method to measure small resistances.

ii. wheatstone bridge

another resistance determination method, which can achieve good precision, is the wheatstone bridge. the principle of the measurement is illustrated in fig. 2. r1 and r3 are fixed-value resistances, while r2 is a tunable resistance and r0 the resistance of interest. the potentials v1 and v2 are then related by:

v2 − v1 = [ r2 / (r1 + r2) − r0 / (r0 + r3) ] ve (2)

the bridge is said to be “balanced” when r2 is tuned such that v1 and v2 are equal.
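as an aside, a minimal arduino sketch implementing the current-voltage read-out of fig. 1 and eq. (1) might look as follows; the analog pins and the value of rref are illustrative assumptions rather than the values used in this work. the balanced-bridge relation is derived next.

```cpp
// illustrative sketch (not the authors' code): voltage-divider read-out of eq. (1).
// pin numbers and RREF are assumptions chosen for this example.
const int   PIN_V1 = A0;       // top of the divider, close to the 5 V output
const int   PIN_V2 = A1;       // junction between rref and the unknown resistance r0
const float RREF   = 100.0;    // reference resistance in ohm

void setup() {
  Serial.begin(9600);
}

void loop() {
  // 10-bit ADC, 5 V full scale: 1 count corresponds to 5/1023 V
  float v1 = analogRead(PIN_V1) * 5.0 / 1023.0;
  float v2 = analogRead(PIN_V2) * 5.0 / 1023.0;
  if (v1 > v2) {                                // avoid dividing by zero
    float r0 = RREF * v2 / (v1 - v2);           // eq. (1)
    Serial.println(r0, 3);                      // resistance in ohm
  }
  delay(500);
}
```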
the resistances are then related through:

r0 = r2 r3 / r1 (3)

the precision that can be achieved through this method when using a microcontroller is about the same as for the current-voltage measurement method. however, this method is not very practical when dealing with resistances r0 that vary, since the bridge has to be maintained close to balance at each measurement point. in particular, it is not well suited for the measurement of a superconductor's resistive transition.

figure 2: schematic representation of the wheatstone bridge: r1 and r3 are fixed resistances, r2 is a tunable resistance, and r0 is the resistance of interest.

iii. voltage amplifier

the most practical solution for the measurement of small voltages – and hence small resistances – is the amplification of the potential difference across the resistance. this can be done via standard voltage amplification set-ups using operational amplifiers, in either single-ended or differential input configurations. in the single-ended case, illustrated in fig. 3, the output potential is given by:

vout = (1 + r2/r1) vin (4)

figure 3: voltage amplification.

the input voltage vin can thus be amplified at will, depending on the ratio r2/r1. the output voltage vout can then be read by the microcontroller. this amplification method has a much better precision than the previously mentioned methods: it does not require tuning at each data point and can be used to measure any small voltage, such as the voltage drop across a superconductor, the potential difference across a thermocouple, or the voltages needed to derive thermoelectric coefficients (seebeck coefficient or thermopower). going beyond this simple amplification method requires substantially more work. one possible route is to fabricate a microcontroller-based lock-in amplifier, as demonstrated in ref. [12].

iv. using an adc other than arduino's

the specifications of the arduino analog-to-digital converter are often the main limitation in the above measurements. as already mentioned, the arduino adc provides – at best – 10 bits on the 1.1 v internal reference voltage, and can only measure a voltage in single-ended configurations. one alternative would be to use another microcontroller with a better adc. for example, the low-cost frdm-kl25z from nxp [14] provides an adc that can measure a voltage either in single-ended or in differential input configurations with 16 bits on 3.3 v. however, the ease of use and the large user community can be a strong motivation to keep arduino as the board of choice. in that case, a second solution is to use an external adc when better resolution or a differential mode configuration is needed. for example, we have tested the ads1115 chip [15]: this external adc can measure 4 single-ended channels or 2 differential channels with 16 bits on 4.1 v. the possibility of a preamplification of up to 16 times brings the resolution down to 8 µv per bit instead of the standard 5 mv (or 1 mv with the 1.1 v internal reference). the differential input mode and the resolution better than 10 µv are two important advantages that open many interesting possibilities for physics measurements: for instance, measuring a strain gauge or a resistance in a four-wire configuration, or directly measuring a thermocouple or the resistance of a superconductor across its transition.
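as an illustration of this latter option, and assuming the adafruit ads1x15 arduino library (the class and function names below come from that library; the gain and channel choices are ours), a differential read-out at maximum gain might be sketched as follows:

```cpp
// illustrative sketch (assumes the adafruit ADS1X15 library): ads1115 differential
// reading between channels A0 and A1 at maximum gain (+/-0.256 V full scale).
#include <Wire.h>
#include <Adafruit_ADS1X15.h>

Adafruit_ADS1115 ads;                    // 16-bit external adc on the i2c bus

void setup() {
  Serial.begin(9600);
  ads.setGain(GAIN_SIXTEEN);             // +/-0.256 V range
  ads.begin();                           // default i2c address 0x48
}

void loop() {
  int16_t raw = ads.readADC_Differential_0_1();  // signed differential reading
  float microvolts = raw * 7.8125;               // lsb size at gain 16
  Serial.println(microvolts, 1);
  delay(200);
}
```

with a least significant bit of about 7.8 µv at this gain, such a reading is consistent with the resolution of roughly 8 µv per bit quoted above.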
the main drawback of this method is that it is not as easy as using the arduino adc: a library has to be installed first (but good tutorials can be found online, see for example ref. [15]). also, an external adc is generally not as robust as the arduino's adc, and the user should carefully monitor the voltage input so as not to damage the adc.

iv. evidencing a magnetocaloric effect with microcontrollers

to illustrate these methods, let us detail the magnetocaloric effect that we have measured. this effect consists of the temperature change occurring when a magnetic material is placed in a varying magnetic field. a more detailed explanation of this phenomenon can be found in ref. [16]. in our case, gadolinium (gd) was chosen for its paramagnetic properties and its curie temperature close to room temperature (tcurie = 292 k). at a temperature of about 298 k, a 2.242 g gd sample was subjected to the magnetic field created by a neodymium magnet of maximum value 0.51 t. in this experiment, the challenge was to measure the small temperature difference induced by the application of a magnetic field. to this end, a pt100 resistive thermometer was put in good thermal contact with the gd sample via thermal paste. the resistance change was measured by a wheatstone bridge with the following characteristics: r1 = r3 = 100 ω, r2 set at 108 ω to be close to balance at the considered temperature, and ve = 5 v provided by arduino's internal source. an additional resistance rc = 800 ω was placed in series to limit the current going through the pt100, thus avoiding heating the thermometer. ve is then replaced by ve (r1 + r2) / (r1 + r2 + 2 rc) in eq. (2). the off-balance potential difference v2 − v1 was differentially amplified with a gain of 100 (r4 = 1.5 kω and r5 = 150 kω). the voltage vout was then read by the board (an arduino mega in this case) using 2.56 v as the adc reference voltage [17]. the overall read-out circuit is schematically shown in fig. 4.

figure 4: measurement of the resistance of a pt resistive thermometer with a wheatstone bridge, amplified by an op-amp-based circuit.

the temperature is then inferred knowing that, in the [273 k, 323 k] range, the pt100 response can be linearly fitted by:

t [k] = 2.578 rpt [ω] + 15.35 (5)

as illustrated in fig. 5, the magnetocaloric effect is clearly visible, with an amplitude of about ∆t ≈ 0.33 ± 0.01 k and a time scale of a few seconds.

figure 5: magnetocaloric effect in a gd sample submitted to a 0.51 t magnetic field (blue background) before going back to the zero-field situation (white background). each data point corresponds to the average of 50 measurements. the noise level is of the order of 10 mk.

the resolution of the set-up corresponds to 50 mk (18 mω). each data point in fig. 5 corresponds to an average of 50 measurements, so that the effective noise that can be observed is about 10 mk, or about 5 mω in resistance. this yields a relative precision of a few 10⁻⁵ for the measurement, which is remarkable given the simplicity of the apparatus. when the magnet is taken away from the gd sample, the temperature decreases back to its initial value, as predicted by the isentropic character of the magnetocaloric effect.

v.
measuring a superconducting resistive transition with microcontrollers

the second experiment we would like to detail is the measurement of the superconducting resistive transition of an htcs. indeed, in such compounds, the critical temperature tc, below which the sample is superconducting and exhibits zero resistance, is larger than 77 k. the transition can therefore easily be observed by cooling the sample down to liquid nitrogen temperature and warming it back up to room temperature.

figure 6: amplification of the voltage drop across a superconducting sample. inset: geometry of the htcs sample.

in the present case, the htcs is a commercial bi2sr2ca2cu3o10 sample whose specifications indicate a critical temperature tc = 110 k at the mid-transition point and a room temperature resistivity of 1 mω.cm [18]. in this case, the experimental challenge is therefore to measure very small resistances with good precision. to achieve this, it is essential to adopt a four-wire measurement configuration for the superconductor, as schematized in the inset of fig. 6. indeed, in this way, neither the contact resistances nor the connection wires contribute to the measured resistance. moreover, the voltage drop across the superconductor has been amplified by a factor of 480 by a single-ended operational amplifier set-up, as shown in fig. 6. the current going through the superconductor is fixed by r0 = 110.0 ω ≫ r1, rhtcs, and is experimentally measured via the potential v read at the extremity of a homemade resistance r1 = 1.18 ω made out of copper wire. the temperature has been measured with a pt100 resistive thermometer using the set-up shown in fig. 1, with rref = 217.3 ω and a reference voltage of vref = 3.3 v provided by one of arduino uno's internal sources. both the sample and the thermometer have been attached to a printed circuit board and wrapped in cotton to ensure temperature homogeneity. the ensemble was placed in a polystyrene container filled with glass beads (inset of fig. 7). liquid nitrogen was then poured into the container and the temperature of the ensemble was allowed to increase back to room temperature while recording the data.

figure 7: resistive transition of a superconductor measured with a voltage amplification. inset: experimental setup. the blue polystyrene container is filled with glass beads. both the superconductor and the pt100 thermometer are immersed inside with liquid nitrogen.

in this manner, we have measured the resistive transition given in fig. 7. the experimental data have been averaged by a convolution with a gaussian of half width 0.4 k to take into account the error in the temperature measurement. as can be seen, the resolution of the measurement is of the order of 0.1 mω for the superconductor's resistance. the latter is actually dominated by the thermal gradient that may exist between the sample and the thermometer if the operator does not carefully check that both are close to one another, or if the container is forcefully warmed up (with a hair dryer, for instance). nonetheless, the precision of the measurement is good (< 1% relative uncertainty) and the measured mid-point tc is 112 k ± 2 k, very close to the value given by the specifications.

vi. conclusion

in conclusion, we have shown that standard temperature and resistance measurement methods could be adapted to microcontrollers.
the performances that are then attainable are sufficient to probe with reasonable sensitivity a large range of 100007-7 papers in physics, vol. 10, art. 100007 (2018) / c. a. marrache-kikuchi et al. physics phenomena such as thermoelectric effects, temperature-dependence of the resistivity, hall effect, magnetocaloric effect, etc. we have illustrated this with the measurements of magnetocaloric effect in gadolinium and of the resistive transition of a high critical temperature superconductor. we believe that the scope of inexpensive, transportable and easy-to-build experiments that are accessible through the use of single-board microcontrollers is continuously expanding and, in some cases, can now even replace standard characterization methods in research laboratories. furthermore, they provide a large range of opportunities to devise innovative teaching activities that enhances students involvement. acknowledgements we thank all the students who have participated in the arduino-based labworks. we thankfully acknowledge patrick puzo for welcoming this project-based teaching within the magistère de physique d’orsay curriculum. this work has been supported by a “pédagogie innovante” grant from idex paris-saclay. author contributions a. h. and g. l. designed and conducted the measurements of the magnetocaloric effect. a. l. and a. p. designed and conducted the measurements of the superconducting resistive transition. j. b., f. b., c. e., j. m. f., c. a. m-k., m. m., b. p. and q. q. have supervised and coordinated the studies. all authors have contributed to the redaction of the paper. [1] arduino project website, https://www.arduino.cc/ [2] f bouquet, j bobroff, m fuchs-gallezot, l maurines, project-based physics labs using low-cost open-source hardware, am. j. phys. 85, 216 (2017). [3] tmp36 specifications, https://www.arduino.cc/en/uploads/ main/temperaturesensor.pdf [4] v k merhar, r capuder, t maros̆evic̆, s artac̆, a mozer, m s̆tekovic̆, vic̆ goes to near space, the physics teacher 54, 482 (2016). [5] thermocouple operation temperature range, https://www.omega.co.uk/techref/ colorcodes.html [6] thermocouple amplifier, https://www.adafruit.com/product/3263 [7] characteristics of pt thermometers are available at the following website, https://www.lakeshore.com/documents/ lstc_platinum_l.pdf [8] platinum rtd sensor resistance to temperature conversion tables are available at the following websites, https://www.omega.com/techref/pdf/ z252-254.pdf, www.intech.co.nz/products/temperature/ typert/rtd-pt100-conversion.pdf [9] rtd-to-digital converter, https://www.maximintegrated.com/en/ products/sensors/max31865.html [10] g ireson, measuring the transition temperature of a superconductor in a pre-university laboratory, physics education 41, 556 (2006). [11] l m león-rossano, an inexpensive and easy experiment to measure the electrical resistance of high-tc superconductors as a function of temperature, am. j. phys. 65, 1024 (1997). [12] k d schultz, phase-sensitive detection in the undergraduate lab using a low-cost microcontroller, am. j. phys. 84, 557 (2016). [13] r henaff, g le doudic, b pilette, c even, j m fischbach, f bouquet, j bobroff, m monteverde, c a marrache-kikuchi, a study of kinetic friction: the timoshenko oscillator, am. j. phys. 86, 174 (2018). [14] https://os.mbed.com/platforms/kl25z/ [15] https://www.adafruit.com/product/1085 [16] v percharsky, k a gscheider jr, magnetocaloric effect and magnetic refrigeration, j. magn. magn. mater. 200, 44 (1999). 
[17] https://www.arduino.cc/reference/ en/language/functions/analog-io/ analogreference/ [18] https://shop.can-superconductors.com/ index.php?id_product=11&controller= product 100007-8 050001.dvi papers in physics, vol. 5, art. 050001 (2013) received: 12 december 2012, accepted: 2 february 2013 edited by: g. mindlin licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050001 www.papersinphysics.org issn 1852-4249 lt2c2: a language of thought with turing-computable kolmogorov complexity sergio romano,1∗ mariano sigman,2,3† santiago figueira1,3‡ in this paper, we present a theoretical effort to connect the theory of program size to psychology by implementing a concrete language of thought with turing-computable kolmogorov complexity (lt2c2) satisfying the following requirements: 1) to be simple enough so that the complexity of any given finite binary sequence can be computed, 2) to be based on tangible operations of human reasoning (printing, repeating,. . . ), 3) to be sufficiently powerful to generate all possible sequences but not too powerful as to identify regularities which would be invisible to humans. we first formalize lt2c2, giving its syntax and semantics and defining an adequate notion of program size. our setting leads to a kolmogorov complexity function relative to lt2c2 which is computable in polynomial time, and it also induces a prediction algorithm in the spirit of solomonoff’s inductive inference theory. we then prove the efficacy of this language by investigating regularities in strings produced by participants attempting to generate random strings. participants had a profound understanding of randomness and hence avoided typical misconceptions such as exaggerating the number of alternations. we reasoned that remaining regularities would express the algorithmic nature of human thoughts, revealed in the form of specific patterns. kolmogorov complexity relative to lt2c2 passed three expected tests examined here: 1) human sequences were less complex than control prng sequences, 2) human sequences were not stationary, showing decreasing values of complexity resulting from fatigue, 3) each individual showed traces of algorithmic stability since fitting of partial sequences was more effective to predict subsequent sequences than average fits. this work extends on previous efforts to combine notions of kolmogorov complexity theory and algorithmic information theory to psychology, by explicitly proposing a language which may describe the patterns of human thoughts. ∗e-mail: sgromano@dc.uba.ar †e-mail: sigman@df.uba.ar ‡e-mail: santiago@dc.uba.ar 1 department of computer science, fcen, university of buenos aires, pabellón i, ciudad universitaria (c1428ega) buenos aires, argentina. 2 laboratory of integrative neuroscience, physics department, fcen, university of buenos aires, pabellón i, ciudad universitaria (c1428ega) buenos aires, argentina. 3 conicet, argentina. i. introduction although people feel they understand the concept of randomness [1], humans are unable to produce random sequences, even when instructed to do so [2–6], and to perceive randomness in a way that is inconsistent with probability theory [7–10]. for instance, random sequences are not perceived by participants as such because runs appear too long to be random [11,12] and, similarly, sequences pro050001-1 papers in physics, vol. 5, art. 050001 (2013) / s romano et al. duced by participants aiming to be random have too many alternations [13, 14]. 
this bias, known as the gambler’s fallacy, is thought to result from an expectation of local representativeness (lr) of randomness [10] which ascribes chance to a selfcorrecting mechanism, promptly restoring the balance whenever disrupted. in words of tversky and kahneman [5], people apply the law of large numbers too hastily, as if it were the law of small numbers. the gambler’s fallacy leads to classic psychological illusions in real-world situations such as the hot hand perception by which people assume specific states of high performance, while analysis of records show that sequences of hits and misses are largely compatible with bernoulli (random) process [15,16]. despite massive evidence showing that perception and productions of randomness shows systematic distortions, a mathematical and psychological theory of randomness remains partly elusive. from a mathematical point of view —as discussed below— a notion of randomness for finite sequences presents a major challenge. from a psychological point of view, it remains difficult to ascribe whether the inability to produce and perceive randomness adequately results from a genuine misunderstanding of randomness or, instead, as a consequence of the algorithmic nature of human thoughts which is revealed in the forms of patterns and, hence, in the impossibility of producing genuine chance. in this work, we address both issues by developing a framework based on a specific language of thought by instantiating a simple device which induces a computable (and efficient) definition of algorithmic complexity [17–19]. the notion of algorithmic complexity is described in greater detail below but, in short, it assigns a measure of complexity to a given sequence as the length of the shortest program capable of producing it. if a sequence is algorithmically compressible, it implies that there may be a certain pattern embedded (described succinctly by the program) and hence it is not random. for instance, the binary version of champernowne’s sequence [20] 01101110010111011110001001101010111100 . . . consisting of the concatenation of the binary representation of all the natural numbers, one after another, is known to be normal in the scale of 2, which means that every finite word of length n occurs with a limit frequency of 2−n —e.g., the string 1 occurs with probability 2−1, the string 10 with probability 2−2, and so on. although this sequence may seem random based on its probability distribution, every prefix of length n is produced by a program much shorter than n. the theory of program size, developed simultaneously in the ’60s by kolmogorov [17], solomonoff [21] and chaitin [22], had a major influence in theoretical computer science. its practical relevance was rather obscure because most notions, tools and problems were undecidable and, overall, because it did not apply to finite sequences. a problem at the heart of this theory is that the complexity of any given sequence depends on the chosen language. for instance, the sequence x1 = 1100101001111000101000110101100110011100 which seems highly complex, may be trivially accounted by a single character if there is a symbol (or instruction of a programming language) which accounts for this sequence. this has its psychological analog in the kind of regularities people often extract: x2 = 1010101010101010101010101010101010101010 is obviously a non-random sequence, as it can succinctly be expressed as repeat 20 times: print ‘10’. 
(1) instead, the sequence x3 = 0010010000111111011010101000100010000101 appears more random and yet it is highly compressible as it consists of the first 40 binary digits of π after the decimal point. this regularity is simply not extracted by the human-compressor and demonstrates how the exceptions to randomness reveal natural patterns of thoughts [23]. the genesis of a practical (computable) algorithmic information theory [24] has had an influence (although not yet a major impact) in psychology. variants of kolmogorov complexity have been applied to human concept learning [25], to general theories of cognition [26] and to subjective randomness [23, 27]. in this last work, falk and konold showed that a simple measure, inspired in algorithmic notions, was a good correlate of perceived randomness [27]. griffiths & tenenbaum developed 050001-2 papers in physics, vol. 5, art. 050001 (2013) / s romano et al. statistical models that incorporate the detection of certain regularities, which are classified in terms of the chomsky hierarchy [23]. they showed the existence of motifs (repetition, symmetry) and related their probability distributions to kolmogorov complexity via levin’s coding theorem (cf. section vii. for more details). the main novelty of our work is to develop a class of specific programming languages (or turing machines) which allows us to stick to the theory of program size developed by kolomogorov, solomonoff and chaitin. we use the patterns of sequences of humans aiming to produce random strings to fit, for each individual, the language which captures these regularities. ii. mathematical theory of randomness the idea behind kolmogorov complexity theory is to study the length of the descriptions that a formal language can produce to identify a given string. all descriptions are finite words over a finite alphabet, and hence each description has a finite length —or, more generally, a suitable notion of size. one string may have many descriptions, but any description should describe one and only one string. roughly, the kolmogorov complexity [17] of a string x is the length of the shortest description of x. so a string is ‘simple’ if it has at least one short description, and it is ‘complex’ if all its descriptions are long. random strings are those with high complexity. as we have mentioned, kolmogorov complexity uses programming languages to describe strings. some programming languages are turing complete, which means that any partial computable function can be represented in it. the commonly used programming languages, like c++ or java, are all turing complete. however, there are also turing incomplete programming languages, which are less powerful but more convenient for specific tasks. in any reasonable imperative language, one can describe x2 above with a program like (1), of length 26, which is considerably smaller than 40, the size of the described string. it is clear that x2 is ‘simple’. the case of x3 is a bit tricky. although at first sight it seems to have a complete lack of structure, it contains a hidden pattern: it consists of the first forty binary digits of π after the decimal point. this pattern could hardly be recognized by the reader, but once it is revealed to us, we agree that x3 must also be tagged as ‘simple’. 
observe that the underlying programming language is central: x3 is ‘simple’ with the proviso that the language is strong enough to represent (in a reasonable way) an algorithm for computing the bits of π —a language to which humans are not likely to have access when they try to find patterns in a string. finally, for x1, the best way to describe it seems to be something like print ‘1100101001111000101000110101100110011100’, which includes the string in question verbatim, length 48. hence x1 only has long descriptions and hence it is ‘complex’. in general, both the string of length n which alternates 0s and 1s and the string which consists of the first n binary digits of π after the decimal point can be computed by a program of length ≈ log n — and this applies to any computable sequence. the idea of the algorithmic randomness theory is that a truly random string of length n necessarily needs a program of length ≈ n (cf. section ii. for details). i. languages, turing machines and kolmogorov complexity any programming language l can be formalized with a turing machine ml, so that programs of l are represented as inputs of ml via an adequate binary codification. if l is turing complete then the corresponding machine ml is called universal, which is equivalent to say that ml can simulate any other turing machine. let {0, 1}∗ denote the set of finite words over the binary alphabet. given a turing machine m, a program p and a string x (p, x ∈ {0, 1}∗), we say that p is an m-description of x if m(p) = x — i.e., the program p, when executed in the machine m, computes x. here we do not care about the time that the computation needs, or the memory it consumes. the kolmogorov complexity of x ∈ {0, 1}∗ relative to m is defined by the length of the shorter m-description of x. more formally, km(x) def = min{|p|: m(p) = x} ∪ {∞}, where |p| denotes the length of p. here m is any given turing machine, possibly one with a very specific behavior, so it may be the case that a given 050001-3 papers in physics, vol. 5, art. 050001 (2013) / s romano et al. string x does not have any m-description at all. in this case, m(x) = ∞. in practical terms, a machine m is a useful candidate to measure complexity if it computes a surjective function. in this case, every string x has at least one m-description and therefore km(x) < ∞. ii. randomness for finite words the strength of kolmogorov complexity appears when m is set to any universal turing machine u. the invariance theorem states that ku is minimal, in the sense that for every turing machine m there is a constant cm such that for all x ∈ {0, 1} ∗ we have ku(x) ≤ km(c) + cm . here, cm can be seen as the specification of the language m in u (i.e., the information contained in cm tells u that the machine to be simulated is m). if u and u′ are two universal turing then ku and ku′ differ at most by a constant. in a few words, ku(x) represents the length of the ultimate compressed version of x, performed by means of algorithmic processes. for analysis of arbitrarily long sequences, cm becomes negligible and hence for nonpractical aspects of the theory the choice of the machine is not relevant. however, for short sequences, as we study here, this becomes a fundamental problem, as notions of complexity are highly dependent on the choice of the underlying machine through the constant cm . the most trivial example, as referred in the introduction, is that for any given sequence, say x1, there is a machine m for which x1 has minimal complexity. iii. 
solomonoff induction here we have presented compression as a framework to understand randomness. another very influential paradigm proposed by schnorr is to use the notion of martingale (roughly, a betting strategy), by which a sequence is random if there is no computable martingale capable of predicting forthcoming symbols (say, of a binary alphabet {0, 1}) better than chance [28, 29]. in the 1960s, solomonoff [21] proposed a universal prediction method which successfully approximates any distribution µ, with the only requirement of µ being computable. this theory brings together concepts of algorithmic information, kolmogorov complexity and probability theory. roughly, the idea is that amongst all ‘explanations’ of x, those which are ‘simple’ are more relevant, hence following occam’s razor principle: amongst all hypothesis that are consistent with the data, choose the simplest. here the ‘explanations’ are formalized as programs computing x, and ‘simple’ means low kolmogorov complexity. solomonoff’s theory, builds on the notion of monotone (and prefix) turing machines. monotone machines are ordinary turing machines with a one-way read-only input tape, some work tapes, and a one-way write-only output tape. the output is written one symbol at a time, and no erasing is possible in it. the output can be finite if the machine halts, or infinite in case the machine computes forever. the output head of monotone machines can only “print and move to the right” so they are well suited for the problem of inference of forthcoming symbols based on partial (and finite) states of the output sequence. any monotone machine n has the monotonicity property (hence its name) with respect to extension: if p, q ∈ {0, 1}∗ then n(p) is a prefix of n(paq), where paq denotes the concatenation of p and q. one of solomonoff’s fundamental results is that given a finite observed sequence x ∈ {0, 1}∗, the most likely finite continuation is the one in which the concatenation of x and y is less complex in a kolmogorov sense. this is formalized in the following result (see theorem 5.2.3 of [24]): for almost all infinite binary sequences x (in the sense of µ) we have − lim n→∞ log µ(y | x↾n) = lim n→∞ kmu((x↾n) ay) − kmu(x↾n) + o(1) < ∞. here, x↾n represents the first n symbols of x, and kmu is the monotone kolmogorov complexity relative to a monotone universal machine u. that is, kmu(x) is defined as the length of the shortest program p such that the output of u(p) starts with x —and possibly has a (finite or infinite) continuation. in other words, solomonoff inductive inference leads to a method of prediction based on data compression, whose idea is that whenever the source has output the string x, it is a good heuristic to choose the extrapolation y of x that minimizes kmu(x ay). for instance, if one has observed x2, it is more likely for the continuation to be 1010 050001-4 papers in physics, vol. 5, art. 050001 (2013) / s romano et al. rather than 0101, as the former can be succinctly described by a program like repeat 22 times: print ‘10’. (2) and the latter looks more difficult to describe; indeed the shorter program describing it seems to be something like repeat 20 times: print ‘10’; (3) print ‘0101’. intuitively, as program (2) is shorter than (3), x2 a1010 is more probable than x2 a0101. hence, if we have seen x2, it seems to be a better strategy to predict 1. iii. a framework for human thoughts the notion of thought is not well grounded. 
we lack an operative working definition and, as also happens with other terms in neuroscience (consciousness, self, ...), the word thought is highly polysemic in common language. it may refer, for example, to a belief, to an idea or to the content of the conscious mind. due to this difficulty, the mere notion of thought has not been a principal or directed object of study in neuroscience, although of course it is always present implicitly, vaguely, without a formal definition. here we do not intend to elaborate an extensive review on the philosophical and biological conceptions of thoughts (see [30] for a good review on thoughts). nor are we in a theoretical position to provide a full formal definition of a thought. instead, we point to the key assumptions of our framework about the nature of thoughts. this accounts to defining constraints in the class of thoughts which we aim to describe. in other words, we do not claim to provide a general theory of human thoughts (which is not amenable at this stage lacking a full definition of the class) but rather of a subset of thoughts which satisfy certain constraints defined below. for instance, e.b. titchener and w. wundt, the founders of structuralist school in psychology (seeking structure in the mind without evoking metaphysical conceptions, a tradition which we inherit and to which we adhere), believed that thoughts were images (there are not imageless thoughts) and hence can be broken down to elementary sensations [30]. while we do not necessarily agree with this propositions (see carey [31] for more contemporary versions denying the sensory foundations of conceptual knowledge), here we do not intend to explain all possible thoughts but rather a subset, a simpler class which —in agreement with the wundt and titchener— can be expressed in images. more precisely, we develop a theory which may account for boole’s [32] notion of thoughts as propositions and statements about the world which can be represented symbolically. hence, a first and crucial assumption of our framework is that thoughts are discrete. elsewhere we have extensively discussed [33–39] how the human brain, whose architecture is quite different from turing machines, can emerge in a form of computation which is discrete, symbolic and resembles turing devices. second, here we focus on the notion of “propless” mental activity, i.e., whatever (symbolic) computations can be carried out by humans without resorting to external aids such as paper, marbles, computers or books. this is done by actually asking participants to perform the task “in their heads”. again, this is not intended to set a proposition about the universality of human thoughts but, instead, a narrower set of thoughts which we conceive is theoretically addressable in this mathematical framework. summarizing: 1. we think we do not have a good mathematical (even philosophical) conception of thoughts, as mental structures, yet. 2. intuitively (and philosophically), we adhere to a materialistic and computable approach to thoughts. broadly, one can think (to picture, not to provide a formal framework) that thoughts are formations of the mind with certain stability which defines distinguishable clusters or objects [40–42]. 3. while the set of such objects and the rules of their transitions may be of many different forms (analogous, parallel, unconscious, unlinked to sensory experience, non-linguistic, non-symbolic), here we work on a subset of thoughts, a class defined by boole’s attempt 050001-5 papers in physics, vol. 
5, art. 050001 (2013) / s romano et al. to formalize thought as symbolic propositions about the world. 4. this states —which may correspond to human “conscious rational thoughts”, the seed of boole and turing foundations [34,34]— are discrete and defined by symbols and potentially represented by a turing device. 5. we focus on an even narrower space of thoughts. binary formations (right or left, zero or one) to focus on what kind of language better describes these transitions. this work can be naturally extended to understand discrete transitions in conceptual formations [43–45]. 6. we concentrate on prop-less mental activity to understand limitations of the human mind when it does not have evident external support (paper, computer...) iv. implementing a language of thought with turingcomputable complexity as explained in section ii.i., kolmogorov complexity considers all possible computable compressors and assigns to a string x the length of the shortest of the corresponding compressions. this seems to be a perfect theory of compression but it has a drawback: the function ku is not computable, that is, there is no effective procedure to calculate ku(x) given x. on the other hand, the definition of randomness introduced in section ii.i., having very deep and intricate connections with algorithmic information and computability theories, is simply too strong to explain our own perception of randomness. to detect that x3 consists of the first twenty bits of π is incompatible with human patterns of thought. hence, the intrinsic algorithms (or observed patterns) which make human sequences not random are too restricted to be accounted by a universal machine and may be better described by a specific machine. furthermore, our hypothesis is that each person uses his own particular specific machine or algorithm to generate a random string. as a first step in this complicated enterprise, we propose to work with a specific language lt2c2 which meets the following requirements: • lt2c2 must reflect some plausible features of our mental activity when finding succinct descriptions of words. for instance, finding repetitions in a sequence such as x2 seems to be something easy for our brain, but detecting numerical dependencies between its digits as in x3 seems to be very unlikely. • lt2c2 must be able to describe any string in {0, 1}∗. this means that the map given by the induced machine n def = nlt2c2 must be surjective. • n must be simple enough so that kn —the kolmogorov complexity relative to n— becomes computable. this requirement clearly makes lt2c2 turing incomplete, but as we have seen before, this is consistent with human deviations from randomness. • the rate of compression given by kn must be sensible for very short strings, since our experiments will produce such strings. for instance, the approach, followed in [46], of using the size of the compressed file via general-purpose compressors like lempel-ziv based dictionary (gzip) or block based (bzip2) to approximate the kolmogorov complexity does not work in our setting. this method works best for long files. • lt2c2 should have certain degrees of freedom, which can be adjusted in order to approximate the specific machine that each individual follows during the process of randomness generation. we will not go into the details on how to codify the instructions of lt2c2 into binary strings of n: for the sake of simplicity we take n as a surjective total mapping lt2c2 → {0, 1}∗. 
we restrict ourselves to describing the grammar and semantics of our proposed programming language lt2c2. it is basically an imperative language with only two classes of instructions: a sort of print i, which prints the bit i to the output, and a sort of repeat n times p, which, for a fixed n ∈ ℕ, repeats the program p n times. the former is simply represented as i and the latter as (p)ⁿ. formally, we set the alphabet {0, 1, (, ), 0, . . . , 9} and define lt2c2 over this alphabet with the following grammar:

p ::= ε | 0 | 1 | p p | (p)ⁿ,

where n > 1 is the decimal representation of n ∈ ℕ and ε denotes the empty string. the semantics of lt2c2 is given through the behavior of n as follows (⌢ denotes concatenation):

n(i) := i, for i ∈ {ε, 0, 1}
n(p1 p2) := n(p1) ⌢ n(p2)
n((p)ⁿ) := n(p) ⌢ · · · ⌢ n(p)  (n times).

n is not universal, but every string x has a program in n which describes it: namely, x itself. furthermore, n is monotone in the sense that if p, q ∈ lt2c2 then n(p) is a prefix of n(p ⌢ q). in table 1, the first column shows some examples of n-programs which compute 1001001001.

program          size
1001001001       10
(100)²1(0)²1     6.6
(100)³1          4.5
1((0)²1)³        3.8

table 1: some n-descriptions of 1001001001 and their sizes for b = r = 1.

i. kolmogorov complexity for lt2c2

the kolmogorov complexity relative to n (and hence to the language lt2c2) is defined as

kn(x) := min{ ‖p‖ : p ∈ lt2c2, n(p) = x },

where ‖p‖, the size of a program p, is inductively defined as:

‖ε‖ := 0
‖p‖ := b, for p ∈ {0, 1}
‖p1 p2‖ := ‖p1‖ + ‖p2‖
‖(p)ⁿ‖ := r · log n + ‖p‖.

in the above definition, b ∈ ℕ and r ∈ ℝ are two parameters that control the relative weight of the print operation and the repeat n times operation. in the sequel, we drop the subindex of kn and simply write k := kn. table 1 shows some examples of the size of n-programs when b = r = 1. observe that for all x we have k(x) ≤ ‖x‖. it is not difficult to see that k(x) depends only on the values of k(y), where y is any nonempty and proper substring of x. since ‖·‖ is computable in polynomial time, using dynamic programming one can calculate k(x) in polynomial time. this, of course, is a major difference with respect to the kolmogorov complexity relative to a universal machine, which is not computable.

ii. from compression to prediction

as one can imagine, the perfect universal prediction method described in section ii.iii. is, again, non-computable. we define a computable prediction algorithm based on solomonoff's theory of inductive inference, but using k, the kolmogorov complexity relative to lt2c2, instead of kmu (which depends on a universal machine). to predict the next symbol of x ∈ {0, 1}∗, we follow the idea described in section ii.iii.: amongst all extrapolations y of x we choose the one that minimizes k(x ⌢ y). if such a y starts with 1, we predict 1, else we predict 0. since we cannot examine the infinitely many extrapolations, we restrict ourselves to those up to a fixed length ℓf. also, we do not take into account the whole of x but only a suffix of length ℓp. both ℓf and ℓp are parameters which control, respectively, how many extrapolation bits are examined (ℓf future bits) and how many bits of the tail of x are considered (ℓp past bits). let {0, 1}ⁿ (resp. {0, 1}^{≤n}) denote the set of words over the binary alphabet {0, 1} of length n (resp. at most n). formally, the prediction method is as follows. suppose x = x1 · · · xn (xi ∈ {0, 1}) is a string.
the next symbol is determined as follows:

next(x1 · · · xn) :=
  0, if m0 < m1;
  1, if m0 > m1;
  g(x_{n−ℓp} · · · x_n), otherwise;

where, for i ∈ {0, 1},

m_i := min{ k(x_{n−ℓp} · · · x_n ⌢ i ⌢ y) : y ∈ {0, 1}^{≤ℓf} },

and g : {0, 1}^{ℓp} → {0, 1} is defined as g(z) = i if the number of occurrences of i in z is greater than the number of occurrences of 1 − i in z; in case the numbers of occurrences of 1s and 0s in z coincide, g(z) is defined as the last bit of z.

v. methods

thirty-eight volunteers (mean age = 24) participated in an experiment designed to examine the capacity of lt2c2 to identify regularities in the production of binary sequences. participants were asked to produce random sequences, without further instruction. all the participants were college students or graduates with programming experience and knowledge of the theoretical foundations of randomness and computability. this was intended to test these ideas on a hard sample, where we did not expect the typical errors that result from a misunderstanding of chance. the experiment was divided into four blocks. in each block the participant freely pressed the left or right arrow key 120 times. after each key press, the participant received a notification in the form of a green square which progressively filled a line, indicating to the participant the number of choices already made. at the end of the block, participants were given feedback on how many times the prediction method had correctly predicted their input. after this point, a new trial would start. the 38 participants performed 4 sequences each, yielding a total of 152 sequences. 14 sequences were excluded from the analysis because they had an extremely high level of predictability; including these sequences would have actually improved all the scores reported here. the experiment was programmed in actionscript and can be seen at http://gamesdata.lafhis-server.exp.dc.uba.ar/azarexp.

vi. results

i. law of large numbers

any reasonable notion of randomness for strings in base 2 should imply borel's normality, or the law of large numbers, in the sense that if x ∈ {0, 1}ⁿ is random then the number of occurrences of any given string y in x divided by n should tend to 2^{−|y|} as n goes to infinity. a well-known result obtained in investigations on the generation or perception of randomness in binary sequences is that people tend to increase the number of alternations of symbols with respect to the expected value [27]. given a string x of length n with r runs, there are n − 1 transitions between successive symbols and the number of alternations between symbol types is r − 1. the probability of alternation of the string x is defined as

p : {0, 1}^{≥2} → [0, 1], p(x) = (r − 1) / (n − 1).

in our experiment, the average p(x) of the participants was 0.51, very close to the expected probability of alternation of a random sequence, which should be 0.5. a t-test on the p(x) of the strings produced by participants, where the null hypothesis is that they are a random sample from a normal distribution with mean 0.5, shows that the hypothesis cannot be rejected, as the p-value is 0.31 and the confidence interval on the mean is [0.49, 0.53]. this means that the probability of alternation is not a good measure to distinguish the participants' strings from random ones, or at least, that the participants in this experiment can bypass this validation.
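as a concrete companion to the definitions of sections iv.i. and iv.ii., a minimal sketch of how k and the next-symbol rule might be computed is given below; it uses b = r = 1 and a base-10 logarithm (which reproduces the sizes quoted in table 1), together with a brute-force enumeration of the extrapolations y. the function and variable names, and the default values of ℓp and ℓf, are our own illustrative choices, not the implementation used for the results that follow.

```cpp
// illustrative implementation of k (section iv.i.) and next (section iv.ii.),
// with b = r = 1 and log base 10 so that the sizes of table 1 are reproduced.
#include <algorithm>
#include <cmath>
#include <iostream>
#include <string>
#include <vector>

// dynamic programming over all substrings: a substring is produced either by a
// verbatim print, by the concatenation of two shorter programs, or by an n-fold
// repetition (p)^n of a program describing one period of the substring.
double lt2c2Complexity(const std::string& x, double b = 1.0, double r = 1.0) {
  const int n = x.size();
  if (n == 0) return 0.0;
  // k[i][len] = complexity of the substring of length len starting at position i
  std::vector<std::vector<double>> k(n, std::vector<double>(n + 1, 0.0));
  for (int len = 1; len <= n; ++len)
    for (int i = 0; i + len <= n; ++i) {
      double best = b * len;                          // verbatim program "x"
      for (int cut = 1; cut < len; ++cut)             // concatenation p1 p2
        best = std::min(best, k[i][cut] + k[i + cut][len - cut]);
      for (int per = 1; per < len; ++per) {           // repetition (p)^(len/per)
        if (len % per != 0) continue;
        bool periodic = true;
        for (int j = per; j < len && periodic; ++j)
          periodic = (x[i + j] == x[i + j - per]);
        if (periodic)
          best = std::min(best, r * std::log10(double(len / per)) + k[i][per]);
      }
      k[i][len] = best;
    }
  return k[0][n];
}

// next-symbol rule: append 0 or 1 to the last lp observed bits, try every
// extrapolation y with |y| <= lf, and keep the symbol giving the lowest k.
char lt2c2Predict(const std::string& x, int lp = 8, int lf = 4) {
  std::string tail = (int)x.size() > lp ? x.substr(x.size() - lp) : x;
  double m[2];
  for (int i = 0; i < 2; ++i) {
    m[i] = 1e18;
    for (int len = 0; len <= lf; ++len)               // enumerate y in {0,1}^{<=lf}
      for (int mask = 0; mask < (1 << len); ++mask) {
        std::string y;
        for (int bit = 0; bit < len; ++bit) y += ((mask >> bit) & 1) ? '1' : '0';
        m[i] = std::min(m[i], lt2c2Complexity(tail + char('0' + i) + y));
      }
  }
  if (m[0] < m[1]) return '0';
  if (m[1] < m[0]) return '1';
  int ones = std::count(tail.begin(), tail.end(), '1');   // tie-break: g(z)
  int zeros = (int)tail.size() - ones;
  if (ones != zeros) return ones > zeros ? '1' : '0';
  return tail.empty() ? '0' : tail.back();                // equal counts: last bit
}

int main() {
  std::cout << lt2c2Complexity("1001001001") << "\n";          // ~3.8, cf. table 1
  std::cout << lt2c2Predict("10101010101010101010") << "\n";   // predicts '1'
}
```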
although the probability of alternation was close to the expected value for a random string, participants tended to produce n-grams of length ≥ 2 with probability distributions that are not equiprobable (see fig. 1). strings containing more alternations (like 1010, 0101, 010, 101) as well as 3- and 4-runs have a higher frequency than expected by chance. this might be seen as an effort by participants to keep the probability of alternation close to 0.5 by compensating the excess of alternations with blocks of repetitions of the same symbol.

ii. comparing human randomness with other random sources

we asked whether k, the kolmogorov complexity relative to lt2c2 defined in section iv.i., is able to detect and compress more patterns in strings generated by participants than in strings produced by other sources which are considered random for many practical purposes. in particular, we studied strings originating from two sources: a pseudorandom number generator (prng) and atmospheric noise (an).

figure 1: frequency of sub-strings up to length 4.

for the prng source, we chose the mersenne twister algorithm [47] (specifically, the second revision from 2002 that is currently implemented in the gnu scientific library). the atmospheric noise was taken from the random.org site (property of randomness and integrity services limited), which also runs the real-time statistical tests recommended by the us national institute of standards and technology to ensure the random quality of the numbers produced over time. in table 2, we summarize our results using b = 1 and r = 1 for the parameters of k as defined in section iv.i.

                participants   prng    an
mean µ              48.43      52.99   53.88
std σ                6.62       3.06    2.87
1st quartile        45.30      50.42   51.88
median              49.23      53.15   53.85
3rd quartile        51.79      55.21   55.79

table 2: values of k(x), where x is a string produced by the participants, the prng or the an source.

the mean and median of k increase when going from the participants' strings to the prng or an strings. this difference was significant, as confirmed by a t-test (p-value of 4.9 × 10⁻¹¹ when comparing the participants' sample with the prng one, a p-value of 1.2 × 10⁻¹⁵ when comparing the participants' sample with an, and a p-value of 1.4 × 10⁻² when comparing the prng with the an sample). therefore, despite the simplicity of lt2c2, based merely on prints and repeats, it is rich enough to identify regularities of human sequences. the k function relative to lt2c2 is an effective and significant measure to distinguish strings produced by participants with a profound understanding of the mathematics of randomness from prng and an strings. as expected, humans produce less complex (i.e., less random) strings than those produced by prng or atmospheric noise sources.

iii. mental fatigue

on cognitively demanding tasks, fatigue affects performance by deteriorating the capacity to organize behavior [48–52]. specifically, weiss claims that boredom may be a factor that increases non-randomness [48]. hence, as another test of the ability of k relative to lt2c2 to identify idiosyncratic elements of human regularities, we asked whether the random quality of the participants' strings deteriorated with time. for each of the 138 strings x = x1 · · · x120 (xi ∈ {0, 1}) produced by the participants, we measured the k complexity of all the sub-strings of length 30.
specifically, we calculated the average $k(x_i\cdots x_{i+30})$ over the 138 strings for each $i \in [0, 90]$ (see fig. 2), using the same parameters as in section vi.ii. (b = r = 1), and compared it to the same sliding-average procedure for the prng (fig. 3) and an sources (fig. 4). the sole source which showed a significant linear regression was the human-generated data (see table 3) which, as expected, showed a negative correlation, indicating that participants produced less complex or random strings over time (slope −0.007, p < 0.02). the finding of a fatigue-related effect shows that the unpropped, i.e., resource-limited, human turing machine is not only limited in terms of the language it can parse, but also in terms of the amount of time it can dedicate to a particular task.

figure 2: average of $k(x_i\cdots x_{i+30})$ for participants.

figure 3: average of $k(x_i\cdots x_{i+30})$ for prng.

figure 4: average of $k(x_i\cdots x_{i+30})$ for an.

                participants        prng             an
mean slope         -0.007           0.0016          -0.0005
p-value             0.02            0.5              0.8
ci             [-0.01, -0.001]  [-0.003, 0.006]  [-0.005, 0.004]

table 3: predictability

iv. predictability

in section iv.ii., we introduced a prediction method with two parameters: $\ell_f$ and $\ell_p$. a predictor based on lt2c2 achieved levels of predictability close to 56%, which were highly significant (see table 4). the predictor, as expected, performed at chance for the control prng and an data. this fit was relatively insensitive to the values of $\ell_p$ and $\ell_f$, contrary to our intuition that there may be a memory scale which would correspond in this framework to a given length. a very important aspect of this investigation, in line with the prior work of [23], is to inquire whether specific parameters are stable for a given individual. to this aim, we optimized, for each participant, the parameters using the first 80 symbols of the sequence and then tested these parameters in the second half of each segment (last 80 symbols of the sequence). after this optimization procedure, mean predictability increased significantly to 58.14% (p < 0.002, see table 5). as expected, the optimization based on partial data of the prng and an sources resulted in no improvement in the classifier, which remained at chance with no significant difference (p < 0.3, p < 0.2, respectively). hence, while the specific parameters for compression vary widely across individuals, they show stability on the time-scale of this experiment.

                participants   prng    an
mean µ              56.16     50.69   49.48
std σ                0.07      0.02    0.02
1st quartile        49.97     48.84   48.30
median              55.02     50.77   49.04
3rd quartile        59.75     52.21   50.46

table 4: average predictability

                participants   prng    an
mean µ              58.14     51.20   49.01
std σ                0.07      0.04    0.03
1st quartile        52.88     48.56   47.11
median              56.73     50.72   49.28
3rd quartile        62.02     53.85   50.48

table 5: optimized predictability

vii. discussion

here we analyzed strings produced by participants attempting to generate random strings. participants had a profound understanding of randomness and hence avoided typical misconceptions such as exaggerating the number of alternations. we reasoned that remaining regularities would express the algorithmic nature of human thoughts, revealed in the form of specific patterns.
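the sliding-window fatigue analysis of section vi.iii above can be summarized by the following sketch (python with numpy/scipy; this is our own illustrative code, not the authors' pipeline, and the function lt2c2_complexity is a placeholder standing in for the actual lt2c2-based measure k):

```python
import zlib
import numpy as np
from scipy import stats

def lt2c2_complexity(s):
    """placeholder for the lt2c2 complexity k(x); a crude stand-in
    (compressed length via zlib) is used purely to make the sketch run."""
    return len(zlib.compress(s.encode()))

def sliding_complexity(string, window=30):
    """complexity of every sub-string of the given window length."""
    return [lt2c2_complexity(string[i:i + window])
            for i in range(len(string) - window + 1)]

# hypothetical set of 120-symbol binary strings produced by participants
strings = ["01101001" * 15, "10010110" * 15]

# average the window complexity over strings at each offset i, then regress on i
profiles = np.array([sliding_complexity(s) for s in strings])
mean_profile = profiles.mean(axis=0)
i = np.arange(len(mean_profile))
slope, intercept, r, p, stderr = stats.linregress(i, mean_profile)
print(slope, p)  # a significant negative slope would indicate decreasing randomness
```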
our effort here was to bridge the gap between kolmogorov theory and psychology, developing a concrete language, lt2c2, satisfying the following requirements: 1) to be simple enough so that the complexity of any given sequence can be computed, 2) to be based on tangible operations of human reasoning ( printing, repeating, . . . ), 3) to be sufficiently powerful to generate all possible sequences but not too powerful as to identify regularities which would be invisible to humans. more specifically, our aim is to develop a class of languages with certain degrees of freedom which can then be fit to an individual (or an individual in a specific context and time). here, we opted for a comparably easier strategy by only allowing the relative cost of each operation to vary. however, a natural extension of this framework is to generate classes of languages where structural and qualitative aspects of the language are free to vary. for instance, one can devise a program structure for repeating portions of (not necessarily neighboring) code, or considering the more general framework of for-programs where the repetitions are more general than in our setting: for i=1 to n do p(i), where p is a program that uses the successive values of i = 1, 2, . . . , n in each iteration. for instance, the following program for i=1 to 6 do print ‘0’ repeat i times: print ‘1’ would describe the string 010110111011110111110111111. the challenge from the computational theoretical point of view is to define an extension which induces a computable (even more, feasible, whenever possible) kolmogorov complexity. for instance, adding simple control structures like conditional jumps or allowing the use of imperative program variables may turn the language into turing-complete, with the theoretical consequences that we already mentioned. the aim is to keep the language simple and yet include structures to compact some patterns which are compatible with the human language of thought. we emphasize that our aim here was not to generate an optimal predictor of human sequences. clearly, restricting lt2c2 to a very rudimentary language is not the way to go to identify vast classes of patterns. our goal, instead, was to use human sequences to calibrate a language which expresses and captures specific patterns of human thought in a tangible and concrete way. our model is based on ideas from kolmogorov complexity and solomonoff’s induction. it is important to compare it to what we think is the closest and more similar approach in previous studies: the work [23] of griffiths and tenenbaum’s. griffiths and tenenbaum devise a series of statistical models that account for different kind of regularities. each model z is fixed and assigns to every binary string x a probability pz(x). this probabilistic approach is connected to kolmogorov complexity theory via levin’s famous coding theorem, which points out a remarkably numerical relation between the algorithmic probability pu (x) (the probability that the universal prefix turing machine u outputs x when the input is filled-up with the results of coin tosses) and the (prefix) kolmogorov complexity ku described in section ii.i. formally, the theorem states that there is a constant c such that for any string x ∈ {0, 1}∗ such that | − log pu (x) − ku(x)| ≤ c (4) (the reader is referred to section 4.3.4 of [24] for more details). griffiths & tenenbaum’s bridge to kolmogorov complexity is only established through this last theoretical result: replacing pu by pz in eq. 
(4) should automatically give us some kolmogorov complexity kz with respect to some underlying turing machine z. while there is hence a formal relation to kolmogorov complexity, there is no explicit definition of the underlying machine, and hence no notion of program. on the contrary, we propose a specific language 050001-11 papers in physics, vol. 5, art. 050001 (2013) / s romano et al. of thought, formalized as the programming language lt2c2 or, alternatively, as a turing machine n, which assigns formal semantics to each program. semantics are given, precisely, through the behavior of n. the fundamental introduction of program semantics and the clear distinction between inputs (programs of n) and outputs (binary strings) allows us to give a straightforward definition of kolmogorov complexity relative to n, denoted kn, which —because of the choice of lt2c2— becomes computable in polynomial time. once we count with a complexity function, we apply solomonoff’s ideas of inductive inference to obtain a predictor which tries to guess the continuation of a given string under the assumption that the most probable one is the most compressible in terms of lt2c2-kolmogorov complexity. as in [23], we also make use of the coding theorem (4), but in the opposite direction: given the complexity kn, we derive an algorithmic probability pn. this work is mainly a theoretical development, to develop a framework to adapt kolmogorov ideas in a constructive procedure (i.e., defining an explicit language) to identify regularities in human sequences. the theory was validated experimentally, as three tests were satisfied: 1) human sequences were less complex than control prng sequences, 2) human sequences were non-stationary, showing decreasing values of complexity, 3) each individual showed traces of algorithmic stability since fitting of partial data was more effective to predict subsequent data than average fits. our hope is that this theory may constitute, in the future, a useful framework to ground and describe the patterns of human thoughts. acknowledgements the authors are thankful to daniel goŕın and guillermo cecchi for useful discussions. s. figueira is partially supported by grants pict-2011-0365 and ubacyt 20020110100025. [1] m kac, what is random?, am. sci. 71, 405 (1983). [2] h reichenbach, the theory of probability, university of california press, berkeley (1949). [3] g s tune, response preferences: a review of some relevant literature, psychol. bull. 61, 286 (1964). [4] a d baddeley, the capacity for generating information by randomization, q. j. exp. psychol. 18, 119 (1966). [5] a tversky, d kahneman, belief in the law of small numbers, psychol. bull. 76, 105 (1971). [6] w a wagenaar, randomness and randomizers: maybe the problem is not so big, j. behav. decis. making 4, 220 (1991). [7] r falk, perception of randomness, unpublished doctoral dissertation, hebrew university of jerusalem (1975). [8] r falk, the perception of randomness, in: proceedings of the fifth international conference for the psychology of mathematics education, vol. 1, pag. 222, grenoble, france (1981). [9] d kahneman, a tversky, subjective probability: a judgment of representativeness, cognitive psychol. 3, 430 (1972). [10] a tversky, d kahneman, subjective probability: a judgment of representativeness, cognitive psychol. 3, 430 (1972). [11] t gilovich, r vallone, a tversky, the hot hand in basketball: on the misperception of random sequences, cognitive psychol. 17, 295 (1985). 
[12] w a wagenaar, g b keren, chance and luck are not the same, j. behav. decis. making 1, 65 (1988). [13] d budescu, a rapoport, subjective randomization in one- and two-person games, j. behav. decis. making 7, 261 (1994). [14] a rapoport, d v budescu, generation of random series in two-person strictly competitive games, j. exp. psychol. gen. 121, 352 (1992). [15] p d larkey, r a smith, j b kadane, it's okay to believe in the hot hand, chance: new directions for statistics and computing 2, 22–30 (1989). [16] a tversky, t gilovich, the "hot hand": statistical reality or cognitive illusion?, chance: new directions for statistics and computing 2, 31 (1989). [17] a n kolmogorov, three approaches to the quantitative definition of information, probl. inf. transm. 1, 1 (1965). [18] g j chaitin, a theory of program size formally identical to information theory, j. acm 22, 329 (1975). [19] l a levin, a k zvonkin, the complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms, russ. math. surv. 25, 83 (1970). [20] d g champernowne, the construction of decimals in the scale of ten, j. london math. soc. 8, 254 (1933). [21] r j solomonoff, a formal theory of inductive inference: part i, inform. control 7, 1 (1964); ibid. part ii, 7, 224 (1964). [22] g chaitin, on the length of programs for computing finite binary sequences: statistical considerations, j. acm 13, 547 (1969). [23] t l griffiths, j b tenenbaum, probability, algorithmic complexity, and subjective randomness, in: proceedings of the twenty-fifth annual conference of the cognitive science society, eds. r alterman, d hirsh, pag. 480, cognitive science society, boston (ma, usa) (2003). [24] m li, p m vitányi, an introduction to kolmogorov complexity and its applications, springer, berlin, 3rd edition (2008). [25] j feldman, minimization of boolean complexity in human concept learning, nature (london) 407, 630 (2000). [26] n chater, the search for simplicity: a fundamental cognitive principle?, q. j. exp. psychol. 52a, 273 (1999). [27] r falk, c konold, making sense of randomness: implicit encoding as a bias for judgment, psychol. rev. 104, 301 (1997). [28] c p schnorr, zufälligkeit und wahrscheinlichkeit, lecture notes in mathematics vol. 218, springer-verlag, berlin, new york (1971). [29] c p schnorr, a unified approach to the definition of a random sequence, math. syst. theory 5, 246 (1971). [30] d dellarosa, a history of thinking, in: the psychology of human thought, eds. r j sternberg, e e smith, cambridge university press, cambridge (usa) (1988). [31] s carey, the origin of concepts, oxford university press, oxford (usa) (2009). [32] g boole, an investigation of the laws of thought: on which are founded the mathematical theories of logic and probabilities, vol. 2, walton and maberly, london (1854). [33] a zylberberg, s dehaene, g mindlin, m sigman, neurophysiological bases of exponential sensory decay and top-down memory retrieval: a model, front. comput. neurosci. 3, 4 (2009). [34] a zylberberg, s dehaene, p roelfsema, m sigman, the human turing machine: a neural framework for mental programs, trends cogn. sci. 15, 293 (2011). [35] m graziano, p polosecki, d shalom, m sigman, parsing a perceptual decision into a sequence of moments of thought, front. integ. neurosci. 5, 45 (2011). [36] a zylberberg, p barttfeld, m sigman, the construction of confidence in a perceptual decision, front. integ.
neurosci. 6, 79 (2012). [37] d shalom, b dagnino, m sigman, looking at breakout: urgency and predictability direct eye events, vision res. 51, 1262 (2011). [38] s dehaene, m sigman, from a single decision to a multi-step algorithm, curr. opin. neurobiol. 22, 937 (2012). [39] j kamienkowski, h pashler, s dehaene, m sigman, effects of practice on task architecture: combined evidence from interference experiments and random-walk models of decision making, cognition 119, 81 (2011). 050001-13 papers in physics, vol. 5, art. 050001 (2013) / s romano et al. [40] a zylberberg, d slezak, p roelfsema, s dehaene, m sigman, the brain’s router: a cortical network model of serial processing in the primate brain, plos comput. biol. 6, e1000765 (2010). [41] l gallos, h makse, m sigman, a small world of weak ties provides optimal global integration of self-similar modules in functional brain networks, p. natl. acad. sci. usa 109, 2825 (2012). [42] l gallos, m sigman, h makse, the conundrum of functional brain networks: smallworld efficiency or fractal modularity, front. physiol. 3 123 (2012). [43] m costa, f bonomo, m sigman, scaleinvariant transition probabilities in free word association trajectories, front. integ. neurosci. 3 19 (2009). [44] n mota, n vasconcelos, n lemos, a pieretti, o kinouchi, g cecchi, m copelli, s ribeiro, speech graphs provide a quantitative measure of thought disorder in psychosis, plos one 7, e34928 (2012). [45] m sigman, g cecchi, global organization of the wordnet lexicon, p. natl. acad. sci. usa 99, 1742 (2002). [46] r cilibrasi, p m vitányi, clustering by compression, ieee t. inform. theory 51, 1523 (2005). [47] m matsumoto, t nishimura, mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator, acm trans. model. comput. simul. 8, 3 (1998). [48] r l weiss, on producing random responses, psychon. rep. 14, 931 (1964). [49] f bartlett, fatigue following highly skilled work, nature (london) 147, 717 (1941). [50] d e broadbent, is a fatigue test now possible?, ergonomics 22, 1277 (1979). [51] w floyd, a welford, symposium on fatigue and symposium on human factors in equipment design, eds. w f floyd, a t welford, arno press, new york (1953). [52] r hockey, stress and fatigue in human performance, wiley, chichester (1983). 050001-14 papers in physics, vol. 4, art. 040003 (2012) received: 14 march 2011, accepted: 23 april 2012 edited by: v. lakshminarayanan reviewed by: c. negreira, laboratorio de acústica ultrasonora, universidad de la república, uruguay licence: creative commons attribution 3.0 doi: 10.4279/pip.040003 www.papersinphysics.org issn 1852-4249 natural and laser-induced cavitation in corn stems: on the mechanisms of acoustic emissions e. fernández,1 r. j. fernández,1 g. m. bilmes2∗ water in plant xylem is often superheated, and therefore in a meta-stable state. under certain conditions, it may suddenly turn from the liquid to the vapor state. this cavitation process produces acoustic emissions. we report the measurement of ultrasonic acoustic emissions (uae) produced by natural and induced cavitation in corn stems. we induced cavitation and uae in vivo, in well controlled and reproducible experiments, by irradiating the bare stem of the plants with a continuous-wave laser beam. by tracing the source of uae, we were able to detect absorption and frequency filtering of the uae propagating through the stem. 
this technique allows the unique possibility of studying localized embolism of plant conduits, and thus of testing hypotheses on the hydraulic architecture of plants. based on our results, we postulate that the source of uae is a transient "cavity oscillation" triggered by the disruptive effect of cavitation inception.

∗e-mail: gabrielb@ciop.unlp.edu.ar
1 ifeva, facultad de agronomía, universidad de buenos aires y conicet, av. san martín 4453, c1417dse buenos aires, argentina.
2 centro de investigaciones opticas (conicet-cic) and facultad de ingeniería, universidad nacional de la plata, casilla de correo 124, 1900 la plata, argentina.

i. introduction

the cohesion-tension theory suggests that water in the xylem of transpiring plants is under tension, with a hydrostatic pressure below atmospheric and, thus, most of the time at "negative" values [1]. negative pressures mean that water in the xylem has a reduced density compared to equilibrium [2]. according to its phase diagram, water under these conditions is overheated (i.e., in a metastable state). therefore, it should not be in the liquid but in the vapor phase [3]. the molecules in the liquid phase are further away from each other, but their mutual attraction allows the system to remain unchanged. under sufficiently high tension (i.e., low pressures caused by water deficit), xylem may fail to maintain this state, causing liquid water to turn into vapor in a violent way. this phenomenon, usually known as cavitation, causes the embolism of the conduits, reducing tissue hydraulic conductivity and exacerbating plant physiological stress [4, 5]. some herbaceous species are known to sustain cavitation almost every day, repairing embolism during the night, while most woody species preclude cavitation occurrence by a combination of stomatal behavior and anatomical and morphological adjustment [6]. cavitation events in xylem produce sound [7, 8]. in 1966, milburn and johnson developed a technique to detect sound by registering 'clicks' in a record player pick-up head attached to stressed plants and connected to an amplifier [9]. they associated this sound emission with the rupture of the water column in xylem vessels. since then, several authors have used this audible acoustic emission technique to measure xylem cavitation [10, 11]. later on, some authors have improved the technique by detecting ultrasonic acoustic emissions (uae) [12–17]. these authors have demonstrated a good correlation between uae and cavitation. however, the connection between audible or ultrasonic acoustic emissions and cavitation phenomena in xylem vessels remains unexplained. tyree and dixon [12] proposed four possible sources of acoustic emissions that we will consider in the discussion section. other authors have developed explanations based on alternatives to the cohesion-tension theory [3, 18]. one of the problems of studying cavitation in plants is the spontaneous character of the phenomenon, so far precluding our ability to produce it in a controlled way. most cavitation experiments use transpiration to raise xylem water tension, the trigger for cavitation events. in some cases, xylem tension was increased by centrifugation [19, 20], but even there, cavitation events took place rather randomly along the water column. on the other hand, in order to study bubble behavior in isolated physical systems, several authors explored the generation of cavitation phenomena using lasers [21–24].
in these experiments, cavitation bubbles are generated in a very well defined location, taking advantage of the accuracy of the laser beam. even though this technique was developed to generate cavitation in transparent environments, we wondered whether it could be used in biological systems to generate cavitation at specific locations along the stem. in this article, we report spontaneous uae produced by natural cavitation in xylem vessels of corn (zea mays l.) stems, and we characterize and classify the signals. we also developed a method to produce laser-induced cavitation and uae events in a controlled way by irradiating plants with a continuous-wave laser. we performed experiments with this method to study the generation and propagation of uae. our results allowed us to explain the connection between cavitation and uae, as well as the relationship between signal frequency and the localization of the source in the stem.

ii. materials and methods

corn plants were grown in a greenhouse in 3 l pots containing sand. they were watered at field capacity every 1-2 days with nutritive solution (3 g l−1 of ksc ii – roulier). after four months, tasseling plants (around 1 m high and 12 mm stem width) were used to perform experiments under different conditions: total darkness (d); room diffuse light (rl; par ca. 100 µe m−2 s−1); leaf illumination with a 150 w incandescent lamp (il; placed ca. 0.5 m away), and laser irradiation (l). in the latter case, experiments were carried out by directing the beam of a 50 mw he–ne red laser (630 nm), or a continuous-wave (cw) ar-ion laser (spectra physics model 165/09), directly onto the stems. most experiments were conducted sequentially in 3-5 plants, and we report the full range of observed results. ultrasonic acoustic emissions generated in the stems of plants were recorded by home-made pzt-based piezoelectric transducers (4 × 4 mm, 230 khz) [25] coated with glycerin and clamped to the bare stem by a three-prong thumbtack. signals, of the order of 1 mv, were amplified (gain 10^3) and recorded on a digital storage oscilloscope. different transducer positions in the stems were explored, as well as simultaneous measurement of uae with two detectors attached to different points of the stems, providing a method to trace the origin of the signals (fig. 1).

figure 1: set-up for the different experiments. (a) experiment 1. (b) experiments 2 and 4. (c) experiment 3. (d) experiment 5. cs: corn stem; t1 and t2: pzt transducers for uae detection.

iii. results

i. measurements of spontaneous uae

in the first experiment (experiment 1), a transducer was attached to the bare stem on an internode with 5–7 developed leaves above it. uae were monitored in the dark (d), under room light (rl) and under incandescent lamp (il) illumination. the experiment was performed with several plants, changing the sequence of light conditions (d–rl–il; rl–d–il, etc.).

figure 2: examples of the detected uae related to cavitation events in corn stem. (a) type 1 broadband frequency emission signals detected at room light. (b) type 2 low frequency signals detected at room light. (c) type 1 signals detected with the laser impinging near the detector. (d) type 2 signals detected with the laser impinging far from the detector. (e) and (f) frequency spectra of type 1 and type 2 signals detected with laser. see the similarity of the signals produced with the laser and those detected at room light.
we registered no emissions in the dark. in experiments 2–3 h long, a rate of 1.15 ± 0.09 emissions min−1 was detected when the plant was transpiring under ambient light. this rate increased to 1.45 ± 0.15 emissions min−1 when transpiration was stimulated with an incandescent lamp. the change in the rate of emissions between rl and il took place less than a minute after turning the lamp on or off. when two transducers were attached to the bare stem at the same height but in different radial positions [fig. 1(b)], the rate of emission and the type of signals (see below) were the same for both detectors. in a second set of experiments performed under room light, each lasting ca. 2 h (experiment 2), two transducers were attached to the bare stem at different heights. both transducers registered uae, but the rate of emissions depended on the transducer position: it was higher near the leaves than closer to the plant base. for instance, when t1 was located at 14 cm from the plant base and t2 at 35–40 cm, no signals were detected by t1, while 0.3–3.5 emissions min−1 were detected by t2. the amplitude of the signals detected by each transducer was registered as a function of time, and the uae were classified by their form and main frequency. two types of signals were identified: those that have a broad band of frequencies up to 0.2 mhz, named type 1 [fig. 2(a)], and low frequency signals, with values below 0.075 mhz, named type 2 [fig. 2(b)].

ii. laser induced uae

with the aim of developing a method to induce uae in a controlled way, in the next series of experiments (experiment 3) we directed a laser beam at a point on a corn stem with a transducer attached on the opposite side [fig. 1(c)]. we started measuring uae in the dark, and without laser irradiation. under these conditions, no uae were detected. then, again in the dark, we irradiated the stem with the he–ne red laser, but even at its maximum power, no uae were detected. after that, the cw ar-ion laser was tested at different wavelengths and powers. we found that with powers up to 600 mw, only the blue line at 488 nm produced results. under these conditions, when the laser was turned on, acoustic signals were registered, and when it was turned off, the rate of emission decayed and disappeared after a few seconds (fig. 3). this sequence (switching the laser on and off, always impinging on the same point of the stem) was repeated with the same qualitative results, although the rate of uae decreased with every cycle (in fig. 3, compare the slope of the sequence starting at minute 40 with the one starting at minute 55). even though the rate of emissions in different plants encompassed a wide range (ca. 2–17 emissions min−1 with the laser on), the same pattern always held (i.e., emissions when the laser is on, and no emissions a few seconds after the laser is off). the same behavior was observed when the laser impinged at a right angle to the transducer axis.

figure 3: laser induced uae in corn stem. the beam of a cw ar ion laser (600 mw) at 488 nm impinges on the stem opposite to the transducer. grey line: the laser is on. black line: the laser is off.

figure 4: (a) spontaneous and (b) laser induced uae in corn stems measured simultaneously with two transducers attached at the same height of the stem. in (b) the laser impinged between both transducers. open triangles: transducer t1. open circles: transducer t2.
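a minimal sketch of how such a frequency-based classification of digitized uae bursts could be automated is given below (python with numpy; the 0.075 mhz cutoff follows the type 1 / type 2 description above, while the sampling rate and the spectral-energy criterion are our own illustrative assumptions, not the authors' processing):

```python
import numpy as np

def classify_uae(signal, sample_rate_hz=2e6, cutoff_hz=75e3):
    """classify a digitized uae burst as 'type 1' (broadband, sizeable content
    above the cutoff) or 'type 2' (energy mostly below ~0.075 mhz)."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal)))) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    high = spectrum[freqs >= cutoff_hz].sum()
    # illustrative criterion: call the burst broadband if a sizeable fraction
    # of its spectral energy lies above the cutoff frequency
    return "type 1" if high / spectrum.sum() > 0.25 else "type 2"

# hypothetical test: a 150 khz burst should come out as type 1,
# a 40 khz burst as type 2 (2 mhz sampling, 1 ms record)
t = np.arange(0, 1e-3, 1 / 2e6)
print(classify_uae(np.sin(2 * np.pi * 150e3 * t)))   # -> type 1
print(classify_uae(np.sin(2 * np.pi * 40e3 * t)))    # -> type 2
```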
the signals were classified according to their form and frequencies. figure 2(c) shows a typical signal generated by the laser in this experiment. as can be seen, these signals are similar to the broadband frequency signals [the type 1 shown in fig. 2(a)] measured in experiment 2 under room light. with the aim of comparing spontaneous and laser-induced uae, we attached two transducers to the bare stem on opposite sides, at the same height [experiment 4, fig. 1(b)]. we first registered acoustic emissions detected simultaneously by both transducers under room light, without laser irradiation [fig. 4(a)]. after that, in the dark, we measured the uae generated after directing the cw laser beam at a point between both detectors, in a direction perpendicular to their axes [fig. 4(b)]. the characteristic signals observed in both cases were type 1 signals (broadband frequency signals). then, we proceeded to study how the distance between detector and source modified the rate and shape of the uae (experiment 5). two transducers were attached to the bare stem: one at 8.5 cm (t1) and the other at 12.5 cm (t2) from the base. the cw laser beam impinged on different points of the stem. points a and b were at the same heights as t1 and t2, respectively, but on the opposite side; point c was between t1 and t2 [fig. 1(d)]. when the laser beam impinged on a, both transducers detected uae. broadband frequency signals (type 1) were observed with t1 and low frequency signals (type 2) were observed with t2. figure 2(d) shows an example of type 2 signals generated with the laser. as can be seen, these signals are similar to those detected under room light [fig. 2(b)]. besides, the number of emissions detected by t1 was higher than that detected by t2 [fig. 5(a)]. when the laser beam impinged on b, once again, both transducers detected uae. in this case, t2 detected type 1 signals while t1 detected type 2 signals, and the number of emissions detected by t2 was higher than that detected by t1 [fig. 5(b)]. when the laser beam impinged on c, both transducers simultaneously detected uae of low frequency, similar to those described as type 2 [fig. 5(c)].

figure 5: laser induced uae as a function of the transducer position. two pzt transducers were attached to the stem at two heights as shown in fig. 1(d). open triangles: transducer t1. open circles: transducer t2. (a) the laser beam impinged near t1. (b) the laser beam impinged near t2. (c) the laser beam impinged between t1 and t2. the arrow in (b) indicates the laser was off.

iv. discussion and conclusions

experiments 1 and 2 show that the spontaneous uae can be attributed to natural cavitation events occurring in the xylem vessels of the corn stem: no emissions were observed when the plant was in the dark. around 1 emission min−1 was detected under room light, and a rate ca. 25% higher under the lamp. this behavior is in agreement with the cohesion-tension theory and current plant cavitation models. as transpiration rate increases, xylem tension rises and cavitation events are expected to increase, as happens in our experiments. besides, the uae signals registered (fig. 2) were very similar to those described in [12]. these authors demonstrated that these kinds of emissions are strongly related to cavitation events [12, 13, 26]. in the transpiring plant, the tension developed in the water stream generates a metastable equilibrium.
when liquid water is subjected to a sufficiently low pressure, this equilibrium can be broken, and form a cavity. this initial stage of the cavitation phenomenon is termed cavitation inception. when the plant is in the dark, water in the xylem is slightly under tension at a pressure value close to atmospheric. under these conditions, the local pressure does not fall enough, compared to the saturated vapor pressure, to produce cavitation inception. as the cw laser impinges on the stem, this absorbs light and release energy to the xylem, heating it. this extra energy allows the phase change to gas in the water column, triggering cavitation inception. in this sense, the physical process of cavitation inception is similar to boiling, the major difference being the thermodynamic path which precedes the formation of the vapor. we found that uae generated using a cw laser are of the same kind of those registered on transpiring plants. we can conclude that this method allows, for the first time, to induce cavitation events in xylem in a controlled and reproducible way. regarding the mechanisms of uae generation, either natural or laser-induced, previous work has clearly shown that once cavitation inception is produced, embolism of the xylem immediately takes place. this means that the formed cavity remains, and there is no collapse of the void in the water column (as would occur in the so called inertial cavitation). then, uae generation can be produced by an oscillating source activated by the rupture of the water column. as mentioned in the introduction, tyree et al. [12] proposed four possible uae sources. the first one, oscillation of hydrogen bonds in water after tension release, seems unlikely because of its very low magnitude, undetectable by the kind of transducers we used. the second one, oscillations caused by a “snap back” of vessel walls, is also unlikely because of the rigidity of the xylem, and especially hard to explain under laser-induced cavitation inception in the dark, when xylem tension was nil or very small. the third one, torus aspiration, is impossible in our case because of the absence of these structures in corn. finally, the fourth one, structural failure in the sapwood, was elegantly rejected by tyree himself [13], who exposed xylem to pressure and detected a different kind of emission. we postulate that another possible source of the uae must be taken into account. it is the local oscillation of the liquid–gas interface of the water column produced by the expansion and compression of the formed cavity, i.e., the stress wave generated by rapid bonding energy release. during cavitation inception, after the cavity expands, it is expected to be compressed almost immediately by the water column. this “cavity oscillation” starts as a high frequency burst produced by the disruptive effect of the cavitation inception. as a consequence, ultrasonic acoustic signals are produced. in order for cavitation inception to occur, the cavitation “bubbles” generally need a surface on which they can nucleate. this could be provided by impurities in the liquid or the xylem walls, or by small undissolved micro-bubbles within the water, but most likely by air seeding through pit membranes [4]. these act as capillary valves that allow or prevent air seeding by adjusting local curvatures and interface positions [27]. air seeding induced by the heating at pit membranes under cw laser irradiation should also be taken into account as an initial stage in laser induced cavitation inception. 
the cw laser-induced cavitation opens the opportunity to study embolism in plants in a controlled manner. it also has the advantage of tracing the source, allowing the characterization of the signals and the study of their propagation. by directing the laser beam at one point in the stem and recording acoustic emissions at different distances, we found that when cavitation was produced near the transducer, broadband frequency emissions were registered. however, if the transducer was installed further away, the rate and frequency of the emissions decreased with the distance to the cavitation source. this means that during signal propagation, absorption by the tissue takes place (rate decay), as well as frequency filtering. figures 2(e) and 2(f) show the frequency spectra of type 1 and type 2 signals. when comparing these figures, the frequency filtering effect is evident. our results confirm the hypothesis of ritman and milburn [28], who proposed that cavitation of xylem sap generally results in the production of a broadband acoustic emission with a lower cutoff frequency determined by the dimensions of the resonating element. the larger a conduit dimension, the lower the frequency of its major resonance. thus, small cavitating elements, such as corn stem xylem, are expected to produce acoustic signals with a broadband frequency spectrum. our results can also explain the observations of tyree and dixon [12], who found and classified uae of different frequencies (between 0.1 and 1 mhz). according to our experiments, the different signals would be generated by cavitation events produced in different regions of the stem. broadband frequency signals would come from near the transducer, while low frequency signals would come from regions far from the transducer. according to these results, one might use the waveform of the emissions to determine the location of each cavitation event. in that case, a whole new field would be opened in the study of the hydraulic architecture of plants.

acknowledgements

the authors are indebted to dr. h. f. ranea sandoval of fce-uncpba, tandil, argentina, professor silvia e. braslavsky from the max-planck-institut für bioanorganische chemie, mülheim an der ruhr, germany, and dr. j. alvarado-gil from cinvestav-unidad mérida, mérida, mexico, for fruitful comments and suggestions. this work was partially supported by anpcyt, uncpba, uba and unlp. g.m.b. is a member of the carrera del investigador científico cic-ba, and r.j.f. of conicet.

[1] h lambers, f s chapin, t l pons, plant physiological ecology, springer verlag, new york (1998). [2] f caupin, e herbert, cavitation in water: a review, c. r. phys. 7, 1000 (2006). [3] u zimmermann, h schneider, l h wegner, a haase, water ascent in tall trees: does evolution of land plants rely on a highly metastable state? new phytol. 162, 575 (2004). [4] m t tyree, the cohesion-tension theory of sap ascent: current controversies, j. exp. bot. 48, 1753 (1997). [5] j s sperry, f r adler, g s campbell, j p comstock, limitation of plant water use by rhizosphere and xylem conductance: results from a model, plant cell environ. 21, 347 (1998). [6] p h maseda, r j fernández, stay wet or else: three ways in which plants can adjust hydraulically to their environment, j. exp. bot. 57, 3963 (2006). [7] h h dixon, transpiration and the ascent of sap in plants, mcmillan & co., new york (1914). [8] h n v temperley, the behaviour of water under hydrostatic tension: iii., p. phys. soc.
59, 199 (1947). [9] j a milburn, r p c johnson, the conduction of sap. ii. detection of vibrations produced by sap cavitation in ricinus xylem, planta 69, 43 (1966). [10] d s crombie, j a milburn, m f hipkins, maximum sustainable xylem sap tensions in rhododendron and other species, planta 163, 27 (1985). [11] v g williamson, j a milburn, cavitation events in cut stems kept in water: implications for cut flower senescence, sci. hortic. (amsterdam) 64, 219 (1995). [12] m t tyree, m a dixon, cavitation events in thuja occidentalis l.? ultrasonic acoustic emissions from the sapwood can be measured, plant physiol. 72, 1094 (1983). [13] m t tyree, m a dixon, r g thompson, ultrasonic acoustic emissions from the sapwood of thuja occidentalis measured inside a pressure bomb, plant physiol. 74, 1046 (1984). [14] m t tyree, e l fiscus, s d wullschleger, m a dixon, detection of xylem cavitation in corn under field conditions, plant physiol. 82, 597 (1986). [15] g m a lo, s salleo, three different methods for measuring xylem cavitation and embolism: a comparison, ann. bot. (london) 67, 417 (1991). 040003-7 papers in physics, vol. 4, art. 040003 (2012) / e. fernández et al. [16] g e jackson, j grace, field measurements of xylem cavitation: are acoustic emissions useful? j. exp. bot. 47, 1643 (1996). [17] s b kikuta, p hietz, h richter, vulnerability curves from conifer sapwood sections exposed over solutions with known water potentials, j. exp. bot. 54, 2149 (2003). [18] r laschimke, m burger, h vallen, acoustic emission analysis and experiments with physical model systems reveal a peculiar nature of the xylem tension, j. plant physiol. 163, 996 (2006). [19] w t pockman, j s sperry, j w o’leary, sustained and significant negative water pressure in xylem, nature 378, 715 (1995). [20] h cochard, g damour, c bodet, i tharwat, m poirier, t ameglio, evaluation of a new centrifuge technique for rapid generation of xylem vulnerability curves, physiol. plantarum 124, 410 (2005). [21] p kafalas, a p ferdinand jr., fog droplet vaporization and fragmentation by a 10.6 mm laser pulse, appl. optics 12, 29 (1973). [22] w hentschel, w lauterborn, acoustic emission of single laser-produced cavitation bubbles and their dynamic, appl. sci. res. 38, 225 (1982). [23] s i kudryashov, k lyon, s d allen, photoacoustic study of relaxation dynamics in multibubble systems in laser-superheated water, phys. rev. e 73, 055301 (2006). [24] r zhao, r q xu, z h shen, j lu, x w ni, experimental investigation of the collapse of laser-generated cavitation bubbles near a solid boundary, opt. laser technol. 39, 968 (2007). [25] a c tam, applications of photoacoustic sensing techniques, rev. mod. phys. 58, 381 (1986). [26] m t tyree, m a dixon, e l tyree, r johnson, ultrasonic acoustic emissions from the sapwood of cedar and hemlock: an examination of three hypotheses regarding cavitations, plant physiol. 75, 988 (1984). [27] a g meyra, v a kuz, g j zarragoicoechea, geometrical and physicochemical considerations of the pit membrane in relation to air seeding: the pit membrane as a capillary valve, tree physiol. 27, 1401 (2007). [28] k t ritman, j a milburn, acoustic emissions from plants. ultrasonic and audible compared, j. exp. bot. 39, 1237 (1988). 040003-8 papers in physics, vol. 11, art. 110004 (2019) received: 19 november 2018, accepted: 10 may 2019 edited by: a. goñi, a. cantarero, j. s. 
reparaz licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.110004 www.papersinphysics.org issn 1852-4249 structural correlations in cs2cucl4: pressure dependence of electronic structures e. jara,1 j. a. barreda-argüeso,1 j. gonzález,1 r. valiente,2 f. rodŕıguez1∗ we have investigated the crystal structure of cs2cucl4 in the 0-20 gpa range as a function of pressure and how pressure affects its electronic properties by means of optical absorption spectroscopy. in particular, we focused on the electronic properties in the low-pressure pnma phase, which are mainly related to the tetrahedral cucl2−4 units distorted by the jahn–teller effect. this study provides a complete characterization of the electronic structure of cs2cucl4 in the pmna phase as a function of the cell volume and the cu–cl bond length, rcu−cl. interestingly, the opposite shift of the charge-transfer band-gap and the cu2+ d-d crystal-field band shift with pressure are responsible for the strong piezochromism of cs2cucl4. we have also explored the high-pressure structure of cs2cucl4 above 4.9 gpa yielding structural transformations that are probably associated with a change of coordination around cu2+. since the high-pressure phase appears largely amorphized, any structural information from x-ray diffraction is ruled out. we use electronic probes to get structural information of the high-pressure phase. i. introduction cs2cucl4 (orthorhombic pnma at ambient pressure) is a wide-band-gap charge-transfer (ct) semiconductor (e g = 2.52 ev), which exhibits a puzzling optical behaviour under pressure, associated with the cu2+ absorption and its structural changes [1]. both cl−→cu2+ ct and d-d absorption bands undergo unusually large pressure shifts and intensity changes showing abrupt jumps at about 5 gpa. this crystal exhibits a yellow–orange color at ambient conditions and below 5 gpa, which is mainly defined by the tail of the ct band (band gap) placed around 450 nm [2]. the isolated ∗e-mail: fernando.rodriguez@unican.es 1 malta team, dcitimac, facultad de ciencias, universidad de cantabria, 39005 santander, spain. 2 nanomaterials group-idival, dpto. f́ısica aplicada, universidad de cantabria, 39005 santander, spain. cucl2−4 tetrahedra in the pnma phase show a flattened (d 2d) distortion by the jahn-teller (jt) effect, which is responsible for the low-lying ct band gap, and thus its yellow–orange color, in comparison to other transition-metal ion (m ) isomorphous compounds cs2mcl4 (m = co, zn) [3]. unlike cs2cocl4, the d-d bands of cu 2+ (3d9), which are split by the jt distortion, do not affect the color as they appear in the near-infrared range at 1110 and 1820 nm [2, 3]. thanks to the study of electronic and crystal structures under high-pressure conditions of this relatively highly compressible material (bulk modulus: k0=15.0(2) gpa) [4], we are able to establish structural correlations to understand: (i) the electronic properties of cu2+ in tetrahedral coordination in the less compressible oxides; and (ii) how a lattice of independent cucl2−4 units under compression evolves towards denser phases. the variation of the crystal structure of cs2cucl4 and cs2cocl4 under pressure has been previously 110004-1 papers in physics, vol. 11, art. 110004 (2019) / e. jara et al. investigated by x-ray diffraction (xrd) in the 0-5 gpa range, where both crystals are in the pnma crystallograpic phase. 
however, cs2cucl4 undergoes a structural phase transition just above 5 gpa yielding a deep color change from orange to black. the high pressure phase could not be identified by xrd due to amorphization [4]. in general, the optical properties of cu2+ chlorides like cs2cucl4 are strongly dependent on the crystal structure (polymorphism), particularly, the cu2+ coordination — symmetry and crystal-field strength— and the way cu2+ ions are coupled to each other, i.e. either as isolated units or as interconnected cu-cu links through cl− ligand sharing [5, 6]. therefore, the knowledge of how these links and crystal-field effects express in the optical spectra are essential to extract structural information from the electronic spectra at high-pressure conditions. an important goal is to establish correlations between structure and electronic properties [3]. in this work, we investigate the relationship between dihedral cl-cucl angle of the jt-distorted flattened tetrahedra and the cu2+ d -orbital splitting experimentally observed by optical absorption and its pressure dependence. these correlations will be used to analyze how the band gap energy and d-d bands vary with pressure in the cs2cucl4 pnma phase, and how they change after the structural phase transition above 5 gpa. ii. experimental single crystals of cs2cucl4 were grown by slow evaporation at 30oc from acidic (hcl) solution containing a 2:1 stoichiometric ratio of the cscl and cucl2.h2o . the pnma space group was checked by xrd on powder samples using a bruker d8 advance diffractometer. the measured cell parameters at ambient conditions were: a = 9.770 å, b = 7.617 å, c = 12.413 å. a boehler-almax plate diamond anvil cell (dac) was used for the high-pressure studies. 200 µm thickness inconel gaskets were pre-indented and suitable 200 µm diameter holes were perforated with a betsa motorized electrical discharge machine. given that cs2cucl4 is soluble in common pressure transmitting media like methanol-ethanolwater (16:4:1), spectroscopic paraffin oil (merck) was used as alternative pressure transmitting media. it must be noted, however, that according to the ruby line broadening non-hydrostatic effects were significant in the explored range, as previously reported [6]. the microcrystals used for optical absorption in the high-pressure experiments were extracted by cleavage from a cs2cucl4 single crystal. the crystal quality was checked by means of a polarizing microscope. the d-d spectra were obtained using powdered cs2cucl4 filling the gasket hole of the dac for obtaining suitable optical and infrared absorption spectra due to the high oscillator strength of these transitions. the experimental set-up for optical absorption measurements with a dac has been described elsewhere [8–11]. the spectra were obtained by means of an ocean optics usb 2000 and a nirquest 512 monochromators equipped with siand ingaas-ccd detectors for the vis and nir, respectively. a thermo nicolet continuµm ftir provided with a reflective-optic microscope was used in the ir range. pressure was calibrated from the ruby r-line luminescence shift. iii. results and discussion i. 
electronic structure, optical absorption spectra and piezochromism of cs2cucl4

the optical absorption spectrum of cs2cucl4 at ambient pressure in the pnma phase consists of two intense bands in the near infrared, associated with d-d electronic transitions within the cucl4^2− (d2d) units, and a ligand-to-metal ct absorption in the visible, which is responsible for the band gap and the concomitant yellow–orange color of this crystal (fig. 1). the d-d peaks can be assigned to tetrahedral crystal-field transitions of cucl4^2− (using both td and d2d irrep notation) [12]. within d2d, the two main absorption peaks correspond to spin-allowed d-d electric-dipole transitions from the 2b2 ground state to the 2e and 2a1 excited states, and are located at 0.55 and 1.3 ev, respectively. it must be noted that the first transition, 2b2 → 2e, is associated with the splitting of the parent-tetrahedral 2t2 state into 2b2 + 2e due to the jt distortion of d2d symmetry, with 2b2 being the electronic ground state corresponding to a flattened tetrahedron (inset of fig. 1). thus, the presence of this transition in the optical spectra constitutes the fingerprint of a jt distortion; in td the corresponding transition energy, i.e., the 2t2 splitting, would be zero apart from splitting contributions caused by the spin–orbit interaction.

figure 1: optical absorption spectrum of cs2cucl4 at ambient conditions. the blue line represents the fit of the experimental points to the sum of two gaussian profiles. the crystal-field bands correspond to crystal-field d-d transitions: 2b2 → 2e (0.55 ev) and 2b2 → 2a1 (1.30 ev) in d2d symmetry. the high-energy absorption threshold corresponds to the cl− → cu2+ ct band gap, which is eg = 2.52 ev.

as shown in fig. 1, the splitting of the 2b2 → 2a1 + 2b1 transitions (a single 2t2 → 2e transition in td) is not observed spectroscopically, as they appear as a single band in the absorption spectrum due to symmetry selection rules. actually, in d2d, there are only two allowed electric-dipole transitions from the 2b2 ground state: 2b2 → 2e (x,y-polarized) and 2b2 → 2a1 (z-polarized) [7], in agreement with experimental observations. figure 2 shows the peak energy variations of the d-d transitions as a function of pressure in both the pnma and high-pressure phases of cs2cucl4. their transition energies and corresponding pressure rates are given in fig. 2. interestingly, the first jt-related band, associated with the 2b2 → 2e transition, shows a large redshift with pressure at a rate of −73 mev/gpa, while the second one, associated with 2b2 → 2a1, shifts slightly towards higher energies (+7.5 mev/gpa). it must be noted that the transition energy variation of both bands, e(p), undergoes a change of slope at the structural phase transition at 4.9 gpa, thus being an adequate probe to explore phase transition phenomena. the ct direct band gap is also very sensitive to pressure. unlike the d-d bands, the pressure-induced ct redshift is responsible for the strong piezochromism of cs2cucl4, the color of which changes with pressure from yellow–orange to black, particularly at the structural transition to the high-pressure phase (fig. 3). cs2cucl4 is a ct semiconductor with a direct gap of 2.52 ev at ambient conditions, which redshifts with pressure at a rate of −20 mev/gpa.
this means that significant color changes are expected at pressures well above 5 gpa, as shown in fig. 3. the direct band gap, eg, was determined from the tail of the absorption threshold by plotting (hν · α)^2 against hν, with α being the absorption coefficient, once the absorption background was subtracted. eg was obtained from the intercept of this plot with α = 0. as fig. 3 shows, eg(p) experiences an abrupt jump of about 0.3 ev at the phase transition at 4.9 gpa. above this pressure, we observe a band structure with at least two noticeable absorption peaks at 0.43 and 1.43 ev, the pressure dependence of which is shown in fig. 2. the pnma phase is recovered on downstroke below about 3 gpa, thus having a hysteresis of 2 gpa at room temperature. it must also be noted that the difference between the transition pressure measured in a single crystal (5.4 gpa) and in powder (4.9 gpa) of cs2cucl4 must be associated with a lack of hydrostaticity in powdered samples, which reduces the transition pressure. however, the phase transition can be established at 4.9 gpa on upstroke, which corresponds to the initial observation of traces of the high-pressure phase within the pressure range of phase coexistence.

figure 2: (a) variation of the absorption spectrum of cs2cucl4 with pressure in the pnma and high-pressure phase (p > 4.9 gpa). (b) variation of the peak energy of the main absorption bands with pressure. pressure coefficients derived by fitting in the low-pressure pnma phase are included (fitted lines shown in the figure, in ev: 1.25 + 0.0075p, 0.66 − 0.073p, 0.81 − 0.068p and 0.51 + 0.178p, with p in gpa).

figure 3: (a) variation of the charge-transfer band gap with pressure in both the pnma phase and the high-pressure phase. the direct gap shifts toward lower energies with pressure (fits shown in the figure, in ev: 2.53 − 0.02p in the pnma phase and 2.29 − 0.03p in the high-pressure phase, with p in gpa). (b) images of a single crystal of cs2cucl4 in a dac at different pressures (0.1, 2.1, 5.4, 7.9, 11.5 and 18.2 gpa). note how the crystal varies from yellow to dark red and eventually to black upon increasing pressure. the piezochromism is associated with the redshift of the charge-transfer band gap with pressure. single crystal dimensions (ambient pressure): 80 × 80 × 22 µm^3.

ii. angular overlap model for cucl4^2−

the unusual pressure shifts of the two d-d bands (fig. 2) can be explained semi-quantitatively within ligand-field theory through the angular overlap model (aom) [13, 15–17]. the initial flattened-tetrahedron symmetry (d2d) of cucl4^2−, which splits the parent tetrahedral t2 and e orbitals into b2 + e and a1 + b1, respectively, will change upon cs2cucl4 compression. the corresponding splitting will change depending on how the relative variations of the td crystal-field strength and the jt-related dihedral cl–cu–cl angle evolve with pressure. according to crystal-field theory and experimental observations [1, 3], the crystal-field strength
in order to apply the aom to determine the d-d transition energies of cucl2−4 as a function of structural parameters, instead of γcl−cu−cl we will use the angle β = 1/2(γcl−cu−cl − 109.47o), which represents the deviation of the cl–cu–cl angle from its td value. then, β = 0 in td symmetry and β = 35.3 o in a square-planar d4h symmetry. for cucl 2− 4 in cs2cucl4, β = 8.5±0.5o at ambient conditions and β = 6.4±0.5o at 3.9 gpa. we will use the aom to simulate the transition energies as a function of r and β for explaining why the first band largely shifts to lower energy whereas the second one, more sensitive to the crystal-field strength, shifts slightly to higher energies with pressure. within the aom, the expressions to calculate the electronic energies in a mx2−4 system are given as a function of the aom parameters eσ,eπ,esd and epd and the x–m–x bond angle γ as shown in eq. (1) [16]. 110004-4 papers in physics, vol. 11, art. 110004 (2019) / e. jara et al. d2d d2dtd td β = 8.5º β = 0º β = 0º β = 6.4º pressure β d-levels 2t2 2e 10dq10dq0 0.0 0.5 1.0 1.5 2.0 2.5 0 5 10 15 20 25 30 35 40 pe ak e ne rg y, e (e v ) β (º) cucl4 2experimental cs2cucl4 p=0 gpa p=0.0 gpa p=3.9 gpa2b2 2a1 2e 2b1 2a1 2b1 2e experimental cs2cucl4 p=3.9 gpa 2e 2t2 2e 2b2 2b1 2a1 2e 2b2 2b1 2a1 (a) (b) figure 4: (a) calculated crystal-field energies for cucl2−4 as a function of the distortion angle (β) using aom. β = 0◦ corresponds to td (regular tetrahedron), and β = 32.5 ◦ to d4h cucl 2− 4 (square-planar). solid and dashed lines correspond to calculations at ambient pressure and 3.9 gpa, respectively. the spin-orbit coupling have been included in the calculations with λ= -829 cm−1 (see text for details). filled color symbols correspond to experimental data from the compound series providing different cl–cu–cl bond angles for cucl2−4 [15, 17]. empty circles correspond to present experimental data for cs2cucl4 at ambient pressure (orange) and 3.9 gpa (dark red). note that the trends of the variation is in agreement with structural data obtained by xrd [4]. (b) schematic diagram of the cu2+ d-orbital splitting in d2d and td symmetry for four different configurations corresponding to ambient pressure (left) and high-pressure conditions (3.9 gpa, right). in td only r is changed whereas both r and β are modified in d2d. ∆e(2b2 →2 e) = 3[sin4(γ/2) − 1/2 sin2(γ)]eσ + [sin2(γ) − 2 cos2(γ) − 2 cos2(γ/2)]eπ, ∆e(2b2 →2 b1) = 3 sin4(γ/2)eσ + [sin2(γ) − 4 sin2(γ/2)]eπ − 13.3 sin4(γ/2) cos2(γ/2)epd, (1) ∆e(2b2 →2 a1) = 3 sin4(γ/2)eσ − 4[cos2(γ/2) − 1/2 sin2(γ/2)]2eσ − 2 sin2(γ)eπ + 16[cos2(γ/2) − 1/2 sin2(γ/2)]2esd − 13.3 sin4(γ/2) cos2(γ/2)epd. these expressions are suitable to account for the transition energies in different mx2−4 systems having different dihedral angles [17]. this has been especially useful for explaining the variation of transition energies obtained from absorption spectra as a function of the dihedral angle in series of cu2+ chlorides, providing dihedral angles for cucl2−4 ranging between 127o and 180o or, equivalently, from β = 8.5o to β = 35.3o [13, 15–17]. the spectroscopic series of cucl2−4 can be explained using the following aom parameters: eσ =0.635 ev, eπ = 0.113 ev, esd = 0.114 ev and epd = −0.0025 ev [13, 16, 17]. figure 4 shows the energy of the d-d transitions of cucl2−4 as a function of β, where, additionally, we have included the spin-orbit interaction -λ~l.~s using λ = 0.103 ev [17]. 
the d2d states appear additionally split as 2b2(γ7); 2e(γ6+γ7); 2b1(γ7); and 2a1(γ6) following double group irrep notation (see fig. 4a) [7]. the pressure-induced energy shifts in cs2cucl4 have been simulated by scaling the aom parameters to structural data at 3.9 gpa on the assumption of a power law for the volume as (v0/v ) 5/3 using the equation of state of cs2cucl4 in the pnma phase [4]. so, we obtained the following aom parameters at 3.9 gpa: eσ = 0.78 ev, eπ = 0.139 ev, esd = 0.172 ev and epd = −0.0025 ev. although this may be a rough approximation for describing the variation of aom parameters with pressure/volume, the result of these simulations allows us to explain the band shifts with pressure (fig. 110004-5 papers in physics, vol. 11, art. 110004 (2019) / e. jara et al. 4). pressure-induced r (or v ) reduction increases the energy separation of the parent td orbitals, e and t2, by 10dq, while reduction of β decreases the t2 and e orbital splittings in d2d. as illustrated in fig. 4, both effects induce band shifts in the two d-d bands similar to those observed experimentally. therefore, these structural correlations, which are based on the energy shifts of the crystal-field bands in cs2cucl4, indicate that the main pressure effect on the jt-flattened cucl2−4 is reducing the cl–cu– cl bond angle from 8.5o at ambient pressure to 6.4o at 3.9 gpa, consistently with structural data [4]. therefore, these results support the adequacy of the d-d spectra to explore structural changes induced by pressure in transition-metal chlorides involving jt ions like cu2+. iv. conclusions electronic absorption spectra allow us to elucidate that the pressure dependence of the electronic structure of cs2cucl4 can be explained to a great extent on the basis of td cucl 2− 4 , the volume of which is roughly eight times more incompressible than cs2cucl4 bulk. the piezochromic phase transition at 4.9 gpa is mainly associated with the ct redshifts, particularly in the high-pressure phase well above 4.9 gpa. the new high-pressure phase, although it has not been identified yet, probably involves a change of coordination from cucl2−4 flattened tetrahedra to a structure consisting of ligand sharing cucl4−6 octahedra as suggested by its d-d transition energies. correlations between crystalfield bands and structure of cucl2−4 through the aom allow us to infer structural changes undergone by cucl2−4 induced by pressure on the basis of the crystal-field energy shifts. the measured shifts are consistent with a reduction of both the bond distance and the cl–cu–cl angle, i.e. reduction of the jt distortion, with pressure in agreement with previous xrd data. acknowledgements financial support from the spanish ministerio de economı́a, industria y competitividad (project ref. mat2015-69508p) and malta-consolider (ref. mat201571010redc). ej also thanks the spanish ministerio de ciencia, innovación y universidades for a fpi research grant (ref. no. bes-2016-077449). [1] h g drickamer, k l bray, pressure tuning spectroscopy as a diagnostic for pressureinduced rearrangements (piezochromism) of solid-state copper(ii) complexes, acc. chem. res. 23, 55 (1990). [2] j ferguson, electronic absorption spectrum and structure of cucl4=, j. chem. phys. 40, 3406 (1964). [3] l nataf, f aguado, i hernández, r valiente, j gonzález, m n sanz-ortiz, h wilhelm, a p jephcoat, f baudelet, f. rodŕıguez. 
volume and pressure dependences of the electronic, vibrational, and crystal structures of cs2cocl4: identification of a pressure-induced piezochromic phase at high pressure, phys. rev. b 95, 014110 (2017). [4] y xu, s carlson, k s oderberg, r norrestam, high-pressure studies of cs2cucl4 and cs2cocl4 by x-ray diffraction methods, j. sol. st. chem. 153, 212 (2000). [5] h b yi, f f xia, q zhou, d zeng, [cucl3]and [cucl4] 2hydrates in concentrated aqueous solution: a density functional theory and ab initio study, j. phys. chem. a 115, 4416 (2011). [6] r valiente, f rodŕıguez, effects of chemical pressure on the charge-transfer spectra of cux2−4 -complexes formed in cu 2+-doped a2mx4 (m= zn, mn, cd, hg; x= cl, br), j. phys.: condens. matter, 10, 9525 (1998). [7] r valiente, f rodŕıguez, thermochromic properties of the ferroelectric cu2+-doped [(ch3)4n]hgbr3: study of the temperatureinduced dichroism, j. phys.: condens. matter 11, 2595 (1999). [8] s klotz, j c chervin, p munsch, g le marchand, hydrostatic limits of 11 pressure transmitting media, j. phys. d: appl. phys. 42, 075413 (2009). [9] a moral, f rodŕıguez, new double beam spectrophotometer for microsamples. application 110004-6 papers in physics, vol. 11, art. 110004 (2019) / e. jara et al. to hydrostatic pressure experiments, rev. scient. inst. 66, 5178 (1995). [10] k syassen, ruby under pressure, high pressure res. 28, 75 (2008). [11] j a barreda-argüeso, f rodŕıguez, microscopio para la caracterización espectroscópica de una muestra, spanish patent no. es 2 461 015 a1 (2014). [12] j s griffith, the theory of transition-metal ions, cambridge univ. press (1980). [13] m atanasov, b delley, d reinen, a dft study of the energetical and structural landscape of the tetrahedral to square-planar conversion of tetrahalide complexes of copper(ii), z. anorg. allg. chem. 636, 1740 (2010). [14] f rodŕıguez, d hernández, j garćıa-jaca, h ehrenberg, h weitzel,optical study of the piezochromic transition in cumoo4 by pressure spectroscopy, phys. rev. b. 61, 16497 (2000). [15] m a hitchman, the influence of vibronic coupling on the spectroscopic properties and stereochemistry of simple 4and 6-coordinate copper(ii) complexes, comments inorg. chem. 15, 197 (1994). [16] r g mcdonald, m j riley, m a hitchman, angular overlap treatment of the variation of the intensities and energies of the dd transitions of the tetrachlorocuprate(2-) ion on distortion from a planar toward a tetrahedral geometry: interpretation of the electronic spectra of bis(n-benzylpiperazinium) tetrachlorocuprate(ii) bis(hydrochloride) and n-(2-ammonioethyl) morpholinium tetrachlorocuprate(ii), inorg. chem. 27, 894 (1988). [17] m atanasov, d ganyushin, k sivalingam, f neese, a modern first-principles view on ligand field theory through the eyes of correlated multireference wavefunctions, struct. bond 143, 149 (2012). 110004-7 papers in physics, vol. 5, art. 050010 (2013) received: 25 june 2013, accepted: 10 december 2013 edited by: r. dickman reviewed by: f. reis, instituto de f́ısica, univ. fed. fluminense, brazil. licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050010 www.papersinphysics.org issn 1852-4249 invited review: kpz. recent developments via a variational formulation horacio s. wio,1∗ roberto r. deza,2† carlos escudero,3‡ jorge a. revelli4§ recently, a variational approach has been introduced for the paradigmatic kardar–parisi– zhang (kpz) equation. 
here we review that approach, together with the functional taylor expansion that the kpz nonequilibrium potential (nep) admits. such expansion becomes naturally truncated at third order, giving rise to a nonlinear stochastic partial differential equation to be regarded as a gradient-flow counterpart to the kpz equation. a dynamic renormalization group analysis at one-loop order of this new mesoscopic model yields the kpz scaling relation α + z = 2, as a consequence of the exact cancelation of the different contributions to vertex renormalization. this result is quite remarkable, considering the lower degree of symmetry of this equation, which is in particular not galilean invariant. in addition, this scheme is exploited to inquire about the dynamical behavior of the kpz equation through a path-integral approach. each of these aspects offers novel points of view and sheds light on particular aspects of the dynamics of the kpz equation. i. introduction although readers whose careers span mostly on the 21th century might not care about this, back in the sixties (when transistors and lasers had already been invented) equilibrium critical phenomena were still a puzzle. in fact, although a sense of “universality” had been gained in 1950 through a field theory based on the innovative concept of order parameter [1, 2], its predicted critical exponents were ∗e-mail: wio@ifca.unican.es †e-mail: deza@mdp.edu.ar ‡e-mail: cel@icmat.es §e-mail: revelli@famaf.unc.edu.ar 1 ifca (uc-csic), avda. de los castros s/n, e-39005 santander, spain. 2 ifimar (unmdp-conicet), funes 3350, 7600 mar del plata, argentina. 3 depto. matemáticas & icmat (csic-uam-uc3mucm), cantoblanco, e-28049 madrid, spain. 4 famaf-ifeg (conicet-unc), 5000 córdoba, argentina. almost as a rule wrong. it was not until the seventies that a far more sophisticated field-theory approach [3] brought order home: equilibrium universality classes are determined solely by the dimensionalities of the order parameter and the ambient space. since then, one of statistical physics’ “holy grials” has been to conquer a similar achievement for non-equilibrium critical phenomena [4]. in such a (still unaccomplished) enterprise, a valuable field-theoretical tool has been in the last quarter of century the kardar–parisi–zhang (kpz) equation [5–7]. the kpz equation [5–7] has become a paradigm for the description of a vast class of nonequilibrium phenomena by means of stochastic fields. the field h(x,t), whose evolution is governed by this stochastic nonlinear partial differential equation, describes the height of a fluctuating interface in the context of surface-growth processes in which it was originally formulated. from a theoretical point of view, the kpz equation has many interesting properties, for instance, its close relationship with the burgers equation [8] or with a dif050010-1 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. fusion equation with multiplicative noise, whose field φ(x,t) can be interpreted as the restricted partition function of the directed polymer problem [9]. many of the efforts put in investigating the behavior of its solutions were focused on obtaining the scaling laws and critical exponents in one or more spatial dimensions [11–17]. 
however, other questions of great interest are the development of suitable algorithms for its numerical integration [18, 19], the construction of particular solutions [20–23], the crossover behavior between different regimes [10, 24–26], as well as related ageing and pinning phenomena [27–29]. among all the classical theoretical developments concerning this equation [6, 7], two have recently drawn our attention. one was the scaling relation α + z = 2, which is expected to be exact for the kpz equation in any dimension. the exactness of this relation has been traditionally attributed to the galilean invariance of the kpz equation. nevertheless, the assumed central role of this symmetry has been challenged in this as well as in other nonequilibrium models from both a theoretical [30–33] and a numerical [34–36] point of view. the second one is the generally accepted lack of existence of a suitable functional allowing to formulate the kpz equation as a gradient flow. in fact, a variational approach to the closely related sunguo-grant [37] and villain-lai-das sarma [38, 39] equations was developed in [40, 41] by means of a geometric construction. in [46], a lyapunov functional (with an explicit density) was found for the deterministic kpz equation. also, a nonequilibrium potential (nep), a functional that allows the formal writing of the kpz equation as a (stochastically forced) exact gradient flow, was introduced. in this work we shortly review the consistency constraints imposed by the nonequilibriumpotential structure on discrete representations of the kpz equation and show that they lead to explicit breakdown of galilean invariance, despite the fact that the obtained numerical results are still those of the kpz universality class. a taylor expansion of the previously introduced nep has (in terms of fluctuations ) an explicit density and a thought-provoking structure [47], and leads to an equation of motion (for fluctuations, in the continuum) with exact gradient-flow structure, but different from the kpz one. this equation has a lower degree of symmetry: it is neither galilean invariant nor even translation invariant. its scaling properties are studied by means of a dynamic renormalization group (drg) analysis, and its critical exponents fulfill at one-loop order the same scaling relation α + z = 2 as those of the kpz equation, despite the aforementioned lack of galilean invariance. the concern with stability leads us to suggest the introduction of an equation related to the kuramoto-sivashinsky one, also with exact gradient-flow structure. we close this article exposing some novel developments based on a pathintegral-like approach. ii. brief review of the nonequilibrium potential scheme loosely speaking, the notion of nep is an extension to nonequilibrium situations of that of equilibrium thermodynamic potential. in order to introduce it, we consider a general system of nonlinear stochastic equations (admitting the possibility of multiplicative noises ) q̇ν = kν(q) + gνi (q) ξi(t), ν = 1, . . . ,n; (1) where repeated indices are summed over. equation (1) is stated in the sense of itô. the {ξi(t)}, i = 1, . . . ,m ≤ n are mutually independent sources of gaussian white noise with typical strength γ. the fokker–planck equation corresponding to eq. (1) takes the form ∂p ∂t = − ∂ ∂qν kν(q) p + γ 2 ∂2 ∂qν ∂qµ qνµ(q) p (2) where p(q,t; γ) is the probability density of observing q = (q1, . . . 
,qn) at time t for noise intensity γ, and qνµ(q) = gνi (q) g µ i (q) is the matrix of transport coefficients of the system, which is symmetric and non-negative. in the long time limit (t → ∞), the solution of eq. (2) tends to the stationary distribution pst(q). according to [42–44], the nep φ(q) associated to eq. (2) is defined by φ(q) = − lim γ→0 γ ln pst(q,γ). (3) in other words, pst(q) d nq = z(q) exp [ − φ(q) γ + o(γ) ] dωq, 050010-2 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. where φ(q) is the nep of the system and the prefactor z(q) is defined as the limit ln z(q) = lim γ→0 [ ln pst(q,γ) + 1 γ φ(q) ] . here dωq = d nq/ √ g(q) is the invariant volume element in the q-space and g(q) is the determinant of the contravariant metric tensor (for the euclidean metric it is g = 1). it was shown [42] that φ(q) is the solution of a hamilton–jacobi-like equation (hje) kν(q) ∂φ ∂qν + 1 2 qνµ(q) ∂φ ∂qν ∂φ ∂qµ = 0, and z(q) is the solution of a linear first-order partial differential equation depending on φ(q) (not shown here). equation (3) and the normalization condition ensure that φ is bounded from below. furthermore, it follows that dφ(q) dt = kν(q) ∂φ(q) ∂qν = − 1 2 qνµ(q) ∂φ ∂qν ∂φ ∂qµ ≤ 0, i.e., φ is a lyapunov functional for the dynamics of the system when fluctuations are neglected. under the deterministic dynamics, q̇ν = kν(q), φ decreases monotonically and takes a minimum value on attractors. in particular, φ must be constant on all extended attractors (such as limit cycles or strange attractors) [42]. an alternative way to look into this problem is due to ao [45]. the interesting feature of this approach is that it resorts neither to pst(q) nor to the small-noise limit, thus being applicable in principle to more general situations. iii. variational approach for kpz the kardar–parisi–zhang (kpz) equation reads ∂h(x,t) ∂t = ν∇2h(x,t)+ λ 2 [∇h(x,t)]2 +ξ(x,t), (4) where ξ(x,t) is a gaussian white noise, of zero mean (〈ξ(x,t)〉 = 0) and correlation 〈ξ(x,t)ξ(x′, t′)〉 = 2γδ(x − x′)δ(t − t′). as it is well known, this nonlinear differential equation describes the fluctuations of a growing interface with a surface tension given by ν; λ is proportional to the average growth velocity and arises because the surface slope is paralleled transported in such a growth process. lyapunov functional the deterministic kpz equation—obtained by setting γ = 0—is exactly solvable by means of the hopf–cole transformation (φ(x,t) = e λ 2ν h(x,t), which maps the nonlinear kpz equation onto the (deterministic) linear diffusion equation [5] ∂φ(x,t) ∂t = ν∇2φ(x,t). (5) also, the multiplicative reaction-diffusion (rd) equation ∂φ(x,t) ∂t = ν∇2φ(x,t) + φ(x,t)ξ(x,t), (6) which is associated to the directed polymer problem [6,7,9] results, using the inverse transformation (h(x,t) = 2ν λ ln φ(x,t)) to be mapped into the complete kpz equation (4). the deterministic part of eq. (6) (i.e., eq. (5)), can be written as ∂φ(x,t) ∂t = − δf[φ(x,t)] δφ(x,t) , (7) where f[φ(x,t)] is the lyapunov functional of the deterministic rd problem given by f[φ(x,t)] = ν 2 ∫ [∇φ(x,t)]2 dx. applying to this functional the above indicated inverse transformation we get [46] f[h] = λ2 8ν ∫ e λ ν h(x,t) [∇h(x,t)]2 dx, (8) that allows the kpz equation to be written as ∂ ∂t h(x,t) = −γ[h] δf[h] δh(x,t) + ξ(x,t). (9) one can check the lyapunov property ḟ[h] ≤ 0, with the motility γ[h] given by γ[h] = ( 2ν λ )2 e− λ ν h(x,t), and that its minimum is achieved by constant functions. 
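both the hopf–cole mapping between eqs. (4) and (5) and the gradient-flow form (9) built on the functional (8) can be verified symbolically. the short sympy sketch below is such a check (a minimal verification script, not part of the original derivation): it confirms that the inverse transform of a diffusion solution solves the noiseless kpz equation, and that −γ[h] δf/δh reproduces the kpz drift in one dimension.

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x, t = sp.symbols('x t', real=True)
nu, lam, k = sp.symbols('nu lambda k', positive=True)

# 1) hopf-cole check: start from an exact, positive solution of the diffusion eq. (5)
phi = 1 + sp.Rational(1, 2) * sp.exp(-nu * k**2 * t) * sp.cos(k * x)
h_sol = (2 * nu / lam) * sp.log(phi)          # inverse transformation h = (2nu/lambda) ln(phi)
kpz_residual = (sp.diff(h_sol, t) - nu * sp.diff(h_sol, x, 2)
                - (lam / 2) * sp.diff(h_sol, x)**2)
print(sp.simplify(kpz_residual))              # -> 0: h_sol solves the noiseless kpz eq. (4)

# 2) gradient-flow check of eqs. (8)-(9): -gamma[h] * dF/dh must equal the kpz drift,
#    with the motility gamma[h] = (2nu/lambda)^2 exp(-lambda h / nu)
h = sp.Function('h')(x)
density = (lam**2 / (8 * nu)) * sp.exp(lam * h / nu) * sp.Derivative(h, x)**2
dF_dh = euler_equations(density, h, x)[0].lhs  # functional derivative of f[h] in 1d
drift = -(2 * nu / lam)**2 * sp.exp(-lam * h / nu) * dF_dh
target = nu * sp.Derivative(h, x, 2) + (lam / 2) * sp.Derivative(h, x)**2
print(sp.simplify(drift - target))             # -> 0
```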
hence we have a lyapunov functional for the deterministic kpz equation that displays simple dynamics: the asymptotic stability of constant 050010-3 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. solutions indicates an approach to constant profiles at long times, for arbitrary initial conditions. despite this simplicity in the deterministic case, the stochastic situation is far from trivial and gives rise to self-affine fractal profiles. in particular, the existence of this lyapunov functional provides no a priori intuition on the stochastic dynamics. the nonequilibrium potential an alternative functional was also proposed in [46]. by starting from the functional fokker-planck equation, we look for the stationary solution (in fact steady-state solution), and after some integration by parts, it is possible to arrive to another form of lyapunov functional φ[h] = ν 2 ∫ dx (∇h)2− λ 2 ∫ dx ∫ h(x,t) href dψ (∇ψ)2 . (10) it is somehow inspired in the analytical form of “model a”, according to the classification of critical phenomena in [48]. here, the interpretation of the integral in the 2nd term on the rhs is∫ dx ∫h(x,t) href dψ = ∑ j 4x ∫hj href,j dψj. according to this definition, the kpz equation can be formally written as a stochastically forced gradient flow ∂ ∂t h(x,t) = − δφ[h] δh(x,t) + ξ(x,t). (11) the functional so defined fulfills the lyapunov condition φ̇[h] = − ( δφ[h] δh(x,t) )2 ≤ 0 as well, and could be identified as the nonequilibrium potential (nep) for the kpz case [53, 54]. we will not pursue here the development of rigorous result concerning the functional (10). our present interest falls in the calculation of quantities of physical interest rather than in building a completely rigorous mathematical theory. it is worth remarking that, as indicated in [46] and above, such a form has a discrete definition. it is also interesting to point out that analogous functionals involving functional integrals which are not carried out explicitly were obtained for the problem of interface fluctuations in random media [49–52]. nep expansion we now proceed to formally taylor expand the nep defined in eq. (10) around a given reference (or initial) state, denoted by h0 φ[h] = ν 2 ∫ dx (∇h)2 − λ 2 ∫ dx ∫ h(x,t) h0 dψ (∇ψ)2 ≈ φ[h0] + δφ[h0] + 1 2 δ2φ[h0] + 1 6 δ3φ[h0] + · · · . (12) the successive terms in the expansion of φ[h] are δφ[h0] = − ∫ dx [ ν∇2h0 + λ 2 (∇h0) 2 ] δh, δ2φ[h0] = − ∫ dxδh ( ν∇2 + λ∇h0 ·∇ ) δh, δ3φ[h0] = −λ ∫ dxδh (∇δh)2 . (13) clearly, for higher order (n ≥ 4) terms we have δnφ[h0] ≡ 0, (14) indicating that this formal expansion has a natural cut-off after the third order. it is worth indicating that in this computation— as in all the other computations within this work— boundary terms vanish provided one of the following types of boundary conditions is assumed: homogeneous dirichlet boundary conditions, homogeneous neumann boundary conditions, periodic boundary conditions or an infinite space with the derivatives of δh vanishing as they approach an infinite distance from the origin. the reference state h0 is arbitrary (i.e., any initial condition), but it is particularly useful to take it as one that makes δφ[h0] = 0, that is: a solution to the stationary counterpart of the deterministic kpz equation. the complete set of solutions is h0 = c, where c is an arbitrary constant (arbitrary up to the application of the boundary conditions, whenever this consideration applies), what physically corresponds to a flat interface. 
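on a periodic one-dimensional lattice, the variations in eq. (13) are easy to transcribe, and the role of the flat reference state becomes explicit. the sketch below uses hypothetical parameters and an arbitrary fluctuation field: for h0 = constant the first variation vanishes, while the second and third variations are the only surviving contributions, in line with eq. (14).

```python
import numpy as np

# periodic 1-d grid and a sample fluctuation field (hypothetical values)
L, N = 2 * np.pi, 256
a = L / N
x = np.arange(N) * a
nu, lam = 1.0, 1.0
h0 = 0.7 * np.ones(N)                 # flat reference state h0 = c
dh = 0.1 * np.sin(3 * x) + 0.05 * np.cos(5 * x)

grad = lambda f: (np.roll(f, -1) - np.roll(f, 1)) / (2 * a)   # central difference
lap  = lambda f: (np.roll(f, -1) - 2 * f + np.roll(f, 1)) / a**2

# discrete transcription of the variations in eq. (13)
d1 = -np.sum((nu * lap(h0) + 0.5 * lam * grad(h0)**2) * dh) * a
d2 = -np.sum(dh * (nu * lap(dh) + lam * grad(h0) * grad(dh))) * a
d3 = -lam * np.sum(dh * grad(dh)**2) * a

print(d1)        # -> 0 for a flat reference state (first line of eq. (13))
print(d2, d3)    # second and third variations; all higher variations vanish, eq. (14)
```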
hence we have (δh = h−h0) φ[h] = φ[h0] + 1 2 δ2φ[h0,δh] + 1 6 δ3φ[h0,δh]. (15) 050010-4 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. the equation for fluctuations from here we can define an effective nep, which drives the dynamics of the fluctuations δh and has an explicit density. clearly, it corresponds to the last two terms in eq. (15). to simplify the notation we adopt u(x,t) := δh(x,t), and so the nep reads i[u] = ∫ dx [ ν 2 − λ 6 u(x,t) ] (∇u)2 (16) = − ∫ dxu(x,t) [ ν∇2u + λ 6 (∇u)2 ] . the deterministic equation for u results ∂u ∂t = − δi[u] δu , ∂u ∂t = ( ν − λu 3 ) ∇2u− λ 6 (∇u)2. (17) clearly, patterns like u0 = constant are stationary solutions of eq. (17): for all of them i[u] = 0, indicating that all such states have the same “energy”. finally, let us remark that although the formal taylor expansion becomes naturally truncated at third order, the deterministic kpz equation is not recovered. we call the stochastic version of this new equation “kpzw”. there is a remarkable difference between both equations (kpz and eq. (17)). it arises due to the fact that in the first case we have a fixed equation for h and for any initial condition, while in the second case we have a fixed initial condition (u = 0) with a variable equation whose coefficients depend on ho! (it is an equation for the departure from the given initial condition). the question of the relevance of this aspect to ageing problems (as discussed for instance in [29]) arises naturally. this point, worth to be analyzed, will be the subject of further work. non-local kernel in previous works [36, 46] it was indicated that the following functional, including a nonlocal contribution, f[h] = ∫ ω {( λ2 8ν ) (∇h)2 + e− λ 2ν h(x,t) × ∫ ω dx′g(x, x′)e λ 2ν h(x′,t) } e λ ν h(x,t)dx, (18) leads, after functional derivation, to a generalized kpz equation ∂th(x, t) = ν∇2h(x, t) + λ 2 [∇h(x, t)]2 −e− λ 2ν h(x,t) ∫ ω dx′g(x, x′)e λ 2ν h(x′,t) +ξ(x, t). (19) it was also shown that if the nonlocal kernel has translational invariance (g(x, x′) = g(x−x′)), and also, if it is of (very) “short” range, it can be expanded as g(x − x′) = ∞∑ n=0 a2nδ (2n)(x − x′), (20) with δ(n)(x−x′) = ∇nx′δ(x−x ′), and where symmetry properties were taken into account. exploiting this form of the kernel, and considering different approximation orders, it is possible to recover contributions having the same form as the ones arising in several previous works, where scaling properties, symmetry arguments, etc., have been used to discuss the possible contributions to a general form of the kinetic equation [55–57]. such different contributions are tightly related to several of other previously studied equations, like the sun–guo–grant equation [37], as well as others [55, 58]. we will not pursue this aspect here, but we will briefly refer again to it in a forthcoming section. iv. discretization issues, symmetry violation and all that in this section we will review aspects related to two main symmetries associated with the 1d kpz equation: galilean invariance and the fluctuation– dissipation relation. on the one hand, galilean invariance has been traditionally linked to the exactness of the relation α + z = 2 among the critical exponents, in any spatial dimensionality (the roughness exponent α, characterizing the surface morphology in the stationary regime, and the dynamic exponent z, indicating the correlation length scaling as ξ(t) ∼ t1/z). however, this interpretation has been criticized in this and other nonequilibrium models [31, 32, 59]. 
on the other hand, the second symmetry essentially tells us that in 1d, the nonlinear (kpz) term is not operative at long times. 050010-5 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. even when recognizing the interesting analytical properties of the kpz equation, it is clear that investigating the behavior of its solutions requires the (stochastic) numerical integration of a discrete version. such an approach has been used ,e.g., to obtain the critical exponents in one and more spatial dimensions [10–15,60]. although a pseudo-spectral spatial discretization scheme has been recently introduced [18, 61], real-space discrete versions of eq. (4) are still used for numerical simulations [62, 63]. one reason is their relative ease of implementation and of interpretation in the case of nonhomogeneous substrates, for example a quenched impurity distribution [64]. consistency here, we use the standard, nearest-neighbor discretization prescription as a benchmark to elucidate the constraints to be obeyed by any spatial discretization scheme, arising from the mapping between the kpz and the diffusion equation (with multiplicative noise) through the hopf–cole transformation. the standard spatially discrete version of eq. (6) is φ̇j = ν a2 (φj+1 − 2φj + φj−1) + λ √ γ 2ν φjξj, (21) with 1 ≤ j ≤ n ≡ 0, because of the assumed periodic b.c. (the implicit sum convention is not meant in any of the discrete expressions). here a is the lattice spacing. then, using the discrete version of hopf–cole transformation φj(t) = exp [ λ 2ν hj(t) ] , we get ḣj = 2ν2 λa2 ( eδ + j a + eδ − j a − 2 ) + √ γ ξj, (22) with δ±j ≡ λ 2νa (hj±1 − hj). by expanding the exponentials up to terms of order a2, and collecting equal powers of a (observe that the zero-order contribution vanishes) we retrieve ḣj = ν a2 (hj+1 − 2hj + hj−1) + λ 4 a2 [ (hj+1 −hj)2 + (hj −hj−1)2 ] + √ γ ξj. (23) as we can see, the first and second terms on the r.h.s. of eq. (23) are strictly related by virtue of the hopf-cole transformation. in other words, the discrete form of the laplacian in eq. (21) constrains the discrete form of the nonlinear term in the transformed equation. later we return, in another way, to the tight relation between the discretization of both terms. known proposals [60] fail to comply with this natural requirement. an important feature of the hopf–cole transformation is that it is local, i.e., it involves neither spatial nor temporal transformations. an effect of this feature is that the discrete form of the laplacian is the same, regardless of whether it is applied to φ or h. the aforementioned criterion dictates the following discrete form for f[φ] (the one just before eq. (8)), thus a lyapunov function for any finite n f[φ] = ν 2 n∑ j=1 a ( (∂xφ) 2 ) j = ν 4a n∑ j=1 [ (φj+1 −φj)2 + (φj −φj−1)2 ] . (24) it is a trivial task to verify that the laplacian is (∂2xφ)j = −a−1∂φjf[φ]. now, the obvious fact that this functional can also be written as f[φ] = ν 2 a ∑n j=1(φj+1 − φj) 2 illustrates a fact that for a more elaborate discretization requires explicit calculations: the laplacian does not uniquely determine the lyapunov function [34–36]. equation (22) has also been written in [14], although with different goals than ours. their interest was to analyze the strong coupling limit via mapping to the directed polymer problem. an accurate consistent discretization since the proposals of [60] already involve next-tonearest neighbors, one may seek for a prescription that minimizes the numerical error. 
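the consistency between eqs. (21)–(23) can be checked directly: the sketch below (hypothetical parameter values) evaluates, on a smooth test profile, the exact hopf–cole-transformed drift of eq. (22) and its expansion, eq. (23), and confirms that they differ only by the higher-order terms neglected in the expansion; the higher-accuracy prescription introduced next can be checked in the same way.

```python
import numpy as np

nu, lam, a = 1.0, 2.0, 0.05          # hypothetical parameters and lattice spacing
x = np.arange(0, 2 * np.pi, a)
h = 0.3 * np.sin(x)                  # a smooth periodic test profile

hp = np.roll(h, -1)                  # h_{j+1}
hm = np.roll(h, 1)                   # h_{j-1}

# exact drift of eq. (22): (2 nu^2 / lambda a^2) (e^{d+} + e^{d-} - 2),
# with d(+/-) = (lambda / 2 nu) (h_{j+/-1} - h_j)
dp = lam * (hp - h) / (2 * nu)
dm = lam * (hm - h) / (2 * nu)
drift_22 = 2 * nu**2 / (lam * a**2) * (np.exp(dp) + np.exp(dm) - 2)

# truncated drift of eq. (23): discrete laplacian plus the consistent nonlinear term
drift_23 = (nu / a**2 * (hp - 2 * h + hm)
            + lam / (4 * a**2) * ((hp - h)**2 + (h - hm)**2))

print(np.max(np.abs(drift_22 - drift_23)))   # small, and shrinks as a is reduced
```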
an interesting choice for the laplacian is [65] 1 12 a2 [16(φj+1 + φj−1) − (φj+2 + φj−2) − 30 φj] , (25) 050010-6 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. which has the associated discrete form for the kpz term (∂xφ) 2 = 1 24 a2 { 16 [ (φj+1 −φj)2 +(φj −φj−1)2 ] − [ (φj+2 −φj)2 + (φj −φj−2)2 ]} +o(a4). (26) replacing this into the first line of eq. (24), we obtain eq. (25). since this discretization scheme fulfills the consistency conditions, it is accurate up to o(a4) corrections, and its prescription is not more complex than other known proposals, we expect that it will be the convenient one to use when high accuracy is required in numerical schemes [34–36]. relation with the lyapunov functional in sect. iii we have indicated the form of the nep for kpz, and the way in which the functionals f[φ] and f[h] are related [46]. according to the previous results, we can write the discrete version of eq. (8) as f[h] = λ2 8ν 1 2 a ∑ j e λ ν hj [ (hj+1 −hj)2 +(hj −hj−1)2 ] . introducing this expression into ∂thj = γj δf[h] δhj , and through a simple algebra, we obtain eq. (23). this reinforces our previous result, and moreover indicates that the discrete variational formulation naturally leads to a consistent discretization of the kpz equation. the fluctuation–dissipation relation this relation is, together with galilean invariance, a fundamental symmetry of the one-dimensional kpz equation. it is clear that both symmetries are recovered when the continuum limit is taken in any reasonable discretization scheme. thus, an accurate enough partition must yield suitable results. the stationary probability distribution for the kpz problem in 1d is known to be [6, 7] pstat[h] ∼ exp { − ν 2 γ ∫ dx (∂xh) 2 } . for the discretization scheme in eq. (23), this is ∼ exp   ν2ε 12a ∑ j [ (hj+1 −hj)2 + (hj −hj−1)2 ] . (27) inserting this expression into the stationary fokker–planck equation, the only surviving term has the form 1 2a3 ∑ j [ (hj+1 −hj)2 + (hj −hj−1)2 ] × [hj+1 − 2hj + hj−1] . (28) the continuum limit of this term is ∫ dx (∂xh) 2 ∂2xh, that is identically zero [6, 7]. a numerical analysis of eq. (28) indicates that it is several orders of magnitude smaller than the value of the exponents’ pdf [in eq. (27)], and typically behaves as o(1/n), where n is the number of spatial points used in the discretization. moreover, it shows an even faster approach to zero if expressions with higher accuracy [like eqs. (25) and (26)] are used for the differential operators. in addition, when the discrete form of (∂xh) 2 from [60] is used together with its consistent form for the laplacian, the fluctuation– dissipation relation is not exactly fulfilled. this indicates that the problem with the fluctuation– dissipation theorem in 1 + 1, discussed in [18, 60] can be just circumvented by using more accurate expressions. galilean invariance this invariance means that the transformation x → x−λvt, h → h + vx, f → f − λ 2 v2, (29) where v is an arbitrary constant vector field, leaves the kpz equation invariant. the equation obtained using the classical discretization ∂xh → 1 2 a (hj+1 −hj−1), (30) is invariant under the discrete galilean transformation ja → ja−λvt, hj → hj + vja, f → f − λ 2 v2. (31) 050010-7 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. however, the associated equation is known to be numerically unstable [14], at least when a is not small enough. besides, eq. (23) is not invariant under the discrete galilean transformation. 
in fact, the transformation h → h + vja yields an excess term which is compatible with the gradient discretization in eq. (30); however, this discretization does not allow to recover the quadratic term in eq. (23), indicating that this finite-difference scheme is not galilean-invariant. since eq. (21) is invariant under the transformation indicated in eq. (31), it is the nonlinear hopf–cole transformation (within the present discrete context) which is responsible for the loss of galilean invariance. note that these results are independent of whether we consider this discretization scheme or a more accurate one. galilean invariance has always been associated with the exactness of the one-dimensional kpz exponents, and with a relation that connects the critical exponents in higher dimensions [68]. if the numerical solution obtained from a finite-difference scheme as eq. (23), which is not galilean invariant, yields the well known critical exponents, this will be an indicative that galilean invariance is not strictly necessary to get the kpz universality class. the numerical results presented in [34–36] clearly show that this is the case. we will not discuss here the simulation procedure but only indicate that to make the simulations we introduced a discrete representation of h(x,t) along the substrate direction x with lattice spacing a = 1, and that a standard second-order runge–kutta algorithm (with periodic boundary conditions) was employed (see [66]). in [34–36] it was shown that all the cases (consistent or not) exhibit the same critical exponents. moreover, we want to note that the discretization used in refs. [60], which also violates galilean invariance, yields the same critical exponents too. additionally, stochastic differential equations which are not explicitly galilean invariant have been shown to obey the relation α+z = 2 ([33], see also next section). hence, our numerical analysis indicates that there are discrete schemes of the kpz equation which, even not obeying galilean invariance, show kpz scaling. the moral from the present analysis is clear: due to the locality of the hopf–cole transformation, the discrete forms of the laplacian and the nonlinear (kpz) term cannot be chosen independently; moreover, the prescriptions should be the same, regardless of the fields they are applied to. for further details we refer to [34–36]. v. renormalization-group analysis for fluctuations in section iii we have built a gradient flow counterpart of the deterministic kpz equation. in this section we consider the corresponding stochastically forced gradient flow ∂tu = − δi δu + ξ(x,t), (32) with the density indicated in eq. (16). we obtain the kpzw equation, which is the following spde ∂tu = ν∇2u− λ 6 (∇u)2 − λ 3 u∇2u + ξ(x,t). (33) our present goal will be to analyze the scaling behavior of the fluctuations of the solution to this equation. since eq. (33) is nonlinear, we focus on a perturbative technique. we choose the dynamic renormalization group as employed in [67, 68]. employing this method, we find at one-loop order the following flow equations [47] dλ d` = λ(α + z − 2), (34) dν d` = ν ( z − 2 − 1 36 λ2d ν3 kd 1 −d d ) , (35) dγ d` = γ ( z −d− 2α + kd 72 λ2γ ν3 ) , (36) where kd = sd/(2π) d, sd = 2π d/2/γ(n/2) is the surface area of the d−dimensional unit sphere, and γ is the gamma function. 
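a quick symbolic check of eqs. (34)–(36): at a scale-invariant fixed point with λ ≠ 0 the three brackets must vanish simultaneously. writing the reduced coupling as g = k_d λ^2 γ/ν^3 (γ being the noise amplitude of eq. (36)), the sympy sketch below solves the resulting linear system for general d; it is a schematic re-derivation that reproduces the closed-form exponents quoted below and, automatically, the relation α + z = 2.

```python
import sympy as sp

alpha, z, g = sp.symbols('alpha z g', real=True)
d = sp.symbols('d', positive=True)

# fixed-point conditions: the brackets of eqs. (34)-(36) vanish, with the reduced
# coupling g = K_d * lambda^2 * (noise amplitude) / nu^3
eq_lambda = sp.Eq(alpha + z - 2, 0)                                   # eq. (34)
eq_nu     = sp.Eq(z - 2 - sp.Rational(1, 36) * g * (1 - d) / d, 0)    # eq. (35)
eq_noise  = sp.Eq(z - d - 2 * alpha + g / 72, 0)                      # eq. (36)

sol = sp.solve([eq_lambda, eq_nu, eq_noise], [alpha, z, g], dict=True)[0]
print(sp.factor(sol[alpha]), sp.factor(sol[z]))     # the exponents as functions of d
print(sp.simplify(sol[alpha] + sol[z]))             # -> 2, i.e. alpha + z = 2
for dim in (1, 2, 3):
    print(dim, sol[alpha].subs(d, dim), sol[z].subs(d, dim))
```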
we find that the coupling constant ḡ := kdλ 2γ/ν3 obeys the one-loop differential equation dḡ d` = (2 −d)ḡ + 6 − 5d 72d ḡ2, (37) revealing that the critical dimension of this model is dc = 2 as could be anticipated by means of power counting. for d > 2 the coupling constant approaches zero exponentially fast in the scale `; for d = 2, this approach is algebraic. so for these dimensions one expects the large-scale space-time 050010-8 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. properties of eq. (33) to be dominated by its linear counterpart (up to marginal corrections in d = 2). in d = 1, the coupling constant runs to infinity for finite `, suggesting the presence of a nonperturbative fixed point (as the one in the kpz equation for d = 2). the values of the critical exponents which yield scale invariance can be formally calculated by identifying with zero the right hand sides of eqs. (34)– (36). we get α = 2(2 −d)(1 −d) 6 − 5d , (38) z = 12 − 10d− 2(2 −d)(1 −d) 6 − 5d , (39) which in particular obey the relation α + z = 2 in any dimensionality, despite the fact that eq. (33) does not obey any sort of galilean invariance. we note that in both d = 1 and d = 2 it is α = 0 and z = 2, whereas α becomes negative in higher dimensions. hence, in all dimensions, the exponent α indicates that the interface is either flat or at most marginally rough. the values for d = 1 make both diffusion and nonlinearity in eq. (33) invariant under the scale transformation {x,t,u} → {bx,bzt,bαu}, as far as b > 1. in this case, the noise grows with the scale (a fact that might explain the growth of the coupling constant in the renormalization group flow). in d = 2, the exponents are those of the linear equation. an interesting result is that for d = 0, the exponents become those of the kpz equation: α = 2/3 and z = 4/3, although this limit is highly singular for eq. (37). of course, these results have been obtained by means of a perturbative dynamic renormalization group and could be modified by non-perturbative contributions. one possible path to study such a possibility could be to adapt some non perturbative renormalization group techniques used for kpz [69] to the present kpzw case. among all the results in this section, we would like to highlight the one given by eq. (34). we recall that the rg analysis of the kpz equation yields non-renormalization of the vertex and renormalization of propagator and noise. our variational equation yields exactly the same result. vertex non-renormalization at one-loop order is expressed by eq. (34). the origin of this result is analogous to that of its equivalent in the kpz equation: three non-vanishing feynman diagrams contribute to vertex renormalization, but they cancel out each other [5] (a fact that has been traditionally attributed to the galilean invariance of the kpz equation). here we have shown that the same result appears in a spde that is not even invariant under the translation u → u+constant. vi. stability we have carried out the nep expansion about a constant solution of the kpz equation and found that constants are still solutions to kpzw (eq. (17)). in this section we will study the linear stability of such solutions. we start considering the solution u(x,t) = c + �υ(x,t), (40) where c is an arbitrary constant and � is the small parameter. substituting in eq. (17), we find ∂tυ = 3ν −λc 3 ∇2υ, (41) at first order in �. so υ obeys a diffusion equation whose diffusion constant depends on c. 
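the c-dependence of the effective diffusion constant in eq. (41) is easily quantified. the sketch below (hypothetical parameter values) evaluates it for one stable and one unstable flat background, together with the resulting linear growth rate −d_eff k^2 of a fourier mode of wavenumber k.

```python
import numpy as np

nu, lam = 1.0, 1.0                       # hypothetical parameter values
k = np.arange(1, 6)                      # a few fourier modes of the perturbation

for c in (1.0, 6.0):                     # one stable and one unstable flat background
    d_eff = (3 * nu - lam * c) / 3       # effective diffusion constant, eq. (41)
    sigma = -d_eff * k**2                # linear growth rate of mode k
    print(f"c = {c}: d_eff = {d_eff:+.3f}, growth rates sigma(k):", np.round(sigma, 2))

# for d_eff > 0 every mode decays and the constant solution is linearly stable;
# for d_eff < 0 the growth rate increases without bound with k, the signature of a
# linearly ill-posed problem, as discussed next
```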
for c < 3ν/λ, the diffusion constant is positive and, correspondingly, the constant solution is linearly stable. for c > 3ν/λ, the diffusion constant is negative and, consequently, the constant solution is unstable; furthermore, in this case the problem becomes linearly ill posed. since for large values of c the problem becomes linearly ill posed, numerical solutions are not available. in order to overcome this drawback, we can include a higher-order term in the problem. we concentrate on the gradient flow

∂t u = − δj[u]/δu + ξ(x,t),   (42)

with density

j[u] = (ν/2) ∫ dx (∇u)^2 − (λ/6) ∫ dx u (∇u)^2 + (µ/2) ∫ dx (∇^2 u)^2,   (43)

leading to the following equation

∂t u = ν∇^2 u − (λ/6)(∇u)^2 − (λ/3) u∇^2 u − µ∇^4 u + ξ(x,t).   (44)

note that the deterministic counterpart of this fourth-order equation can be considered a variational version of the kuramoto–sivashinsky equation. it is worth remarking that this ad hoc construction resembles the one that, as indicated in [46], could more formally be obtained by considering the expansion of a nonlocal, short-range interaction. the regime of linear stability/instability of this equation is identical to that of eq. (33), but in this case the problem is always linearly well posed. furthermore, the term proportional to µ is presumably irrelevant on large spatiotemporal scales (as simple power counting of the linear terms reveals), so the results of the previous rg analysis could possibly hold for this case too. anyway, due to the presence of the deterministic instability, further analysis is needed in order to assure this (note that both linear terms in the equation are stabilizing and that this instability has its origin in the vertex structure).

vii. crossover: a path integral point of view

another recently discussed related aspect [70] is based on a path-integral monte carlo-like method for the numerical evaluation of the mean rugosity and other typical averages; this approach, which radically differs from one introduced before [71], exploits some of our previous results [34–36]. here we limit ourselves to quoting the temporally (µ) and spatially (j) discrete form of the "stochastic action"

s[h] = (1/2τ) Σ_{j,µ} {h_{j,µ+1} − h_{j,µ} − τ[α l_{j,µ+1} + (1 − α) l_{j,µ}]}^2 − 2ναnt − τα (λ/2) Σ_{j,µ} [h_{j+1,µ} − 2h_{j,µ} + h_{j−1,µ}],   (45)

and to briefly discussing the numerical results obtained. τ is the time step, 0 < α < 1 a time-discretization parameter meant to be fixed for explicit calculations [72, 73], and l_{j,µ} the "stochastic lagrangian"

l_{j,µ} = ν (h_{j+1,µ} − 2h_{j,µ} + h_{j−1,µ}) + (λ/4) [(h_{j+1,µ} − h_{j,µ})^2 + (h_{j,µ} − h_{j−1,µ})^2].   (46)

figure 1: crossover-like behavior from the ew to the kpz regime for λ = 1, on a lattice of 1028 sites (ν = d = 1). red solid line: kpz action; blue dash-dotted line: ew action; black dotted line: difference.

figure 2: same data as in the previous figure. solid line: time t* vs. λ; dashed line: trend t* ∼ λ^−1.35, included for comparison.

figure 1 shows the crossover-like behavior from the edwards–wilkinson (ew) regime to the kpz one. we take as estimator of such a transition the time at which the difference (dotted black line) between the kpz (red solid curve) and ew (blue dash-dotted line) actions crosses the ew one (it grossly coincides with the time at which the asymptotes cross).
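the two curves entering this estimator are just the action of eqs. (45)–(46) evaluated along one and the same space-time trajectory, once with the kpz nonlinearity (λ = 1) and once with λ = 0 (ew). the python sketch below sets up such a comparison with the α = 0 prescription, for which the α-dependent terms of eq. (45) drop out; the trajectory is generated here from the discrete kpz equation (23) with hypothetical parameters, so the numbers are only illustrative and do not reproduce the actual simulations.

```python
import numpy as np

rng = np.random.default_rng(0)
nu, lam, gamma = 1.0, 1.0, 1.0          # hypothetical parameters (lattice spacing a = 1)
N, NT, tau = 128, 400, 0.01             # lattice sites, time slices, time step

def lagrangian(h, lam):
    """discrete 'stochastic lagrangian' of eq. (46)."""
    hp, hm = np.roll(h, -1, axis=-1), np.roll(h, 1, axis=-1)
    return nu * (hp - 2 * h + hm) + 0.25 * lam * ((hp - h) ** 2 + (h - hm) ** 2)

# synthetic trajectory: euler-maruyama integration of the discrete kpz equation (23)
traj = np.zeros((NT + 1, N))
for m in range(NT):
    noise = np.sqrt(2 * gamma * tau) * rng.standard_normal(N)
    traj[m + 1] = traj[m] + tau * lagrangian(traj[m], lam) + noise

def action(traj, lam):
    """alpha = 0 form of eq. (45): (1/2 tau) * sum over j, mu of (dh - tau * l)^2."""
    dh = traj[1:] - traj[:-1]
    return np.sum((dh - tau * lagrangian(traj[:-1], lam)) ** 2) / (2 * tau)

print("kpz action:", action(traj, lam))
print("ew  action:", action(traj, 0.0))
print("difference:", action(traj, 0.0) - action(traj, lam))
```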
this estimator numerically agrees neither with the results in [10] (where a value of φ ∼ 4 was found) nor with the one in [24] (with φ ∼ 3, but corresponding to a 2d case). in fig. 2 we have plotted the dependence of this estimator on λ. for comparison, we have also included the trend for λ−φ with φ = 1.35 (dotted line). preliminary results for λ > 7 seem to indicate a marked change in the value of φ, maybe a hint that the system is entering a strong coupling region [71]. short-time propagator our aim here is to work out a variant of the method introduced in [70] by exploiting the first form of lyapunov functional found in [46], namely eq. (10), that leads us to eq. (11), the full kpz equation. whereas eq. (45) is valid whatever the value of τ, we now seek for a simpler expression valid for τ � 1. this idea parallels in some sense other studies in the literature [71], but here we exploit the functional f[h] of eq. (8). we denote as {h} = (h1,µ,h2,µ, . . . ,hj,µ, . . . ,hn,µ) the interface configuration at time µ. the transition pdf between patterns h0 at t0 and hf at tf can be written as p({hf}, tf|{h0}, t0) =∫ d[h] exp ( − 1 γ ∫ tf t0 l[h,ḣ] ) , (47) with l[h,ḣ] = 1 2 ∫ l 0 dx [( ∂th + γ[h] δ δh f[h] )2 +α δ δh ( γ[h] δ δh f[h] )] , (48) whose discrete form is given by eq. (46). a key observation is the (temporally and spatially) “diagonal” character of eq. (9), highlighted in its discrete version ḣj(t) = −γj δf δhj + √ γ ξj(t). (49) guided by eq. (46), we propose the following form of p({hf}, tf|{h0}, t0) for τ � 1, or short-time propagator (stp) p(hf,τ|h0, 0) =∫ hf h0 d[h]e [ − 1 2γ ∫ τ 0 ds ∫ l 0 dx(∂th+γ δfδh ) 2 ] ≈ exp { − τ 2γ ∫ l 0 dx[( hf −h0 τ + 1 2 [ γf δf δhf + γ0 δf δh0 ])2]} . (50) here, for simplicity, we have chosen a discretization with α = 0. as it is well known [72, 73], the jacobian of the transformation from the noise variable to the height variable depends on α. with this choice, the jacobian results equal to 1. incidentally, the form in eq. (50) coincides with the discretization used in [71] for determining the least-action trajectory. the “quasi-gaussian” character of this stp is better evidenced in the following approximate form p(hf, tf = τ|h0, t0 = 0) ∼ e[− 1 2γτ ∫ l 0 dx(hf−h0)2] × { 1 − 1 2γ ∫ l 0 dx [ (hf −h0) 1 2 ( γf δf δhf +γ0 δf δh0 ) + o(τ) ]} (51) where the exponential term has been separated out since it is of order τ−1, whereas the following two are of order τ0 and τ1, respectively (of lesser weight and negligible respectively, in the limit τ → 0). it is worth remarking that the term that could come from the jacobian is also of order τ1. it is easy to check that we can recover the known fpe from the proposed form of stp (adopting α = 0 for simplicity). we will not reiterate this calculation here. an immediate result of this form is that at very short times, behavior of the edwards– wilkinson type is obtained√ 〈h2〉≈ τ 1 2 . viii. conclusions herein, in addition to reviewing some recent results [34–36, 47, 70], we have furthered the study in [46], where it was shown that the deterministic kpz 050010-11 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. equation admits a lyapunov functional, and a (formal) definition of a nonequilibrium potential was introduced. we have carried out a taylor expansion of such a nonequilibrium potential, what led us to a different equation of motion than the kpz one, the kpzw which is an exact gradient flow and has an explicit density. 
in particular, it has a lower degree of symmetry: it is neither galilean invariant, nor even translational invariant. the critical exponents determining its scaling properties were obtained through a one-loop dynamic renormalization group analysis. these exponents fulfill the same scaling relation as the kpz equation, α + z = 2, traditionally attributed to the galilean invariance of the latter. the fact that the same scaling relation arises in a spde (i.e., the kpzw) that is not only non-galilean invariant but even non-invariant under the translation u → u+constant supports recent theoretical and numerical results indicating that galilean invariance does not necessarily play the relevant role previously assumed in defining the universality class of the kpz equation and different nonequilibrium models [30–36]. we have, moreover, analyzed the stability properties of the solutions to the present equation, finding the threshold condition for the appearance of diffusive instabilities, which indicates that in this case the problem becomes linearly ill posed. after considering the simplest way to correct such an illposed problem, we have met a kind of kuramoto– sivashinsky equation, resembling the one that, as indicated in [46], could be obtained by considering a nonlocal, short range interaction. this equation has an exact gradient flow structure with an explicit density. furthermore, when subject to stochastic forcing, its scaling properties could be formally described by the same critical exponents because the stabilizing term is irrelevant in the large scale from a dimensional analysis viewpoint. exploiting some elements of a path integral description of the problem, we have also shown what seems to be a simple form of viewing and studying the crossover from the ew to the kpz regimes. the present review-like study aims to open new points of view on, as well as alternative routes to study, the kpz problem. among the many aspects to be further studied, an interesting one is to test the (kind) of stability of the recently found exact solutions [20–23] by exploiting the indicated form of the nep. acknowledgements financial support from mineco (spain) is especially acknowledged, through projects pri-aibar-2011-1323 (which enabled international cooperation), fis2010-18023 (hsw and ce) and ryc-2011-09025 (ce). also acknowledged is the support from conicet, unc (jar) and unmdp (rrd) of argentina. collaboration with m.s. de la lama and e. korutcheva during different stages of this research is highly appreciated. [1] l d landau, e m lifshitz, statistical physics, butterworth heinemann, oxford, 1980. [2] l p kadanoff, w götze, d hamblen, r hecht, e a s leis, v v palciauskas, m rayl, j swift, d aspnes, j kane, static phenomena near critical points: theory and experiment, rev. mod. phys. 39, 395 (1967). [3] k g wilson, j kogut, the renormalization group and the � expansion, phys. rep. 12, 75 (1974); k g wilson, the renormalization group: critical phenomena and the kondo problem, rev. mod. phys. 47, 773 (1975). [4] j marro, r dickman, nonequilibrium phase transitions in lattice models, cambridge u. press, cambridge, uk (1999); m henkel, h hinrichsen, s lübeck, nonequilibrium phase transitions i, springer, berlin (2008); m henkel, m pleimling, nonequilibrium phase transitions ii, springer, berlin (2010); g ódor, universality in nonequilibrium lattice systems, world scientific, singapore (2008). [5] m kardar, g parisi, y-c zhang, dynamic scaling of growing interfaces, phys. rev. lett. 56, 889 (1986). 
[6] t halpin-healy, y-c zhang, kinetic roughening phenomena, stochastic growth, directed polymers and all that. aspects of multidisciplinary statistical mechanics, phys. rep. 254, 215 (1995). [7] a-l barabási, h e stanley, fractal concepts in surface growth, cambridge u. press, cambridge, uk (1995). [8] v gurarie, a migdal, instantons in the burgers equation, phys. rev. e 54, 4908 (1996). 050010-12 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. [9] m kardar, replica bethe ansatz studies of twodimensional interfaces with quenched random impurities, nucl. phys. b 290, 582 (1987). [10] b m forrest, r toral, crossover and finitesize effects in the (1+1)-dimensional kardar– parisi–zhang equation, j. stat. phys. 70, 703 (1993). [11] m beccaria, g curci, numerical simulation of the kardar–parisi–zhang equation, phys. rev. e 50, 4560 (1994). [12] k moser, d e wolf, vectorized and parallel simulations of the kardar–parisi–zhang equation in 3+ 1 dimensions, j. phys. a 27, 4049 (1994). [13] m scalerandi, p p delsanto, s biancotto, time evolution of growth phenomena in the kpz model, comput. phys. commun. 97, 195 (1996). [14] t j newman, a j bray, strong-coupling behaviour in discrete kardar–parisi–zhang equations, j. phys. a 29, 7917 (1996). [15] c appert, universality of the growth velocity distribution in 1+ 1 dimensional growth models, comput. phys. commun. 121-122, 363 (1999). [16] e marinari, a pagnani, g parisi, critical exponents of the kpz equation via multi-surface coding numerical simulations, j. phys. a 33, 8181 (2000). [17] t j oliveira, s g alves, s c ferreira, kardar– parisi–zhang universality class in (2+1) dimensions: universal geometry-dependent distributions and finite-time corrections, phys. rev. e 87, 040102 (2013). [18] l giada, a giacometti, m rossi, pseudospectral method for the kardar–parisi– zhang equation, phys. rev. e 65, 036134 (2002). [19] v g miranda, f d a aarão reis, numerical study of the kardar–parisi–zhang equation, phys. rev. e 77, 031134 (2008). [20] t sasamoto, h spohn, one-dimensional kardar–parisi–zhang equation: an exact solution and its universality, phys. rev. lett. 104, 230602 (2010). [21] t sasamoto, h spohn, the 1+1-dimensional kardar–parisi–zhang equation and its universality class, j. stat. mech. p11013 (2010). [22] g amir, i corwin, j quastel, probability distribution of the free energy of the continuum directed random polymer in 1+ 1 dimensions, commun. pure appl. math. 64, 466 (2011). [23] p calabrese, p le doussal, exact solution for the kardar–parisi–zhang equation with flat initial conditions, phys. rev. lett. 106, 250603 (2011). [24] h guo, b grossmann, m grant, crossover scaling in the dynamics of driven systems, phys. rev. a 41, 7082 (1990). [25] c m horowitz, e v albano, relationships between a microscopic parameter and the stochastic equations for interface’s evolution of two growth models, eur. phys. j. b 31, 563 (2003). [26] f d a aarão reis, scaling in the crossover from random to correlated growth, phys. rev. e 73, 021605 (2006). [27] s bustingorry, l cugliandolo, j l iguain, out-of-equilibrium relaxation of the edwardswilkinson elastic line, j. stat. mech. p09008 (2007); s bustingorry, aging dynamics of nonlinear elastic interfaces: the kardar–parisi– zhang equation, j. stat. mech. 10002 (2007). [28] s bustingorry, p ledoussal, a rosso, universal high-temperature regime of pinned elastic objects, phys. rev. 
b 82, 140201 (2010); s bustingorry, a b kolton, t giamarchi, random-manifold to random-periodic depinning of an elastic interface, phys. rev. b 82, 094202 (2010). [29] m henkel, j d noh, n pleimling, phenomenology of aging in the kardar–parisi–zhang equation, phys. rev. e 85, 030102 (2012). 050010-13 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. [30] w d mccomb, galilean invariance and vertex renormalization in turbulence theory, phys. rev. e 71, 037301 (2005). [31] a berera, d hochberg, gauge symmetry and slavnov-taylor identities for randomly stirred fluids, phys. rev. lett. 99, 254501 (2007). [32] a berera, d hochberg, gauge fixing, brs invariance and ward identities for randomly stirred flows, nucl. phys. b 814, 522 (2009). [33] m nicoli, r cuerno, m castro, unstable nonlocal interface dynamics, phys. rev. lett. 102, 256102 (2009). [34] h s wio, j a revelli, r r deza, c escudero, m s de la lama, kpz equation: galilean-invariance violation, consistency, and fluctuation-dissipation issues in real-space discretization, europhys. lett. 89, 40008 (2010). [35] h s wio, j a revelli, r r deza, c escudero, m s de la lama, discretization-related issues in the kardar–parisi–zhang equation: consistency, galilean-invariance violation, and fluctuation-dissipation relation, phys. rev. e 81, 066706 (2010). [36] h s wio, c escudero, j a revelli, r r deza, m s de la lama, recent developments on the kardar–parisi–zhang surface-growth equation, phil. trans. r. soc. a 369, 396 (2011). [37] t sun, h guo, m grant, dynamics of driven interfaces with a conservation law, phys. rev. a 40, 6763 (1989). [38] j villain, continuum models of crystal growth from atomic beams with and without desorption, j. phys. i (france) 1, 19 (1991). [39] z-w lai, s das sarma, kinetic growth with surface relaxation: continuum versus atomistic models, phys. rev. lett. 66, 2348 (1991). [40] c escudero, geometric principles of surface growth, phys. rev. lett. 101, 196102 (2008). [41] c escudero, e korutcheva, origins of scaling relations in nonequilibrium growth, j. phys. a: math. theor. 45, 125005 (2012). [42] r graham, weak noise limit and nonequilibrium potentials of dissipative dynamical systems, in: instabilities and nonequilibrium structures, eds. e tirapegui, d villaroel, d, reidel pub. co., dordrecht (1987). [43] h s wio, nonequilibrium potential in reactiondiffusion systems, in: 4th granada seminar in computational physics, eds. p garrido, j marro, pag. 135, springer-verlag, berlin (1997). [44] h s wio, r r deza, j m lópez, introduction to stochastic processes and nonequilibrium statistical physics, revised edition, world scientific, singapore (2013). [45] p ao, potential in stochastic differential equations: novel construction, j. phys. a 37, l25 (2004). [46] h s wio, variational formulation for the kpz and related kinetic equations, int. j. bif. chaos 19, 2813 (2009). [47] c escudero, e korutcheva, h s wio, r r deza, j a revelli, kpz equation as a gradient flow: nonequilibrium-potential expansion and renormalization-group treatment of fluctuations, unpublished. [48] p hohenberg, b halperin, theory of dynamic critical phenomena, rev. mod. phys. 49, 435 (1977). [49] g grinstein, s k ma, surface tension, roughening, and lower critical dimension in the random-field ising model, phys. rev. b 28, 2588 (1983). [50] j koplik, h levine, interface moving through a random background, phys. rev. b 32, 280 (1985). 
[51] r bruinsma, g aeppli, interface motion and nonequilibrium properties of the random-field ising model, phys. rev. lett. 52, 1547 (1984). [52] d kessler, h levine, y tu, interface fluctuations in random media, phys. rev. a 43, 4551 (1991). 050010-14 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. [53] h s wio, s bouzat, b von haeften, stochastic resonance in spatially extended systems: the role of far from equilibrium potentials, physica a 306, 140 (2002). [54] h s wio, r r deza, aspects of stochastic resonance in reaction-diffusion systems: the nonequilibrium-potential approach, eur. phys. j.-spec. top. 146, 111 (2007). [55] h g e hentschel, shift invariance and surface growth, j. phys. a: math. gen. 27, 2269 (1994). [56] s j linz, m raible, p hänggi, stochastic field equation for amorphous surface growth, in: stochastic processes in physics, chemistry, and biology, eds. j a freund, t pöschel, 557, pag. 473, springer, berlin (2000). [57] j m lópez, m castro, r gallego, scaling of local slopes, conservation laws, and anomalous roughening in surface growth, phys. rev. lett. 94, 166103 (2005). [58] m castro, j muñoz-garćıa, r cuerno, m m garćıa-hernández, l vázquez, generic equations for pattern formation in evolving interfaces, new j. phys. 9, 102 (2007). [59] e hernández-garćıa, t ala-nissila, m grant, interface roughening with a time-varying external driving force, europhys. lett. 21, 401 (1993). [60] c-h lam, f g shin, improved discretization of the kardar–parisi–zhang equation, phys. rev. e 58, 5592 (1998); c-h lam, f g shin, formation and dynamics of modules in a dual-tasking multilayer feed-forward neural network, phys. rev. e, 57, 6506 (1998). [61] r gallego, m castro, j m lópez, pseudospectral versus finite-difference schemes in the numerical integration of stochastic models of surface growth, phys. rev. e 76, 051121 (2007). [62] s m a tabei, a bahraminasab, a a masoudi, s s mousavi, m r r tabar, intermittency of height fluctuations in stationary state of the kardar–parisi–zhang equation with infinitesimal surface tension in 1+1 dimensions, phys. rev. e 70, 031101 (2004). [63] k ma, j jiang, c b yang, scaling behavior of roughness in the two-dimensional kardar– parisi–zhang growth, physica a 378, 194 (2007). [64] m s de la lama, j m lópez, j j ramasco, m a rodŕıguez, activity statistics of a forced elastic string in a disordered medium, j. stat. mech., p07009 (2009). [65] m abramowitz, i a stegun, handbook of mathematical functions: with formulas, graphs, and mathematical tables, pag. 884, dover, new tork (1965). [66] m san miguel, r toral, stochastic effects in physical systems, in: instabilities and nonequilibrium structures vi, eds. e tirapegui, j mart́ınez-mardones, r tiemann, pag. 35, kluwer academic publishers (2000). [67] d forster, d r nelson, m j stephen, largedistance and long-time properties of a randomly stirred fluid, phys. rev. a 16, 732 (1977). [68] e medina, t hwa, m kardar, y-c zhang, burgers equation with correlated noise: renormalization-group analysis and applications to directed polymers and interface growth, phys. rev. a 39, 3053 (1989). [69] l canet, h chate, b delamotte, general framework of the non-perturbative renormalization group for non-equilibrium steady states, j. phys. a 44, 495001 (2011); l canet, h chate, b delamotte, n wschebor, nonperturbative renormalization group for the kardar–parisi–zhang equation: general framework and first applications, phys. rev. 
e 84, 061128 (2011); th kloss, l canet, n wschebor, nonperturbative renormalization group for the stationary kardar–parisi–zhang equation: scaling functions and amplitude ratios in 1+1, 2+1, and 3+1 dimensions, phys. rev. e 86, 051124 (2012). [70] h s wio, r r deza, j a revelli, c escudero, a novel approach to the kpz dynamics, acta phys. pol. b 44 889 (2013). [71] h c fogedby, w ren, minimum action method for the kardar–parisi–zhang equation, phys. rev. e 80, 041116 (2009). 050010-15 papers in physics, vol. 5, art. 050010 (2013) / h. s. wio et al. [72] f langouche, d roekaerts, e tirapegui, functional integration and semiclassical expansions, d. reidel pub. co., dordrecht (1982). [73] h s wio, path integrals for stochastic processes: an introduction, world scientific, singapore (2013). 050010-16 papers in physics, vol. 8, art. 080008 (2016) received: 6 september 2016, accepted: 5 november 2016 edited by: j. p. paz licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.080008 www.papersinphysics.org issn 1852-4249 green’s functions technique for calculating the emission spectrum in a quantum dot-cavity system edgar a. gómez,1∗ j. d. hernández-rivero,2 herbert vinck-posada3 we introduce the green’s functions technique as an alternative theory to the quantum regression theorem formalism for calculating the two-time correlation functions in open quantum systems at the steady state. in order to investigate the potential of this theoretical approach, we consider a dissipative system composed of a single quantum dot inside a semiconductor cavity and the emission spectrum is computed due to the quantum dot as well as the cavity. we propose an algorithm based on the green’s functions technique for computing the emission spectrum that can easily be adapted to more complex open quantum systems. we found that the numerical results based on the green’s functions technique are in perfect agreement with the quantum regression theorem formalism. moreover, it allows overcoming the inherent theoretical difficulties associated with the direct application of the quantum regression theorem in open quantum systems. i. introduction the measurement and control of light produced by quantum systems have been the focus of interest of the cavity quantum electrodynamics [1, 2]. specially, the emission of light powered by solid-state devices coupled to nanocavities is an extensive area of research due to its promising technological applications, such as infrared and low-threshold lasers [3, 4], single and entangled photon sources [5, 6], as well as various applications in quantum cryptography [7] and quantum information theory [8]. experiments with semiconductor quantum dots (qds) ∗e-mail: eagomez@uniquindio.edu.co 1 universidad del quind́ıo, a.a. 2639, 630004 armenia, colombia. 2 departamento de f́ısica, universidade federal de minas gerais, a.a. 486, 31270-901 pampulha, belo horizonte, minas gerais, brazil. 3 universidad nacional de colombia, a.a. 055051, 111321 bogotá, colombia. embedded in microcavities have revealed a plethora of quantum effects and offer desirable properties for harnessing coherent quantum phenomena at the single photon level. for example, the purcell enhancement [9], photon antibunching [10], vacuum rabi splitting [11] and strong light matter coupling [12]. these and many other quantum phenomena are being confirmed experimentally by observing the power spectral density of the light (psd) emitted by the quantum-dot cavity systems (qdcavity). 
thus, the psd, or the so-called emission spectrum of the system, becomes the only relevant information that allows to study the properties of the light via measurements of correlation functions, as it is stated by the wiener-khintchine theorem [13]. in order to compute the emission spectrum of the qd-cavity systems in the framework of open quantum systems, different approaches have been elaborated from the theoretical point of view. for example, the method of thermodynamic green’s functions has been applied to the determination of the susceptibilities and absorption spectrum of 080008-1 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. atomic systems embedded in nanocavities [14], the time-resolved photo-luminescence approach whose application allows to determine the emission spectrum when an additional subsystem is considered, the so-called the photon reservoir [15]. these theoretical approaches are based on several approximations and therefore, they have their own limitations when they are considered in more general scenarios. in consequence, these methods are not used extensively. frequently, the emission spectrum of qd-cavity systems is computed through the quantum regression theorem (qrt) [16–18], since it relates the evolution of mean values of observables and the two-time correlation functions. it is worth mentioning that the qrt approach can be difficult to implement in a computer program because computational complexity increases significantly as the number of qds or modes inside the cavity are being considered; more precisely, the dimensionality associated with the hilbert space is large. in general, the qrt approach is time-consuming because it is required to solve a large system of coupled differential equations and numerical instabilities that can arise. moreover, theoretical complications related to dynamics of the operators involved can appear, as we will point out in the next section. in spite of this, the qrt approach is widely used in theoretical works, for example, in studies about photoluminescence spectra of coupled light-matter systems in microcavities in the presence of a continuous and incoherent pumping [19, 20]. also, in studies considering the relation between dynamical regimes and entanglement in qd-cavity systems [21, 22]. in the past, the green’s functions technique (gft) was successfully applied for calculating the emission spectrum for a very simple quantum system, e.g., the micro-maser [23]. nevertheless, this approach has not been widely noticed in many significant situations in open quantum systems. the purpose of this work is to provide a simple and efficient numerical method based on the gft in order to overcome the inherent difficulties associated with the direct application of the qrt approach by solving the dynamics of the system in the frequency domain directly. this paper is structured as follows: the theoretical background of the qrt as well as the gft are presented in section ii. a concrete application of our proposed methodology for calculating the emission spectrum of the qdcavity system is considered in section iii. moreover, for comparison purposes with the gft, we discuss in some detail the methodology of the qrt for calculating the emission spectrum of the cavity. the numerical results for the emission spectrum of the quantum dot, as well as of the cavity obtained from both the gft and the qrt, are shown in section iv. a discussion about our findings is summarized in section v. ii. theoretical background i. 
quantum regression theorem one of the most important measurements when the light excites resonantly a qd-cavity system is the emission spectrum of the system. from a theoretical point of view, it is assumed that it corresponds to a stationary and ergodic process which can be calculated as a psd of light by using the well-known wiener-khintchine theorem [13]. this theorem states that the emission spectrum is given by the fourier transform of the two-time correlation function of the operator field â; explicitly, it is s(ω) = re lim t→∞ ∫ ∞ 0 〈â†(t + τ)â(t)〉eiωτdτ. (1) in order to calculate the two-time correlation function involved in eq. (1), a theoretical approach based on the qrt is frequently considered. it states that if a set of operators {ôi(t + τ)} satisfy the dynamical equations d dτ 〈ôi(t + τ)〉 = ∑ j lij〈ôj(t + τ)〉 then d dτ 〈ôi(t + τ)ô(t)〉 =∑ j lij〈ôj(t+τ)ô(t)〉 is valid for any operator ô(t) at an arbitrary time t. here, lij represents the matrix of coefficients associated with the coupled linear equations of motion. it is worth mentioning that validity of this theorem holds whenever a closed set of operators is associated with the dynamics. in general, to obtain the closed set of operators can be difficult or an impossible task, since there must be added as many operators as necessary in order to close the dynamics of the system. for example, to calculate the emission spectrum in a model of qd-cavity system [20, 21], two new operators are required because the field operators in the interaction picture do not lead to a complete set. 080008-2 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. ii. green’s functions technique let us consider a qd-cavity system and an operator â which does not operate on the reservoirs, then its single-time expectation value in the heisenberg representation is given by 〈 ˆ̃a(t + τ)〉 = trs⊗r[ ˆ̃ a(t + τ) ˆ̃ρs⊗r(t)]. (2) the density operator system-reservoir can be evolved from an initial state at time 0 to an arbitrary time t via ˆ̃ρs⊗r(t) = û †(t, 0)ρ̂s⊗r(0)û(t, 0), with û(t, 0) being a unitary time-evolution operator involving the hamiltonian terms of the system and reservoirs. moreover, the operator ˆ̃ρs⊗r(t) = ˆ̃ρs(t) ⊗ ˆ̃ρr(t) depicts the composite density operator of the system and reservoir. it is worth pointing out that tilde means that the operator has been transformed to the heisenberg representation and that the dynamics of the system depend directly on ˆ̃ρs⊗r(t) for all times. the validity of the markovian approximation requires that the state of the system is sufficiently well described when it is considered that ˆ̃ρs(t) = trr( ˆ̃ρs⊗r(t)). therefore, it is sufficient to write ˆ̃ρs⊗r(t) = ˆ̃ρs(t) ⊗ ˆ̃ρr(t) for all times. if we assume that at t = 0 the initial state of the system is the steady state, then ˆ̃ρs⊗r(0) = ρ̂ (ss) s⊗r. here, the superscript ”(ss)” should be understood to be the steady state of the system-reservoir. after tracing over degrees of freedom of the reservoirs, we have that the eq. (2) takes the form 〈 ˆ̃a(τ)〉 = trs[ ˆ̃ a(0) ˆ̃ρs(τ)], (3) where the reduced density operator for the system is given by ˆ̃ρs(τ) = trr[û(τ, 0) ˆ̃ρs⊗r(0)û †(τ, 0)] and ˆ̃ a(0) = â. if ˆ̃ρs(τ) satisfies the lindblad master equation dˆ̃ρs(τ)/dτ = lˆ̃ρs(τ) with l the superoperator defined as lˆ̃ρs(τ) = −i[ĥs, ˆ̃ρs(τ)] +∑ j γj 2 (2x̂j ˆ̃ρs(τ)x̂ † j−x̂ † jx̂j ˆ̃ρs(τ)−ˆ̃ρs(τ)x̂ † jx̂j) for an operator x̂j, then the expectation value 〈 ˆ̃ a(τ)〉 can be computed by solving the dynamics associated to the lindblad master equation. 
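since the in-line rendering of these two ingredients is hard to read in the extracted text, they are restated here in display form (this is only a transcription of the relations just quoted, with the same operators x̂j and rates γj):

\[
\frac{d}{d\tau}\langle \hat{O}_i(t+\tau)\rangle=\sum_j L_{ij}\,\langle \hat{O}_j(t+\tau)\rangle
\;\;\Longrightarrow\;\;
\frac{d}{d\tau}\langle \hat{O}_i(t+\tau)\hat{O}(t)\rangle=\sum_j L_{ij}\,\langle \hat{O}_j(t+\tau)\hat{O}(t)\rangle,
\]

\[
\mathcal{L}\hat{\tilde{\rho}}_S(\tau)=-i\,[\hat{H}_S,\hat{\tilde{\rho}}_S(\tau)]
+\sum_j\frac{\gamma_j}{2}\Big(2\hat{X}_j\hat{\tilde{\rho}}_S(\tau)\hat{X}_j^{\dagger}
-\hat{X}_j^{\dagger}\hat{X}_j\hat{\tilde{\rho}}_S(\tau)-\hat{\tilde{\rho}}_S(\tau)\hat{X}_j^{\dagger}\hat{X}_j\Big).
\]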
it is worth mentioning that the hamiltonian operator ĥs describes the qd-cavity system and γj corresponds to the damping (pumping) rate associated to the operator x̂j. in order to calculate the two-time correlation function 〈 ˆ̃a(t + τ) ˆ̃b(t)〉 where ˆ̃a(t + τ) = û†(t + τ,t) ˆ̃ a(t)û(t + τ,t) and ˆ̃ b(t) = û†(t, 0)b̂û(t, 0) are arbitrary heisenberg operators which do not operate on the reservoirs, we proceed similarly to the case of the single-time expectation value, it is 〈 ˆ̃a(τ) ˆ̃b(0)〉 = trs⊗r[ ˆ̃ a(τ) ˆ̃ b(0) ˆ̃ρs⊗r(0)], = trs[ ˆ̃ a(0) ˆ̃ g(τ)], (4) where we have used the well-known properties of the unitary time-evolution operator and the fact that the system at time t = 0 is in the steady state. we have defined the operator ˆ̃ g(τ) = trr[û(τ, 0) ˆ̃ b(0) ˆ̃ρs⊗r(0)û †(τ, 0)] (5) where the trace operation is performed on the reservoirs only. by performing the time derivation of eq. (4), we have that d dτ 〈 ˆ̃a(τ) ˆ̃b(0)〉 = trs[ ˆ̃ a(0) d ˆ̃ g(τ) dτ ]. (6) where d ˆ̃ g(τ) dτ = d dτ trr[û(τ, 0) ˆ̃ b(0) ˆ̃ρs⊗r(0)û †(τ, 0)], = d dτ trr[ ˆ̃ b(τ) ˆ̃ρs⊗r(τ)], = trr[ ˆ̃ b(τ) dˆ̃ρs⊗r(τ) dτ + ˆ̃ρs⊗r(τ) d ˆ̃ b(τ) dτ ], = trr[ ˆ̃ b(τ) dˆ̃ρs⊗r(τ) dτ ], + trr[ d ˆ̃ b(τ) dτ ˆ̃ρs⊗r(τ)]. (7) notice that the last term vanishes since trr[ d ˆ̃ b(τ) dτ ˆ̃ρs⊗r(τ)] = d dτ trr[ ˆ̃ b(0) ˆ̃ρs⊗r(0)] is independent of time τ. thus, the eq. (7) can be reduced to the form d ˆ̃ g(τ) dτ = trr[ dˆ̃ρs⊗r(τ) dτ ˆ̃ b(τ)], = trr[lˆ̃ρs⊗r(τ) ˆ̃ b(τ)], = ltrr[ ˆ̃ρs⊗r(τ) ˆ̃ b(τ)], = ltrr[û(τ, 0) ˆ̃ b(0) ˆ̃ρs⊗r(0)û †(τ, 0)], = l ˆ̃g(τ) (8) 080008-3 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. where we have taken into account that the superoperator l acts only on the system hilbert space and not on the reservoir. it is straightforward to conclude that ˆ̃ g(τ) is an operator that obeys the same dynamical equations as ˆ̃ρs(τ). more precisely, d ˆ̃ g(τ)/dτ = l ˆ̃g(τ) with the boundary condition ˆ̃ g(0) = ˆ̃ b(0) ˆ̃ρs(0) at the steady state of the system. we also conclude that the two-time correlation function in the long-time limit can be written as lim t→∞ 〈 ˆ̃a(t + τ) ˆ̃b(t)〉 = trs[â ˆ̃ g(τ)], (9) where ˆ̃ g(τ) = trr[û(τ)b̂ρ̂ (ss) s⊗rû †(τ)] is defined as the green’s functions operator and the operators â, b̂ and ρ̂ (ss) s⊗r are in the schrödinger representation. â = ˆ̃ a(0) and b̂ = ˆ̃ b(0) are operators considered at the steady state of the system. in the remainder of the paper, we assume that û(τ) ≡ û(τ, 0). particularly, the eq. (9) takes the form of the eq. (1) after performing the integral transformation. more precisely, by taking the real part of the laplace transform to the eq. (9), we obtain an expression in terms of the green’s functions operator in the frequency domain as follows s(ω) = re trs[â ˆ̃ g(iω)]. (10) notice that the operators â and b̂ should be defined appropriately for describing the emission spectrum due to the cavity or the quantum dot. moreover, the wide tilde is used to indicate that the laplace transform was taken. the superscript ”(ss)” should be understood to be the steady state of the reduced density operator of the system. after taking the laplace transform of the eq. (9), we obtain an expression for the emission spectrum of the system in terms of the green’s functions operator in the frequency domain as follows s(ω) = 1 πnc retrs[â ˆ̃ g(iω)]. 
(11) prior to leaving this section, we mention that this result will be the starting point for calculating the emission spectrum due to the cavity as well as the quantum dot by considering the photon and fermionic operators in a separated way. iii. application to the qd-cavity system i. model in order to illustrate the potential of the green’s function technique for calculating the emission spectrum in a qd-cavity system, we will consider an open quantum system composed of a quantum dot interacting with a confined mode of the electromagnetic field inside a semiconductor cavity. this quantum system is well described by the jaynescummings hamiltonian [24] ĥs = ωxσ̂ †σ̂ + (ωx −∆)â†â + g(σ̂↠+ âσ̂†), (12) where the quantum dot is described as a fermionic system with only two possible states, e.g., |g〉 and |x〉 are the ground and excited state. σ̂ = |g〉〈x| and â (σ̂† = |x〉〈g| and â†) are the annihilation (creation) operators for the fermionic system and the cavity mode, respectively. the parameter g is the light-matter coupling constant. moreover, note that we have set h̄ = 1. we also define the detuning between frequencies of the quantum dot and the cavity mode as ∆ = ωx − ωa, where ωx and ωa are the energies associated to an exciton and the photons inside the cavity, respectively. this hamiltonian system is far from describing any real physical situation since it is completely integrable [25] and no measurements could be done since the light remains always inside the cavity. in order to incorporate the effects of the environment on the dynamics of the system, we consider the usual approach to model an open quantum system by considering a whole system-reservoir hamiltonian which is frequently split in three parts. namely, ĥ = ĥs + ĥsr + ĥr, where ĥs defines the hamiltonian term of the qd-cavity system as it is defined in the eq. (12). the hamiltonian terms ĥsr and ĥr corresponding to a bilinear coupling between the system-reservoir and its respective reservoirs ĥr have been discussed in detail by perea et al. in [26]. the reader can find a detailed discussion of the markovian master equation in [27, 28]. in the framework of open quantum systems, different reservoirs have been proposed in order to describe the dissipation, decoherence or decays. particularly, for qd-cavity systems, a reservoir is considered for describing the 080008-4 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. physical situation where the photons are absorbed in a semiconductor and electron-hole pairs (excitons) can be produced which can be associated to either electrical injection or the capture of excitons optically created at frequencies larger than the typical ones of our system. this process corresponds to the so-called continuous and incoherent pumping of the qd. also, when the excitons are coupled to the leaky modes of the cavity with energy different than the cavity mode, there is a residual density of states inside the cavity and this process is responsible for the spontaneous emission (radiative recombination) to an independent reservoir of photons. another physical process is known as the coherent emission and it is due to the direct dissipation of the cavity mode, more precisely, the cavity mode is coupled to the continuum of photonic modes out of the cavity. for obtaining the master equation for the qd-cavity system, it is convenient to consider the interaction picture with respect to ĥs + ĥsr and assume the validity of the born-markov approximation. 
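as a minimal numerical illustration of the model (a sketch, not the authors' code: the photon-space truncation nmax and the use of plain numpy matrices are assumptions made here), the hamiltonian of eq. (12) can be assembled in the product basis {|g>, |x>} ⊗ {|n>} with kronecker products, and its hermiticity and the conservation of the excitation number n̂ = â†â + σ̂†σ̂ used later in the text can be checked directly:

import numpy as np

# illustrative truncation of the photon Fock space (an assumption, not a value from the paper)
nmax = 10
g, delta, wa = 1.0, 2.0, 1000.0        # meV, from the parameter set quoted in Sec. IV
wx = wa + delta                        # since delta = wx - wa

a_ph = np.diag(np.sqrt(np.arange(1, nmax)), k=1)     # photon annihilation operator
sm   = np.array([[0.0, 1.0], [0.0, 0.0]])            # sigma = |g><x| in the basis {|g>, |x>}

a     = np.kron(np.eye(2), a_ph)                     # operators promoted to the product space
sigma = np.kron(sm, np.eye(nmax))

# eq. (12): H = wx s+ s- + (wx - delta) a+ a + g (s a+ + a s+)
H = (wx * sigma.conj().T @ sigma
     + (wx - delta) * a.conj().T @ a
     + g * (sigma @ a.conj().T + a @ sigma.conj().T))

N = a.conj().T @ a + sigma.conj().T @ sigma          # excitation-number operator
print(np.allclose(H, H.conj().T))                    # hermiticity -> True
print(np.allclose(H @ N - N @ H, 0.0))               # [H, N] = 0  -> True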
after tracing out the degrees of freedom of all the reservoirs, one arrives to the lindblad master equation for the reduced density matrix of the system dˆ̃ρs dτ = −i [ ĥs, ˆ̃ρs ] + κ 2 (2âˆ̃ρsâ † − â†âˆ̃ρs − ˆ̃ρsâ†â) + γ 2 (2σ̂ ˆ̃ρsσ̂ † − σ̂†σ̂ ˆ̃ρs − ˆ̃ρsσ̂†σ̂) + p 2 (2σ̂† ˆ̃ρsσ̂ − σ̂σ̂† ˆ̃ρs − ˆ̃ρsσ̂σ̂†). (13) the parameter γ is the decay rate due to the spontaneous emission, κ is the decay rate of the cavity photons across the cavity mirrors, and p is the rate at which the excitons are being pumped. figure 1 shows a scheme of the simplified model of the qdcavity system showing the processes of continuous pumping p and cavity loses κ. the physical process begins when the light from the pumping laser enters into the cavity and excites one of the quantum dots in the qd layer. thus, light from this source couples to the cavity and a fraction of photons escapes through the partly transparent mirror from the cavity and goes to the spectrometer for measurements of the emission spectrum. a general approach for solving the dynamics of the system consists in writing the corresponding bloch equations for the reduced density matrix of the system in the bared basis. it is an extended hilbert space formed by taking the tensor product of the state vectors for each of the system components, {|g〉, |x〉}⊗ {|n〉}∞n=0. in this basis, the reduced density matrix ρ̂s can be written in terms of its matrix elements as ρ̃sαn,βm ≡〈αn|ˆ̃ρs(τ)|βm〉. hence, the eq. (13) explicitly reads dρ̃sαn,βm dτ = i [ (ωx − ∆)(m−n)ρ̃sαn,βm + ωx(δβxρ̃sαn,xm − δαxρ̃sxn,βm) ] + ig [(√ m + 1δβxρ̃sαn,gm+1 + √ mδβgρ̃sαn,xm−1 ) − (√ nδαgρ̃sxn−1,βm + √ n + 1δαxρ̃sgn+1,βm )] + κ 2 ( 2 √ (m + 1)(n + 1)ρ̃sαn+1,βm+1 − (n + m)ρ̃sαn,βm ) − γ 2 ( δαxρ̃sxn,βm − 2δαgδβgρ̃sxn,xm + δβxρ̃sαn,xm ) + p 2 ( 2δαxδβxρ̃sgn,gm − δαgρ̃sgn,βm − δβgρ̃sαn,gm ) . (14) note that we use the convention that all indices written in greek alphabet are used for the fermionic states and take only two possible values |g〉, |x〉. the indices written in latin alphabet are used for the fock states which take the possible values 0, 1, 2, 3 . . . additionally, it is worth mentioning that our proposed method does not require to solve a system of coupled differential equations, instead of it, we solve a reduced set of algebraic equations that speed up the numerical solution. prior to leaving this section, we point out that the number of excitations of the system is defined by the operator n̂ = â†â + σ̂†σ̂. the closed system and the number of excitations of the system is conserved, i.e., [ĥs,n̂] = 0. it allows us to organize the states of the system through the number of excitations criterion such that the density matrix elements ρ̃gn,gn, ρ̃xn−1,xn−1, ρ̃gn,xn−1 and ρ̃xn−1,gn are related by having the same number of 080008-5 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. figure 1: the picture represents a qd-cavity system showing the processes of continuous pumping p and cavity loses κ. figure 2: ladder of bared states for a two-level quantum dot coupled to a single cavity mode. the double headed green arrow depicts the matter coupling constant g, dashed red lines the emission of the cavity mode κ, solid black lines the exciton pumping rate p and solid blue lines the spontaneous emission rate γ. quanta; sub-spaces of a fixed number of excitation evolve independently from each other. the fig. 2 shows a schematic representation of the action of the dissipative processes involved in the dynamics of the system according to the excitation number (nexc). ii. 
emission spectrum of the cavity based on the gft in order to compute the emission spectrum of the cavity, we will consider the two-time correlation function accordingly with the eq. (9) for the field operator as follows lim t→∞ 〈ˆ̃a†(t + τ)ˆ̃a(t)〉 = trs[↠ˆ̃ g(τ)]. (15) where we have considered that the field operator is given by ˆ̃a(0) = â at the steady state. after performing the partial trace over the degrees of freedom of the system, we have that trs[â † ˆ̃g(τ)] = ∑ α,β,γ,l,m,n √ (l + 1)(m + 1) × trr[uαl,βm(τ) × 〈βm + 1|ρ̂(ss)s⊗r|γn〉u † γn,αl+1(τ)], (16) where the matrix elements for the time evolution operator are given by uαl,βm(τ) = 〈αl|û(τ)|βm〉 and u † γn,αl+1(τ) = 〈γn|û †(τ)|αl + 1〉. in what follows, we assume the validity of the markovian approximation, it means that the correlations between the system and the reservoir must be unimportant even at the steady state. thus, the density operator system-reservoir can be written as ρ̂ (ss) s⊗r = ρ̂ (ss) s ⊗ ρ̂ (ss) r which implies that 〈βm + 1|ρ̂(ss)s⊗r|γn〉 = ρ̂ (ss) r 〈βm + 1|ρ̂ (ss) s |γn〉. (17) replacing the previous expression in eq. (16), it is straightforward to show that the two-time correlation function reads trs[â † ˆ̃g(τ)] = ∑ αl √ l + 1〈αl| ˆ̃g(τ)|αl + 1〉, (18) where the green’s functions operator ˆ̃ g(τ) is given by ˆ̃ g(τ) = trr [ û(τ)ρ̂ (ss) r ∑ βγmn (√ m + 1 |βm〉〈γn| × 〈βm + 1| ρ̂(ss)s |γn〉 ) û†(τ) ] . (19) 080008-6 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. as we pointed out in section ii, this operator must obey the same master equation as the reduced density operator of the system. in fact, the terms that only contribute in the eq. (18) are given by the matrix elements g̃βm,γn(τ) ≡ 〈βm| ˆ̃ g(τ) |γn〉 of the green’s functions operator. this is due to the fact that the projection operator |βm〉〈γn| enters into ˆ̃ g(τ) in the same way as into the reduced density operator of the system. in order to identify these matrix elements, it should be considered that for the qd-cavity system, the dynamics of all coherences asymptotically vanish and there only remains the reduced density matrix elements which are ruled by the number of excitations criterion, i.e., ρgn,gn, ρxn−1,xn−1, ρgn,xn−1, ρxn−1,gn. thus, the eq. (17) can be written as follows 〈βm + 1|ρ̂(ss)s⊗r|γn〉 = ρ̂ (ss) r ( δβgδγgδm+1,n + δβxδγxδm,n−1 + δβgδγxδm,n + δβxδγgδm+1,n−1 ) × ρ(ss)sβm+1,γn. (20) by replacing the eq. (20) into eq. (19), we find that the green’s functions operator explicitly reads ˆ̃ g(τ) = trr [ û(τ)ρ̂ (ss) r ∑ m √ m + 1 × ( |gm〉〈gm + 1|ρ(ss)sgm+1,gm+1 + |xm〉〈xm + 1|ρ(ss)sxm+1,xm+1 + |gm〉〈xm|ρ(ss)sgm+1,xm + |xm〉〈gm + 2| × ρ(ss)sxm+1,gm+2 ) û†(τ) ] . (21) note that from this expression, it is easy to identify the nonzero matrix elements of the green’s functions operator that contribute to the emission spectrum. finally, after performing the laplace transform to the eq. (18), we have that the emission spectrum of the cavity is given by s(ω) = re ∑ l √ l + 1 ( g̃gl,gl+1(iω) + g̃xl,xl+1(iω) + g̃gl,xl(iω) + g̃xl,gl+2(iω) ) . (22) it is worth mentioning that the initial conditions may be obtained by evaluating the green’s function operator at τ = 0. moreover, by using the fact that the time evolution operators become the identity and trr[ρ̂ (ss) r ] = 1. we obtain a set of initial conditions given by g̃gl,gl+1(0) = √ l + 1ρ (ss) sgl+1,gl+1, g̃xl,xl+1(0) = √ l + 1ρ (ss) sxl+1,xl+1, g̃gl,xl(0) = √ l + 1ρ (ss) sgl+1,xl, g̃xl,gl+2(0) = √ l + 1ρ (ss) sxl+1,gl+2. 
(23) note that this set of initial conditions corresponds to the steady state of the reduced density matrix of the system. a general algorithm based on the gft for computing the emission spectrum is presented in the appendix. we mention that this approach can be adapted easily for calculating the emission spectrum due to the cavity as well as the quantum dot. iii. emission spectrum of the quantum dot based on the gft in order to compute the emission spectrum of the quantum dot, we will consider the two-time correlation function given by eq. (9), but for the case of the matter operator lim t→∞ 〈ˆ̃σ†(t + τ)ˆ̃σ(t)〉 = trs[σ̂† ˆ̃ g(τ)] (24) where we have considered that the matter operator is given by ˆ̃σ(0) = σ̂ at the steady state. it is straightforward to show, after performing the partial trace over the degrees of freedom of the system, that the two-time correlation function reads trs[σ̂ † ˆ̃g(τ)] = ∑ αl δαx〈gl| ˆ̃ g(τ)|αl〉, (25) 080008-7 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. where the green’s functions operator ˆ̃ g(τ) is given by ˆ̃ g(τ) = trr [ û(τ) ∑ βγmn ( δβx |gm〉〈γm| × 〈βm| ρ̂(ss)s⊗r |γn〉 ) û†(τ) ] . (26) assuming again the validity of the markovian approximation and taking into account the number of excitations criterion, we have that the density operator system-reservoir can be written as 〈βm|ρ̂(ss)s⊗r|γn〉 = ρ̂ (ss) r ( δβgδγgδm,n + δβxδγxδm,n + δβgδγxδm,n+1 + δβxδγgδm,n−1 ) ρ (ss) sβm,γn. (27) by inserting the eq. (27) into eq. (26), we find that the green’s functions operator explicitly reads ˆ̃ g(τ) = trr [ û(τ)ρ̂ (ss) r × ∑ m ( |gm〉〈xm|rho(ss)sxm,xm + |gm〉〈gm + 1|ρ(ss)sxm,gm+1 ) × û†(τ) ] . (28) analogously as in section ii, we identify the nonzero matrix elements of the green’s functions operator that contribute to the emission spectrum. after performing the laplace transform to eq. (25), the emission spectrum of the quantum dot is given by s(ω) = re ∑ l ( g̃gl,xl(iω)+g̃gl,gl+1(iω) ) . (29) taking into account that the initial conditions are obtained by evaluating the green’s function operator at τ = 0, we have the time evolution operators become the identity and trr[ρ̂ (ss) r ] = 1, thus, we obtain a set of initial conditions given by g̃gl,xl(0) = ρ (ss) sxl,xl, g̃gl,gl+1(0) = ρ (ss) sxl,gl+1, g̃xl,xl+1(0) = 0, g̃gl+2,xl(0) = 0. (30) at the steady-state. iv. emission spectrum of the cavity based on the qrt in what follows, we apply the qrt approach for the model of qd-cavity system described in section iii. in order to compute the emission spectrum of the cavity, the knowledge of the two-time correlation function for the field operator is required, it is 〈â†(τ)â(0)〉 in concordance with the eq. (1). moreover, by following the approach presented in ref. [26], it is straightforward to show that the twotime correlation function is given by 〈â†(τ)â(0)〉 = ∑ n √ n + 1 ( 〈â†gn(τ)â(0)〉 + 〈â†xn(τ)â(0)〉 ) , (31) where the following definitions have been used â † gn = |gn + 1〉〈gn| , â † xn = |xn + 1〉〈xn| , σ̂†n = |xn〉〈gn| , ζ̂n = |gn + 1〉〈xn− 1| . (32) it is worth mentioning that the last two operators should be added in order to close the dynamics of the system accordingly to the qrt as pointed out in section ii. more precisely, we are interested in solving the dynamical equations associated to the expectation values 〈â†gn(τ)â(0)〉 and 〈â † xn(τ)â(0)〉 as a function of τ. 
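before writing those equations down, it may help to restate the correlator expansion (31) and the operator definitions (32), which are hard to parse in the extracted text (this is a transcription only):

\[
\langle \hat{a}^{\dagger}(\tau)\hat{a}(0)\rangle=\sum_{n}\sqrt{n+1}\,
\Big(\langle \hat{a}^{\dagger}_{gn}(\tau)\hat{a}(0)\rangle+\langle \hat{a}^{\dagger}_{xn}(\tau)\hat{a}(0)\rangle\Big),
\]
\[
\hat{a}^{\dagger}_{gn}=|g,n{+}1\rangle\langle g,n|,\qquad
\hat{a}^{\dagger}_{xn}=|x,n{+}1\rangle\langle x,n|,\qquad
\hat{\sigma}^{\dagger}_{n}=|x,n\rangle\langle g,n|,\qquad
\hat{\zeta}_{n}=|g,n{+}1\rangle\langle x,n{-}1|.
\]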
therefore, we need to solve a set of coupled differential equations given by d dτ 〈â†gn(τ)â(0)〉 = ∑ j lij〈â † gn(τ)â(0)〉, d dτ 〈â†xn(τ)â(0)〉 = ∑ j lij〈â † xn(τ)â(0)〉, d dτ 〈σ̂†n(τ)â(0)〉 = ∑ j lij〈σ̂†n(τ)â(0)〉, d dτ 〈ζ̂n(τ)â(0)〉 = ∑ j lij〈ζ̂n(τ)â(0)〉. (33) in order to find explicitly the set of dynamical equations and its corresponding initial conditions, 080008-8 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. we obtain first the set of differential equations for the single-time expectation values for the operators given by eq. (32), it is explicitly d dτ 〈â†gn(τ)〉 = ( −p − i∆ −nκ− κ 2 + iωx ) × 〈â†gn(τ)〉 + κ √ (n + 1)(n + 2) × 〈â†gn+1(τ)〉 + γ〈â † xn(τ)〉 − ig √ n〈ζ̂n(τ)〉 + ig √ n + 1 × 〈σ̂†n(τ)〉, d dτ 〈σ̂†n(τ)〉 = ig √ n + 1〈â†gn(τ)〉 − ig √ n〈â†xn−1(τ)〉 + 1 2 (−p −γ − 2nκ + 2iωx) × 〈σ̂†n(τ)〉 + (n + 1)κ〈σ̂†n+1(τ)〉, d dτ 〈â†xn−1(τ)〉 = p〈â † gn−1(τ)〉 + κ √ n(n + 1) × 〈â†xn(τ)〉 + (−γ − i∆ −nκ + κ 2 + iωx) × 〈â†xn−1(τ)〉 + ig √ n + 1〈ζ̂n(τ)〉 − ig √ n〈σ̂†n(τ)〉, d dτ 〈ζ̂n(τ)〉 = −ig √ n〈â†gn(τ)〉 + ig √ n + 1 × 〈â†xn−1(τ)〉 + (− p 2 − γ 2 − 2i∆ −nκ + iωx) × 〈ζ̂n(τ)〉 + √ n(n + 2)κ〈ζ̂n+1(τ)〉. (34) the qrt approach implies that the following two-time correlation functions 〈â†gn(τ)â(0)〉, 〈â†xn(τ)â(0)〉, 〈σ̂ † n(τ)â(0)〉 and 〈ζ̂n(τ)â(0)〉 satisfy the same dynamical equations given by eq. (34), subject to the initial conditions 〈â†gn(0)â(0)〉 = √ n + 1ρgn+1,gn+1(0), 〈â†xn(0)â(0)〉 = √ n + 1ρxn+1,xn+1(0), 〈σ̂†n(0)â(0)〉 = √ n + 1ρgn+1,xn(0), 〈ζ̂n(0)â(0)〉 = √ nρxn,gn+1(0). (35) 995 997 999 1001 1003 1005 ω(mev ) 0 0.4 0.8 1.2 1.6 s (ω ) a figure 3: emission spectrum of the cavity based on the gft as a solid blue line and the corresponding numerical calculation based on the qrt as a dashed red line. the parameters values are g = 1 mev, γ = 0.005 mev, κ = 0.2 mev, p = 0.3 mev, ∆ = 2 mev, ωa = 1000 mev. more precisely, it is done explicitly by performing the following replacements: 〈â†gn(τ)〉 → 〈â†gn(τ)â(0)〉, 〈â † xn(τ)〉 → 〈â † xn(τ)â(0)〉, 〈σ̂†n(τ)〉 → 〈σ̂†n(τ)â(0)〉 and 〈ζ̂n(τ)〉 → 〈ζ̂n(τ)â(0)〉. the parameters of the system ωx, ∆,g,κ,γ determine the dynamics of the two-time correlation function, as well as setting the initial conditions that will be propagated according to the dynamical equations given by eq. (34). since we are interested in the light that the quantum system emits, we have considered the steady state of the system as the initial state into equation eq. (35). iv. results and discussion in this section, we compare the numerical calculations based on the gft and the qrt approach for the emission spectrum of the cavity as well as the quantum dot. in particular, the qd-cavity system can display two different dynamical regimes by changing the parameters of the system and two regimes can be achieved when the loss and 080008-9 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. pump rates are modified. in fact, the relation g > |κ−γ|/4 holds for the strong coupling regime and the relation g < |κ−γ|/4 remains valid for the weak coupling regime. figure 3 shows the numerical results for the emission spectrum associated to the cavity in the strong coupling regime, where the emission spectrum of the cavity based on the gft is shown as a solid blue line and the emission spectrum based on the qrt approach as a dashed red line. the parameters of the system are g = 1 mev, γ = 0.005 mev, κ = 0.2 mev, p = 0.3 mev, ∆ = 2 mev and ωa = 1000 mev. 
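as a concrete illustration of the kind of calculation behind fig. 3, the gft recipe of eqs. (10) and (36)-(37) can be sketched in a few lines of numpy: build the liouvillian of eq. (13), find its steady state, solve (iω − L) g̃ = g̃(0) with g̃(0) = â ρ̂ss, and read off re tr[â† g̃(iω)]. this is a minimal sketch under several assumptions made here (photon truncation nmax, a column-stacking convention for the superoperator, and the 1/(π n_c) normalization of eq. (11) left out), not the authors' code; the parameter values are those quoted above for fig. 3.

import numpy as np

# parameters of fig. 3 (meV)
g, gamma, kappa, P = 1.0, 0.005, 0.2, 0.3
wa, delta = 1000.0, 2.0
wx = wa + delta
nmax = 10                                   # photon truncation (an assumption)

a_ph  = np.diag(np.sqrt(np.arange(1, nmax)), k=1)
a     = np.kron(np.eye(2), a_ph)                               # cavity mode
sigma = np.kron(np.array([[0., 1.], [0., 0.]]), np.eye(nmax))  # |g><x|
d     = 2 * nmax
Id    = np.eye(d)

H = (wx * sigma.conj().T @ sigma + (wx - delta) * a.conj().T @ a
     + g * (sigma @ a.conj().T + a @ sigma.conj().T))

def dissipator(X, rate):
    # rate/2 (2 X rho X+ - X+X rho - rho X+X) as a matrix acting on vec(rho) (column-stacked)
    XdX = X.conj().T @ X
    return rate * (np.kron(X.conj(), X)
                   - 0.5 * np.kron(Id, XdX) - 0.5 * np.kron(XdX.T, Id))

# Liouvillian of eq. (13): cavity loss (kappa), spontaneous emission (gamma), incoherent pump (P)
L = -1j * (np.kron(Id, H) - np.kron(H.T, Id))
L += dissipator(a, kappa) + dissipator(sigma, gamma) + dissipator(sigma.conj().T, P)

# steady state: the null vector of L, reshaped, hermitized and trace-normalized
vals, vecs = np.linalg.eig(L)
rho_ss = vecs[:, np.argmin(np.abs(vals))].reshape(d, d, order='F')
rho_ss = 0.5 * (rho_ss + rho_ss.conj().T)
rho_ss /= np.trace(rho_ss)

# cavity spectrum: G(0) = a rho_ss, G(iw) = (iw - L)^(-1) G(0), S(w) ~ Re Tr[a+ G(iw)]
g0 = (a @ rho_ss).flatten(order='F')
wlist = np.linspace(995.0, 1005.0, 501)
spec = np.array([np.real(np.trace(a.conj().T
                 @ np.linalg.solve(1j * w * np.eye(d * d) - L, g0).reshape(d, d, order='F')))
                 for w in wlist])

replacing â by σ̂ in the boundary condition g̃(0) and in the final trace reproduces, along the lines of eqs. (24)-(29), the quantum-dot spectrum instead of the cavity one.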
particularly, for this set of parameters values, we can identify two different peaks which are associated to the energy of the cavity and the quantum dot, they are ωa ≈ 998.3 mev and ωx ≈ 1000.3 mev. we have considered the relative error as a quantitative measure of the discrepancy between the gft and the qrt approaches. more precisely, by monitoring the numerical computations of the emission spectrum, we have estimated that the maximum relative error is on the order of 10−3 in all numerical calculations that we have performed. similarly, the emission spectrum of the cavity based on the gft (solid blue line) and the qrt approach (dashed red line) for the strong coupling regime are shown in fig. 4. here, the parameters values of the system are given by g = 1 mev, γ = 0.005 mev, κ = 2 mev, p = 0.005 mev, ∆ = 0.0 mev and ωa = 1000 mev. note that we have considered the resonant case, more precisely, the same energy values for the cavity and the quantum dot. here, the emission spectrums do not match but repel each other, resulting in a structure of two separate peaks for a distance of approximately two times the coupling constant, i.e., 2g ≈ 2 mev. it is worth mentioning that this quantum effect is well-known as rabi splitting in qd-cavity systems. the emission spectrum of the quantum dot in the weak coupling regime is shown in fig. 5. the numerical result for the emission spectrum of the quantum dot based on the gft is shown as a solid blue line and the corresponding numerical result for qrt approach is shown as a dashed red line. we set the weak coupling regime by considering high values of the decay and pump rates κ = 5 mev and p = 1 mev, respectively. the rest of the parameters values are g = 1 mev, γ = 0.1 mev, ∆ = 5 mev and ωa = 1000 mev. we conclude that the method based on the gft is in perfect agreement with the qrt approach and reproduces 995 997 999 1001 1003 1005 ω(mev ) 0 0.15 0.30 0.45 s (ω ) a figure 4: emission spectrum of the cavity based on the gft as a solid blue line and the corresponding numerical calculation based on the qrt approach as a dashed red line. the parameters values are g = 1 mev, γ = 0.005 mev, κ = 2 mev, p = 0.005 mev, ∆ = 0 mev, ωa = 1000 mev. very well the emission spectrum of the qd-cavity system. for comparison purposes with our gft approach, we have also implemented the numerical method based on the qrt for the qd-cavity system (see details in section iv). in the conducted simulations, we have considered the same truncation level in the bare-state basis, e.g., nexc = 10. moreover, we have solved numerically the dynamical equations of the system given by the eq. (34) until time tmax = 2 17 ps for obtaining an acceptable resolution in frequency domain, it is ∆ω ≈ 0.048 mev. in order to test the performance of the gft approach in terms of efficiency, we have compared the computational time involved on the numerical calculation of the emission spectrum of the cavity at four different excitation numbers nexc. table 1 shows in first column the excitation number. second and third columns show the elapsed time (cpu time) in seconds during the simulations for the gft and the qrt approach, respectively. it is worth mentioning that we have considered, for this comparison, exactly the same 080008-10 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. 
resolution in the frequency domain, and the numerical calculations were carried out with the same parameter values as in fig. 4 for both methods. it is straightforward to observe that the qrt approach becomes time-consuming compared with the gft approach as the excitation number is increased. from the computational point of view, this is due to the fact that the qrt approach requires solving a large number of coupled differential equations, in contrast to the gft approach, which requires only a relatively small system of algebraic equations.

table 1: comparison of computational time (cpu time) between the green's functions technique (gft) and the quantum regression theorem (qrt) approach for the numerical calculation of the emission spectrum of the cavity. the computational times were obtained on a commercial intel(r) core(tm) i7-4770 processor (3.4 ghz × 8) with 12 gb ram.

excitation number (nexc) | cpu time (s) for the gft | cpu time (s) for the qrt
5                        | 0.4                      | 92.5
10                       | 2.0                      | 273.4
20                       | 14.0                     | 390.2
40                       | 100.2                    | 673.8

figure 5: emission spectrum of the quantum dot based on the gft (solid blue line) and the corresponding numerical calculation based on the qrt approach (dashed red line). the parameter values are κ = 5 mev, p = 1 mev, g = 1 mev, γ = 0.1 mev, ∆ = 5 mev and ωa = 1000 mev.

v. conclusions

we have presented the gft as an alternative methodology to the qrt approach for calculating the two-time correlation functions in open quantum systems. we have applied the gft and the qrt approach for calculating the emission spectrum in a qd-cavity system. in particular, the accuracy and efficiency of the gft are demonstrated by comparing the emission spectra of the cavity and of the quantum dot obtained with both methods, as well as the computational times involved in the numerical simulations. in fact, we have shown that the gft offers a computational advantage, namely, the speeding up of the numerical calculations. we conclude that the gft allows one to overcome the inherent theoretical difficulty of the qrt approach, i.e., that of finding a closure condition on the set of operators involved in the dynamical equations. we mention that our methodology based on the gft can be extended for calculating the emission spectrum in significant situations where the quantum dots are in the biexcitonic regime or when the quantum dots are coupled to photonic cavities.

acknowledgements

eag acknowledges the financial support from vicerrectoría de investigaciones at universidad del quindío through research grant no. 752. hvp acknowledges the financial support from colciencias, within the project with code 11017249692, and hermes code 31361.

appendix

the emission spectrum in a qd-cavity system can easily be computed by taking into account that the dynamics of the operators ĝ(τ) and ρ̂s(τ) are governed by the same lindblad master equation, i.e., dĝ(τ)/dτ = lĝ(τ), with l the liouvillian superoperator discussed in section ii. moreover, the superoperator l effectively has a larger tensor rank than the reduced density operator of the system. thus, we can write the dynamical equations for the green's functions operator in component form,

\[
\frac{dG_{\tilde\alpha}(\tau)}{d\tau}=\sum_{\tilde\beta} L_{\tilde\alpha\tilde\beta}\,G_{\tilde\beta}(\tau), \qquad (36)
\]

together with the initial condition $G_{\tilde\beta}(0)$.
here, the symbol α̃ corresponds to a composite index for labeling the states of the reduced density operator of the system, e.g., for indexing both matter and photon states in the qd-cavity system, see section iii. for details. it is worth mentioning that gβ̃ and lα̃β̃ act as a column vector and a matrix in this notation. in order to obtain the solution to the eq. (36) in frequency domain, we perform a laplace transform and it explicitly takes the form −g̃α̃(0) = ∑ β̃(lα̃β̃ − iωδα̃β̃)g̃β̃(iω). it is straightforward to obtain the solution by performing the matrix inversion to mα̃β̃ = (iωδα̃β̃ −lα̃β̃) and the emission spectrum is computed easily in terms of the initial conditions given by g̃β̃(iω) = ∑ α̃ m−1 β̃α̃ g̃α̃(0). (37) the initial conditions are obtained by evaluating the green’s function operator at τ = 0. [1] h walther et al., cavity quantum electrodynamics, rep. prog. phys. 69, 395603 (2006). [2] a kavokin, j j baumberg, g malpuech, f p laussy, microcavities, oxford university press (2007). [3] h altug, d englund, j vuckovic, ultrafast photonic crystal nanocavity laser, nat. phys. 2, 484 (2006). [4] y mu, c m savage, one-atom lasers, phys. rev. a 46, 5944 (1992). [5] r m stevenson, r j young, p atkinson, k cooper, d a ritchie, a j shields, a semiconductor source of triggered entangled photon pairs, nature 439, 179 (2006). [6] t m stace, g j milburn, c h w barnes, entangled two-photon source using biexciton emission of an asymmetric quantum dot in a cavity, phys. rev. b 67, 085317 (2003). [7] n gisin, g ribordy, w tittel, h zbinden, quantum cryptography, rev. mod. phys. 74, 145 (2002). [8] c monroe, quantum information processing with atoms and photons, nature 416, 238 (2002). [9] y todorov, i sagnes, i abram, c minot, purcell enhancement of spontaneous emission from quantum cascades inside mirror-grating metal cavities at thz frequencies, phys. rev. lett. 99, 223603 (2007). [10] j wiersig et al., direct observation of correlations between individual photon emission events of a microcavity laser, nature 460, 245 (2009). [11] g khitrova et al., vacuum rabi splitting in semiconductors, nat. phys. 2, 81 (2006). [12] j p reithmaier et al., strong coupling in a single quantum dot-semiconductor microcavity system, nature 432, 197 (2004). [13] l mandel, e wolf, optical coherence and quantum optics, cambridge university press, (1997). [14] o jedrkiewicz, r loudon, atomic dynamics in microcavities: absorption spectra by green function method, j. opt. b. quantum s. o. 2, r47 (2000). [15] n v hieu, n b ha, time-resolved luminescence of the coupled quantum dotmicrocavity system: general theory, adv. nat. sci.: nanosci. nanotechnol. 1, 045001 (2010). [16] d f walls, g j milburn, quantum optics, springer-verlag, berlin (1994). [17] m lax, quantum noise. iv. quantum theory of noise sources, phys. rev. 145, 110 (1966). [18] s swain, master equation derivation of quantum regression theorem, j. phys. a. math. gen. 14, 2577 (1981). 080008-12 papers in physics, vol. 8, art. 080008 (2016) / e. a. gómez et al. [19] e del valle, f p laussy, c tejedor, luminescence spectra of quantum dots in microcavities. ii. fermions, phys. rev. b 79, 235326 (2009). [20] n quesada, h vinck-posada, b a rodŕıguez, density operator of a system pumped with polaritons: a jaynes-cummings-like approach, j. phys. condens. matter 23, 025301 (2011). [21] c a vera, n quesada, h vinck-posada, b a rodŕıguez, characterization of dynamical regimes and entanglement sudden death in a microcavity quantum dot system, j. phys. condens. 
matter 21, 395603 (2009). [22] n ishida, t byrnes, f nori, y yamamoto, photoluminescence of a microcavity quantum dot system in the quantum strong-coupling regimes, sci. rep. 3, 1180 (2013). [23] t quang, g s agarwal, j bergou, m o scully, h walther, k vogel, w p schleich, calculation of the micromaser spectrum. i. greensfunction approach and approximate analytical techniques, phys. rev. a 48, 803 (1993). [24] e t jaynes, f w cummings, comparison of quantum and semiclassical radiation theories with application to the beam maser, proc. ieee 51, 89 (1963). [25] m o scully, m s zubairy, quantum optics, cambridge, cambridge university press (1996). [26] j i perea, d porras, c tejedor, dynamics of the excitations of a quantum dot in a microcavity, phys. rev. b 70, 115304 (2004). [27] h p breuer, f petruccione, the theory of open quantum systems, oxford, university press (2002). [28] a rivas, s f huelga, open quantum systems, springer (2012). 080008-13 papers in physics, vol. 12, art. 120004 (2020) received: 13 august 2019, accepted: 20 march 2020 edited by: m. c. barbosa licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.120004 www.papersinphysics.org issn 1852-4249 flavodoxin in a binary surfactant system consisting of the nonionic 1-decanoyl-rac-glycerol and the zwitterionic lauryldimethylamine-n-oxide: molecular dynamics simulation approach behnaz bazaziyan1, mohammad reza bozorgmehr1*, mohammad momen-heravi1, s. ali beyramabadi1 due to the short time constant of the spin-spin relaxation process, there is a limitation in the preparation of nmr sample solution for large proteins. to overcome this problem, reverse micelle systems are used. here, molecular dynamics simulation was used to study the structure of flavodoxin in a quaternary mixture of 1-decanoyl-rac-glycerol, lauryldimethylamine-n-oxide, pentane and hexanol. hexanol was used as co-solvent. simulations were performed at three different co-solvent concentrations. the proportion of components in the mixture was selected according to experimental conditions. for comparison, simulation of flavodoxin in water was also performed. the simulation results show that the cα-rmsd for the protein in water is less than for the surfactant mixture. also, the radius of gyration of flavodoxin increased in the presence of surfactants. the distance between the two residues trp-57 and phe-94, as a measure of protein activity, was obtained from the simulations. the results showed that in the surfactant mixtures this distance increases. analysis of the secondary structure of the protein shows that the n-terminal part of the flavodoxin is more affected by surfactants. the flavodoxin diffusion coefficient in the surfactant mixture decreased in relation to its diffusion coefficient in water. i. introduction nuclear magnetic resonance (nmr) spectroscopy is a powerful technique for studying the structure of proteins under different conditions [1]. this method is used for high-purity protein samples in an aqueous phase. in the case of spin-spin relaxation, the transverse component of the magnetization vector decays exponentially towards its equilibrium value in nmr. the time constant describing this process is known as t2. this name is based on t1, which is the time constant of the spin-lattice * bozorgmehr@mshdiau.ac.ir 1 department of chemistry, mashhad branch, islamic azad university, mashhad, iran. relaxation process. the short t2 time causes pulse loss during pulse sequences. 
on the other hand, with increasing protein size, this time is reduced. it is therefore difficult to prepare a solution for large proteins, so using this method for them is limited. by encapsulating the protein in the aqueous solution formed in the cavity of the reverse micelles, and then placing the complex in a non-polar solvent, this problem is largely resolved [2]. in this case, the viscosity of the non-polar solvent must be much less than water [3]. short chain alkanes are a good candidate for this purpose [4]. the nature of surfactants, the ability to dissolve surfactant in a non-polar solvent, and the ratio of water to surfactant are among the important parameters for the success of this method [5]. 120004-1 papers in physics, vol. 12, art. 120004 (2020) / b. bazaziyan et al. flavodoxin is an electron transfer protein that plays a role in photosynthetic reactions [6]. in flavodoxins, a flavin mono nucleotide molecule is attached to the protein in a non-covalent manner, which causes the redox property of proteins [7]. from a structural standpoint, flavdoxin is an α/β protein in which the β sheets are surrounded by α helices [8]. since flavodoxin is a simple onedomain protein, it is a suitable model for protein studies such as stability and folding. cheung et al. have investigated the dynamics of flavodoxin protein in crowded environments by molecular dynamics simulation [9]. they showed that protein stability increases in in silico crowding conditions, and folding routes experiencing topological frustrations can be either relieved or enhanced upon manipulation of the crowding conditions. studies have also shown that by removing the flavin mononucleotide molecule from flavodoxin and creating the apoprotein form, the binding position does not change greatly; only the aromatic rings of tyrosine-94 and tryptophan-57 are closer to each other [6]. genzor et al. studied the stability of apo-flavodoxin conformation in urea solution in different acidic and ionic strengths. they found that at a ph of between 6 and 11 apo-flavodoxin conformation is stable, whereas at a lower ph, between 2 and 3.5, proteins lose their stability [10]. the structure of flavodoxin in the presence of a cross-linked, spherical, sucrose-based polymer called ficoll 70 has been studied experimentally and theoretically [11, 12]. this circular dichroism and in silico study showed that the helical content of folded apoflavodoxin was increased by additions of ficoll 70. other experimental evidence indicated that correct contact formation around the third β-strand of flavodoxin in the central sheet is crucial to continued folding to the native state in crowded media. research has also shown that flavodoxin in the presence of dextran retains about 70% of its secondary structure [13]. in this manuscript, a simulation of molecular dynamics was used to study the flavdoxin protein in a mixture of the surfactants 1-decanoyl-rac-glycerol and lauryldimethylamine-n-oxide. pentane solvent was used for solute solvation. for comparison, molecular dynamics simulation of protein in water was also performed. ii. computational details for the molecular dynamics simulation, four simulation boxes were designed. in the first box, the box dimensions were considered according to the size of the flavodoxin, 5.95 × 5.95 × 5.95 nm3, and the protein was placed in the center of the box. then the box was filled with water solvent (named as system a). 
in the second box, the protein was placed in the center of the box, with dimensions 6.75 × 6.75 × 6.75 nm3; then 130 1-decanoyl-rac-glycerol molecules and 70 lauryldimethylamine-n-oxide molecules were randomly added to the box. also, 50 water molecules were added to this box, and finally the box was filled with 1000 pentane molecules (named as system b). the third box was specified like the second box, differing only in that 5 hexanol molecules were also randomly placed in it (named as system c). the fourth box was similar to the second box, but with 10 hexanol molecules randomly placed in it (named as system d). the components used in the various systems are summarized in table 1. since 1-decanoyl-rac-glycerol is non-ionic, hexanol is used as an auxiliary solvent. these molecular ratios were selected based on the optimal ratios reported in dodevski et al.'s experimental work [14].

table 1: outline of the designed systems.

system | box volume (nm3)    | 1-decanoyl-rac-glycerol | lauryldimethylamine-n-oxide | pentane | hexanol
a      | 5.95 × 5.95 × 5.95  | –                       | –                           | –       | –
b      | 6.75 × 6.75 × 6.75  | 130                     | 70                          | 1000    | –
c      | 6.75 × 6.75 × 6.75  | 130                     | 70                          | 1000    | 5
d      | 6.75 × 6.75 × 6.75  | 130                     | 70                          | 1000    | 10

the structure of flavodoxin was obtained from the protein data bank with code 1obv [15]. to perform the molecular dynamics simulations, the gromacs software version 5.2.1 was used [16], with the charmm27 all-atom force field. since the force field parameters for 1-decanoyl-rac-glycerol, lauryldimethylamine-n-oxide, hexanol and pentane are not present in the default version of the gromacs software, the structures of these compounds were optimized using the b3lyp density functional method with the 6-31g* basis set. to check the optimization, frequency calculations were performed and no virtual frequencies were observed. the charge of each atom was calculated by the chelpg method. all ab initio calculations were done with the gamess software package [17]. the optimized structures of the compounds are shown in fig. 1.

figure 1: optimized geometries of the studied molecules, with labeling.

the swissparam web server [18] was used to determine the force field parameters of the studied compounds. the tip3p water model [19] was used for the water in the simulation boxes. in order to neutralize the systems, an appropriate number of na+ ions was added to each box. to fix a constant temperature and pressure during the simulations, the system components were coupled with the v-rescale and nose-hoover [20] schemes, respectively, in each of the equilibration steps and molecular dynamics simulations. the temperature and pressure used in the calculations were 298 k and 1 bar, respectively. the lincs algorithm [21] was employed to constrain the chemical bonds of the non-water molecules, and the settle algorithm [22] was used for the water molecules. the pme algorithm was used to calculate the electrostatic interactions [23]. energy minimization was done using the steepest descent method [24], to eliminate the initial excess energy and inappropriate contacts between the atoms. each simulation box was equilibrated in two stages, in the nvt and npt ensembles; the equilibration time at this stage was 10 ns with a time step of 2 fs. finally, molecular dynamics was performed by integrating newton's second equation of motion for 140 ns with the same 2 fs time step.
each simulation was repeated four times under different initial conditions, to avoid any dependency on the initial conditions and to increase the accuracy of the simulations.

iii. results and discussion

an important point when using surfactants to encapsulate proteins for nmr spectroscopy is to maintain the native protein structure and to reduce its reorientation time. therefore, the structural and dynamic parameters of flavodoxin in the designed systems were calculated for analysis. one common way of displaying the stability of a protein's structure is to calculate the rmsd values during the simulation. the root mean square deviation (rmsd) is defined as

\[
\mathrm{rmsd}(t_1,t_2)=\left[\frac{1}{M}\sum_{i=1}^{N} m_i\,\big|\mathbf{r}_i(t_1)-\mathbf{r}_i(t_2)\big|^{2}\right]^{1/2}, \qquad (1)
\]

where $\mathbf{r}_i(t)$ is the position of atom $i$ at time $t$ and $M=\sum_{i=1}^{N} m_i$. the cα-rmsd variations of flavodoxin for the studied systems are shown in fig. 2.

figure 2: cα-rmsd values (nm) versus time (ps) for the simulated systems defined as a, b, c and d.

according to the figure, the stability of the protein in the surfactant systems is somewhat reduced compared to system a. this decrease in stability is greater for system c than for b and d. however, the difference in the cα-rmsd values between the different systems is not large; on the other hand, given that flavodoxin has 169 residues, this difference in stability among the various systems is not significant. another factor that can determine the activity and stability of the protein is the compactness of the conformation [25], for which the radius of gyration (rg) indicates the compression of the protein structure. the rg of a protein is calculated using the following equation:

\[
r_g=\left(\frac{\sum_i |\mathbf{r}_i|^{2} m_i}{\sum_i m_i}\right)^{1/2}, \qquad (2)
\]

where $m_i$ is the atomic mass of atom $i$ and $\mathbf{r}_i$ its position relative to the center of mass of the molecule [26]. the rg of the studied systems was calculated and the results are shown in fig. 3.

figure 3: rg values (nm) versus time (ps) for the simulated systems defined as a, b, c and d.

the rg values of the protein vary from 1.48 nm to 1.63 nm in the different systems. flavodoxin has the smallest rg in system a and the largest in system c. since the compactness of the structure of a protein is related to its activity [27], according to fig. 3 the presence of flavodoxin in the surfactant solution decreases its activity. the root mean square fluctuation (rmsf) values of the flavodoxin sequence were calculated in the different systems and are shown in fig. 4.

figure 4: average rmsf values for each residue of flavodoxin during the simulation in the a, b, c and d systems.

except for the residues valine-18, isoleucine-21, tryptophan-66 and glycine-68 in the flavodoxin of system c, the flexibility of the other residues is greater in system a than in the other systems. another noteworthy point is that the c-terminal of the protein is less affected by the non-aqueous environment. in fact, the greatest differences in the rmsf values among the various systems are related to amino acids 1 to 79. in this region, the helices and sheets in the secondary structure of flavodoxin have smaller lengths than the helices and sheets in the c-terminal of the protein.
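as a small illustration of eqs. (1) and (2), the two quantities can be evaluated as follows (a sketch assuming the coordinates and masses are already loaded as numpy arrays; not the analysis code used by the authors, who rely on the gromacs tools):

import numpy as np

def rmsd(x1, x2, m):
    # eq. (1): mass-weighted rmsd between two frames x1, x2 of shape (n_atoms, 3);
    # in practice each frame is first least-squares fitted onto the reference, a step omitted here
    diff2 = np.sum((x1 - x2) ** 2, axis=1)
    return np.sqrt(np.sum(m * diff2) / np.sum(m))

def radius_of_gyration(x, m):
    # eq. (2): mass-weighted radius of gyration of a single frame x of shape (n_atoms, 3)
    com = np.average(x, axis=0, weights=m)
    r2 = np.sum((x - com) ** 2, axis=1)
    return np.sqrt(np.sum(m * r2) / np.sum(m))

# hypothetical usage on a trajectory stored as traj[frame, atom, xyz] with per-atom masses m:
# rg_t   = [radius_of_gyration(frame, m) for frame in traj]
# rmsd_t = [rmsd(frame, traj[0], m) for frame in traj]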
if two faces are considered for proteins, the face of the n-terminal is more exposed to the environment. in flavodoxins, the distance between two aromatic amino acids in the ligand binding site is considered as a measure of protein activity [8]. the distance between the tryptophan57 and phenylalanine-94 residues of flavodoxin in 120004-4 papers in physics, vol. 12, art. 120004 (2020) / b. bazaziyan et al. figure 5: the distance between trp-57 and phe-94 (system a). system a is shown in fig. 5. according to the figure, this distance is 7.61 angstroms. the distance between these two amino acids in systems b, c and d was 9.48, 10.05 and 10.42 angstroms, respectively. according to the values obtained, flavdoxin activity seems to decrease in surfactant solutions. sampling of the suitable structure for analysis in molecular dynamics trajectories has always been one of the interesting issues. averaging the coordinates of the molecular coordinates in the last few nanoseconds of the simulation is one of the methods used in this regard [28]. the sampling of coordinates at a time when the system is in equilibrium is another method used for this [29]. in these methods, variations of a quantity such as cα-rmsd in the area are used. however, there is no physical concept in the average of atomic coordinates [30], so for sampling of the simulationsthe free energy landscape (fel) analysis method is used [31]. there are three main stages of fel analysis: calculation of the cα-rmsd and the flavodoxin radius of gyration (rg), obtaining the possibility of the presence of flavodoxin protein conformation in each corresponding value of cα-rmsd and rg, and calculation of the free energy of configurations based on the probability values of presence. the results of the fel analysis are shown in 3d diagrams in fig. 6 for a, b, c and d systems, respectively. in these figures, the minimum regions of free energy are shown in blue. according to these figures, it can be seen that all systems have a local minimum with the least amount of free energy. regarding the values of cα-rmsd and rg corresponding figure 6: free energy landscape of the a, b, c and d simulated systems. to the lowest free energy, the structure of flavodoxin was sampled from the molecular trajectories in the simulations. then, using the software polyview 2d [32], the secondary structure of the flavodoxin obtained in the previous step was analyzed. the result is shown in fig. 7 for different systems. as the color darkens, it indicates that the level of solvent-accessible surface area for the residue decreases. with respect to the figures, it is observed that in the b system, the helices between aspartic-35 and aspartic-46 are eliminated. the beta sheet between the valin-31 and aspartic-35 sequences has also been shortened. the downsizing of this beta sheet in c and d systems is less than in the b system. in these areas, by changing the structure of the helices and sheets, the level of solvent-accessible surface area of residues has increased somewhat. the region of change for the c system is related to the helices between the tryptophan-57 and the aspartic-75 residues. in general, the secondary structure of the flavodoxin in the c-terminal region has changed little. these results are consistent with the results of the rmsf values. a protein contact map is a convenient way to determine changes in the tertiary structure of proteins in different conditions. the flavodoxin contact map was obtained in various simulations and the results are shown in fig. 8. 
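a contact map of this kind can be sketched in a few lines (a minimal numpy illustration; the 0.8 nm cα-cα cutoff is an assumption introduced here, not a value quoted in the paper):

import numpy as np

def contact_map(ca_coords, cutoff=0.8):
    # boolean contact matrix: True where the CA-CA distance is below `cutoff` (nm)
    d = np.linalg.norm(ca_coords[:, None, :] - ca_coords[None, :, :], axis=-1)
    return d < cutoff

# difference map relative to the reference system a (hypothetical CA coordinate arrays ca_a, ca_b):
# cm_a, cm_b = contact_map(ca_a), contact_map(ca_b)
# common = cm_a & cm_b          # contacts present in both
# lost   = cm_a & ~cm_b         # contacts present in a only
# gained = ~cm_a & cm_b         # contacts present in b only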
in these figures, the flavodoxin structure of system a was considered as a wild structure, and the remaining structures were compared to system a. in these figures, the lower half is the contact map of flavodoxin, while the upper half is the protein difference contact map; they 120004-5 papers in physics, vol. 12, art. 120004 (2020) / b. bazaziyan et al. helix, β strand, coil. dark: completely buried, bright: completely exposed figure 7: the secondary structure of the flavodoxin, along with the solvent-accessible surface area for its residues in systems a (top-left), b (top-right), c (bottom-left) and d (bottom-right). figure 8: (left) the contact map of flavodoxin in b (left-lower triangle) and different contact map of flavodoxin in b and flavodoxin in a (right-upper triangle). (center) the contact map of flavodoxin in c (left-lower triangle) and different contact map of flavodoxin in c and flavodoxin in a (right-upper triangle). (right) the contact map of flavodoxin in d (left-lower triangle) and different contact map of flavodoxin in d and flavodoxin in a (right-upper triangle). the colors used in the figure are described in the main text. are shown in different systems relative to system a. in the lower part of the figure, the black color represents the common contacts and the green color indicates the contacts that are in the wild structure of system a, but not in the structure of the secondary system; the red color represents the contacts that are in the secondary structure but not in the structure of system a. in the upper part, the blue color indicates the regions with the smallest differences and the red color indicates the areas with the largest differences. the greater the difference in red color between figs. 8 (center) and 8 (right) in comparison with fig. 8 (left), the more the tertiary structure of flavodoxin in systems c and d has been affected. also, areas of tertiary structure change in these systems are more focused 120004-6 papers in physics, vol. 12, art. 120004 (2020) / b. bazaziyan et al. than in system b. finally, to determine the dynamics of flavodoxin, its diffusion coefficient was calculated. prediction of the diffusion coefficients of molecules in solvent is of theoretical importance. as the diffusion coefficient value increases, the mobility of the molecule increases [33]. the einstein relation was used to calculate the diffusion coefficient of flavodoxin in all simulated systems: d = 1 6 lim t→∞ d dt 〈|ri(t) −ri(0)|2〉. (3) where ri is the atom coordinate vector and the term inside the angled brackets is the mean square displacement (msd). in this approach, the selfdiffusion coefficient is proportional to the slope of the msd as a function of time in the diffusional regime [34]. the diffusion coefficient values of flavodoxin are listed in table 2. table 2: diffusion coefficient values of flavodoxin system d × 10−5cm2/s a 0.086 b 0.004 c 0.004 d 0.016 with regard to the diffusion coefficient, the flavodoxin dynamics are slower in systems b, c and d than in system a. thus, the time of reorientation of protein conformation in systems b, c and d is less than that of system a. in other words, using this surfactant mixture can lead to a better nmr spectrum. this result is consistent with the results in the experimental work of dodevski et al [14]. iv. 
conclusions by encapsulating the protein in the aqueous solution formed in the cavity of the reverse micelles, and then placing the complex in a non-polar solvent, the problem of slow tumbling will be greatly resolved in nmr spectroscopy. in this work, a mixture of the surfactants 1-decanoyl-rac-glycerol and lauryldimethylamine-n-oxide were used, together with the auxiliary solvent hexanol. changes in the structure and dynamics of flavodoxin protein were studied in this mixture. pentane solvent was used as a low viscosity solvent. compared with the structure of flavodoxin in water, the simulation results indicate that the protein’s secondary and tertiary structures in the mixture of surfactants are altered. however, its dynamics decrease. it was observed that in the absence of hexanol the helical and sheet content of the protein decreased in different regions. also, in the presence of hexanol the ordered secondary structure of the protein decreased relative to two component systems consisting of protein and water. however, the level of reduction of the regular secondary structure was lower than in the absence of hexanol. decreasing the areas of the regular secondary structure made the sequences more accessible to solvent. in general, the secondary structure of the flavodoxin in the c-terminal region changed little. the different data obtained from the simulation are consistent with each other. the results are also in agreement with previous work in this area [35]. [1] a d bax, s grzesiek, methodological advances in protein nmr, acc. chem. res. 26, 131 (1993). [2] a j wand, m r ehrhardt, p f flynn, highresolution nmr of encapsulated proteins dissolved in low-viscosity fluids, proc. natl. acad. sci. u.s.a. 95, 15299 (1998). [3] m r ehrhardt, p f flynn, a j wand, preparation of encapsulated proteins dissolved in low viscosity fluids, j. biomol. nmr 14, 75 (1999). [4] c r babu, p f flynn, a j wand, validation of protein structure from preparations of encapsulated proteins dissolved in low viscosity fluids, j. am. chem. soc. 123, 2691 (2001). [5] w d van horn, m e ogilvie, p f flynn, use of reverse micelles in membrane protein structural biology, j. biomol. nmr 40, 203 (2008). [6] j sancho, flavodoxins: sequence, folding, binding, function and beyond, cell. mol. life sci. 63, 855 (2006). [7] k fukuyama, h matsubara, l j rogers, crystal structure of oxidized flavodoxin from a red alga chondrus crispus refined at 1.8 å resolution: description of the flavin mononucleotide binding site, j. mol. biol. 225, 775 (1992). 120004-7 http://doi.org/10.1021/ar00028a001 http://doi.org/10.1021/ar00028a001 http://doi.org/10.1073/pnas.95.26.15299 http://doi.org/10.1073/pnas.95.26.15299 http://doi.org/10.1023/a:1008354507250 http://doi.org/10.1021/ja005766d http://doi.org/10.1007/s10858-008-9227-5 http://doi.org/10.1007/s00018-005-5514-4 http://doi.org/10.1007/s00018-005-5514-4 http://doi.org/10.1016/0022-2836(92)90400-e papers in physics, vol. 12, art. 120004 (2020) / b. bazaziyan et al. [8] s maldonado, a lostao, m p irún, j férnandez-recio, c g genzor, e b gonzález, j a rubio, a luquita, f daoudi, j sancho, apoflavodoxin: structure, stability, and fmn binding, biochimie 80, 813 (1998). [9] d homouz, l stagg, p wittung-stafshede, m s cheung, macromolecular crowding modulates folding mechanism of α/β protein apoflavodoxin, biophys. j. 96, 671 (2009). [10] c g genzor, c gómez-moreno, j sancho, a beldarráın, j l lópez-lacomba, m cortijo, conformational stability of apoflavodoxin, protein sci. 5, 1376 (1996). 
[11] l stagg et al, molecular crowding enhances native structure and stability of α/β protein flavodoxin, proc. natl. acad. sci. u.s.a. 104, 18976 (2007). [12] d venturoli, b rippe, ficoll and dextran vs. globular proteins as probes for testing glomerular permselectivity: effects of molecular size, shape, charge, and deformability, am. j. physiol. renal physiol. 288, f605 (2005). [13] e steensma, c p van mierlo, structural characterisation of apoflavodoxin shows that the location of the stable nucleus differs among proteins with a flavodoxin-like topology, j. mol. biol. 282, 653 (1998). [14] i dodevski, n v nucci, k g valentine, g k sidhu, e s o’brien, a pardi, a j wand, optimized reverse micelle surfactant system for high-resolution nmr spectroscopy of encapsulated proteins and nucleic acids dissolved in low viscosity fluids, j. am. chem. soc. 136, 3465 (2014). [15] a lostao, f daoudi, m p irún, á ramón, c fernández-cabrera, a romero, j sancho, how fmn binds to anabaena apoflavodoxin: a hydrophobic encounter at an open binding site, j. biol. chem. 278, 24053 (2003). [16] d van der spoel, e lindahl, b hess, g groenhof, a e mark, h j berendsen, gromacs: fast, flexible, and free, j. comput. chem. 26, 1701 (2005). [17] m w schmidt, k k baldridge, j a boatz, s t elbert, m s gordon, j h jensen, s koseki, n matsunaga, k a nguyen, s su, general atomic and molecular electronic structure system, j. comput. chem. 14, 1347 (1993). [18] v zoete, m a cuendet, a grosdidier, o michielin, swissparam: a fast force field generation tool for small organic molecules, j. comput. chem. 32, 2359 (2011). [19] d j price, c l brooks iii, a modified tip3p water potential for simulation with ewald summation, j. chem. phys. 121, 10096 (2004). [20] g bussi, d donadio, m parrinello, canonical sampling through velocity rescaling, j. chem. phys. 126, 014101 (2007). [21] b hess, h bekker, h j berendsen, j g fraaije, lincs: a linear constraint solver for molecular simulations, j. comput. chem. 18, 1463 (1997). [22] s miyamoto, p a kollman, settle: an analytical version of the shake and rattle algorithm for rigid water models, j. comput. chem. 13, 952 (1992). [23] u essmann, l perera, m l berkowitz, t darden, h lee, l g pedersen, a smooth particle mesh ewald method, j. chem. phys. 103, 8577 (1995). [24] d g luenberger, y ye, linear and nonlinear programming, springer international publishing, switzerland (1984). [25] s ghaderi, m r bozorgmehr, a morsali, structure study and predict the function of the diphtheria toxin in different ph levels (acidicbasic-natural) using molecular dynamics simulations, entomol. appl. sci. lett. 3, 49 (2016). [26] b honarparvar, a a skelton, molecular dynamics simulation and conformational analysis of some catalytically active peptides, j. mol. model. 21, 100 (2015). 
[27] h monhemi, m r housaindokht, m r bozorgmehr, m s s googheri, enzyme is stabilized by a protection layer of ionic liquids in 120004-8 http://doi.org/10.1016/s0300-9084(00)88876-8 http://doi.org/10.1016/j.bpj.2008.10.014 http://doi.org/10.1002/pro.5560050716 https://doi.org/10.1073/pnas.0705127104 https://doi.org/10.1073/pnas.0705127104 https://doi.org/10.1152/ajprenal.00171.2004 https://doi.org/10.1152/ajprenal.00171.2004 https://doi.org/10.1006/jmbi.1998.2045 https://doi.org/10.1006/jmbi.1998.2045 http://doi.org/10.1021/ja410716w http://doi.org/10.1021/ja410716w http://doi.org/10.1074/jbc.m301049200 http://doi.org/10.1002/jcc.20291 http://doi.org/10.1002/jcc.20291 http://doi.org/10.1002/jcc.540141112 http://doi.org/10.1002/jcc.21816 http://doi.org/10.1002/jcc.21816 http://doi.org/10.1063/1.1808117 http://dx.doi.org/10.1063/1.2408420 http://dx.doi.org/10.1063/1.2408420 https://doi.org/10.1002/(sici)1096-987x(199709)18:123c1463::aid-jcc43e3.0.co;2-h https://doi.org/10.1002/(sici)1096-987x(199709)18:123c1463::aid-jcc43e3.0.co;2-h http://doi.org/10.1002/jcc.540130805 http://doi.org/10.1002/jcc.540130805 http://doi.org/10.1063/1.470117 http://doi.org/10.1063/1.470117 http://doi.org/10.1007/978-3-319-18842-3 http://doi.org/10.1007/978-3-319-18842-3 http://doi.org/10.1007/978-3-319-18842-3 https://easletters.com/index.php?journal=journal&page=article&op=view&path5b5d=123 https://easletters.com/index.php?journal=journal&page=article&op=view&path5b5d=123 http://doi.org/10.1007/s00894-015-2645-x http://doi.org/10.1007/s00894-015-2645-x papers in physics, vol. 12, art. 120004 (2020) / b. bazaziyan et al. supercritical co2: insights from molecular dynamic simulation, j. supercrit. fluids 69, 1 (2012). [28] m r housaindokht, m r bozorgmehr, h e hosseini, r jalal, a asoodeh, m saberi, z haratipour, h monhemi, structural properties of the truncated and wild types of takaamylase: a molecular dynamics simulation and docking study, j. mol. catal. b: enzym. 95, 36 (2013). [29] n moghtaderi, m r bozorgmehr, a morsali, the study of self-aggregation behavior of the bilirubin molecules in the presence and absence of carbon nanotubes: molecular dynamics simulation approach, j. mol. liq. 208, 342 (2015). [30] d van der spoel, e lindahl, b hess, a van buuren, e apol, p meulenhoff, d tieleman, a sijbers, k feenstra, r van drunen, gromacs user manual version 3.3, (2008). [31] h lei, c wu, h liu, y duan, folding freeenergy landscape of villin headpiece subdomain from molecular dynamics simulations, proc. natl. acad. sci. u.s.a. 104, 4925 (2007). [32] a a porollo, r adamczak, j meller, polyview: a flexible visualization tool for structural and functional annotations of proteins, bioinformatics 20, 2460 (2004). [33] d brune, s kim, predicting protein diffusion coefficients, proc. natl. acad. sci. u.s.a. 90, 3835 (1993). [34] d frenkel, b smit, understanding molecular simulation: from algorithms to applications, computacional sciences series 1, academic press (2002). [35] b bazaziyan et al, reverse micelle surfactant system comprising the 1-decanoyl-rac-glycerol and the lauryldimethylamine-n-oxide: structure and dynamics of confined water, russ. j. phys. chem. a 93, 1122 (2019). 
120004-9 http://doi.org/10.1016/j.supflu.2012.04.020 http://doi.org/10.1016/j.supflu.2012.04.020 http://doi.org/10.1016/j.molcatb.2013.05.011 http://doi.org/10.1016/j.molcatb.2013.05.011 http://doi.org/10.1016/j.molliq.2015.04.052 https://doi.org/10.1073/pnas.0608432104 https://doi.org/10.1073/pnas.0608432104 http://doi.org/10.1093/bioinformatics/bth248 http://doi.org/10.1073/pnas.90.9.3835 http://doi.org/10.1073/pnas.90.9.3835 https://doi.org/10.1134/s0036024419060050 https://doi.org/10.1134/s0036024419060050 introduction computational details results and discussion conclusions papers in physics, vol. 11, art. 110007 (2019) received: 1 june 2019, accepted: 23 august 2019 edited by: a. b. márquez licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.110007 www.papersinphysics.org issn 1852-4249 theory of terahertz smith–purcell radiation from a cylindrical grating z. rezaei,1∗ b. farokhi1 an analysis of an annular electron beam propagating along a cylindrical grating with external magnetic field b0 is presented. the grating comprises a dielectric in its slots. the dispersion relation of the modes is derived. the results demonstrate that the dielectric shifts the frequencies of the system modes to smaller values. the growth rates of the modes which are in phase with the beam are also considered. it is found that the decline in the growth rate is brought about by the dielectric. in addition, increasing the thickness of the dielectric and decreasing the height of the slots cause it to rise. the effect of beam thickness on growth rate is considered too. this is shown to increase and then fall as beam thickness increases. these results show that utilizing cylindrical grating loaded with dielectric has a promising effect on developing new kinds of compact high-efficient thz free-electron lasers based on smith–purcell radiation. i. introduction plasma and beam devices are employed in amplifiers, oscillators, charged particle accelerators, and high power sources of electromagnetic radiation. they are also used to transport electromagnetic energy and charged particles, and for basic plasma physics research [1]. when the electron beam passes near the grating surface (periodic structure), spontaneous radiations may be excited. this periodic structure can be a metallic corrugated surface with spatial periodicity d and corrugation height h. this radiation was first observed by smith and purcell [2] in 1953. the smith–purcell radiation (spr) is a tunable electromagnetic source, which is described by ∗e-mail: z-rezaei@phd.araku.ac.ir 1 department of physics, faculty of science, arak university, arak, p.o. box 38156-8-8349, iran λ = d n ( 1 β − cos θ ) , (1) where λ is the wavelength of the radiated wave from the grating, n is the order of this radiation, β is the relative velocity of the charge, and θ is the direction of the radiated wave with respect to this charge. in the far-infrared or terahertz (thz) region, several theories have been proposed to describe the operation of spr, and also its application, in a free electron laser (fel). schachter and ron proposed a theory based on the interaction of an electron beam with a wave traveling along the grating. they used some approximation to evaluate the reflection matrix of the grating and found a cubic equation for growth rate, which is consistent with cherenkov fel [3]. urata et al. considered this phenomenon experimentally, and observed high power coherent superradiant sp emission in the far infrared (30−100 µm) region [4]. 
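as a quick numerical illustration of eq. (1), the sketch below evaluates the smith–purcell wavelength for a 20 kev beam and a 160 µm grating period, the parameters adopted later in this paper; the emission angles are chosen arbitrarily.

```python
import numpy as np

def sp_wavelength(d, beta, theta, n=1):
    """smith-purcell wavelength, eq. (1): lambda = (d / n) * (1 / beta - cos(theta))."""
    return (d / n) * (1.0 / beta - np.cos(theta))

# 20 kev electrons and a d = 160 um grating period (values used in sec. iii)
T_kev, mec2_kev = 20.0, 511.0
gamma = 1.0 + T_kev / mec2_kev
beta = np.sqrt(1.0 - 1.0 / gamma**2)

d = 160e-6   # grating period in m
for theta_deg in (60, 90, 120):
    lam = sp_wavelength(d, beta, np.radians(theta_deg))
    print(f"theta = {theta_deg:3d} deg -> lambda = {lam * 1e6:6.1f} um, f = {3e8 / lam / 1e12:.2f} thz")
```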
sp radiation in the ultraviolet and near infrared regions was also detected by y. neo et al. [5]. kim and song, using the interaction of the electrons with a traveling wave, solved the initial problem of the sheet-beam and found a 110007-1 papers in physics, vol. 11, art. 110007 (2019) / z. rezaei et al. quadratic equation for the exponential growth rate [6]. later andrews and brau explained urata’s experiment as bunching of the beam electrons due to the interaction of an evanescent wave with this sufficiently high-current beam. they also derived the gain of this radiation, which had cube root dependence on the beam current [7]. also in 2004, freund et al. [8] developed a linear theory of a gratingcoupled smith–purcell traveling wave in a parallel plate wave guide. they found the linearized dispersion relation for the vacuum structures and the wave–particle interaction in an arbitrary magnetic field. then, d. li et al. performed simulation and confirmed the theory of andrews and coworkers about the mechanism of superradiation, which happens at integer multiples of the bunching frequency [9]. in addition, the growth rate of sp–fel was considered by d. n. klochkov et al. and found to be proportional to the square root of the sheet electron beam current [10]. loading dielectric is an important physical mechanism which has been successfully applied to some high power microwave and terahertz systems. a dielectric-loaded grating for 3d smith–purcell rectangular device is proposed by cao et al. [11, 12]. they used 3d particle-in-cell simulation to find the dispersion relations at the operating point, and the growth rates. w. liu et al. considered a rectangular grating filled with dielectric. they obtained the minimum current for starting the sp oscillator and deduced that the dielectric will decrease this current. also, they explained the effect of changing the beam parameters on the growth rate [13]. there is no edge effect issue in cylindrical grating driven by an annular beam. therefore, as the cylindrical gratings are more efficient, with fewer losses than rectangular ones, they are more applicable in radiation sources and considered in many types of research. h. p. bluem et al. worked on a cylindrical grating exposed by an annular beam. they observed both superradiance and sp radiation [14]. s. hasegawa et al. considered a cylindrical corrugation in a waveguide. they reported bwo operation, excited by a cylindrical surface wave in k-band signal region. also, they increased the voltage of the beam and observed sp radiation in the u-band and e-band of frequency, which was the result of interaction between the higher modes of the waveguide [15]. here we present a linear theory of an annular figure 1: the cross section view of the grating, filled with dielectric. in addition, the annular electron beam is drifting along the axial direction with an externally magnetic field b0. electron beam, magnetized, propagating along a cylindrical metallic grating. the slots of this grating are filled with dielectric. the results of this paper highlight the basic problem of developing sp– fel based on cylindrical grating loaded with dielectric. for simplicity, we assume that the system is uniform in the direction parallel to the slots of the grating. the fundamental dynamical equations are presented in section ii. the results and discussion are given in section iii. the conclusions are considered at the end. ii. theory consider a cylindrical grating which is made of an ideal metal. 
the inner and outer radii of the grating are r1 and r2, respectively as illustrated in fig. 1. as shown in this figure, d denotes the period of the grating, l is the length of the slot openings (which will be filled with dielectric) and h is the depth of the slots (r2 −r1). an annular electron beam with inner radius a1 and outer radius a2 = a1 + ∆ in a uniform static axial magnetic field b0 is drifting with velocity v0 along the axis of the grating and very close to it. we assume that there is no transverse disturbed movement in the electron beam. in addition, for simplicity, we assume that the system is uniform in the ϕ direction. i. dispersion relation the dispersion relation of the modes of this system is the result of considering maxwell’s equations with the continuity equation and the relativistic 110007-2 papers in physics, vol. 11, art. 110007 (2019) / z. rezaei et al. momentum equation for electron beams: ∇×e = − ∂b ∂t , (2) ∇×b = µj + 1 c2 ∂e ∂t , ∂n ∂t + ∇ · (nv) = 0, γ3m0n0 [ ∂ ∂t v + v0 ·∇v ] = −en0(ez + v×b). as a consequence, we expand all quantities in terms of an unperturbed part plus a small perturbation as follows: n = n0 +δn, v = v0 +δvz, j = j0 +δj, e = δe and b = b0ẑ + δb, where n and v are the electron density and velocity, respectively. unperturbed beam density, n0, is uniform and timeindependent, δe and δb are the electric and magnetic fields, j = env is the density of current, γ = (1 − v2/c2)1/2 is the relativistic factor, and c is the velocity of light in free space. the perturbed density of current is δj = −e(nδvz + δnv0), (3) which, with the help of the continuity and momentum equations is defined as δj = iωε0 γ3 ω2p (ω −knv0)2 δez. (4) here, ω2p = ne2 m0ε0 is the relativistic beam plasma frequency. the fields in the electron beam part will be derived from this wave equation ∇2δb + ω2 c2 επδb = 0, (5) where επ = 1− ω 2 p γ3(ω−knv0)2 is the relative dielectric constant in the electron beam [12]. also, in region 1 and 3, where there is no electron beam, the electromagnetic fields are the solution of the wave equation of this form (∇2⊥ + β 2 n) { δez δbz = 0. (6) we suppose that the tm mode propagates in this device. and, by applying floquet’s theorem, the radiation fields take the general form δf(r, t) = ∞∑ n=0 δfn(r)e i(knz−ωt), (7) where fn, kn = k0 + 2nπ d , and ω represent the fourier coefficient, wave number in the axial direction and frequency of the nth mode, respectively. region 1 this region is the vacuum above the electron beam. so the fields in a2 < r can be expressed as below ez(r,z) = ∞∑ n=−∞ bnk0(βnr)e i(knz−ωt), (8) bϕ(r,z) = ∞∑ n=−∞ iω c2βn nnk1(βnr)e i(knz−ωt). (9) region 3 in this region, r2 < r < a1, the fields are ez(r,z) = (10) ∞∑ n=−∞ [cnl0(βnr) + dnk0(βnr)]e i(knz−ωt), bϕ(r,z) = (11) ∞∑ n=−∞ −iω c2βn [cnl1(βnr) −dnk1(βnr)]ei(knz−ωt). region 2 in the electron beam region, a1 < r < a2, the evanescent waves are the solution of eq. (5), and have the following forms ez(r,z) = (12) ∞∑ n=−∞ i rωε0επ {gn[rκ1nl1(κ1nr) + l0(κ1nr)] + fn[−rκ1nk1(κ1nr) + k0(κ1nr)]}ei(knz−ωt), bϕ(r,z) = (13) ∞∑ n=−∞ µ{gnl0(κ1nr) + fnk0(κ1nr)}ei(knz−ωt). where, κ1n = √ k2n −ω2επ/c2. 110007-3 papers in physics, vol. 11, art. 110007 (2019) / z. rezaei et al. regions 4 and 5 the slot openings (region 4) are filled with dielectric εr. the solutions of the wave equation in this region are ez(r,z) = ∞∑ m=0 em [ h0(τmr) (14) − h0(τmr1) g0(τmr1 g0(τmr) ] cos (mπ l z ) , bϕ(r,z) = ∞∑ m=0 em iωεr c2τm [ h́0(τmr) (15) − h0(τmr1) g0(τmr1 ǵ0(τmr) ] cos (mπ l z ) . 
where, h0(τmr) = { j0(τmr) l0(τ́mr) , g0(τmr) = { n0(τmr) k0(τ́mr) , (16) h́0(τmr) = { −j1(τmr) l1(τ́mr) , ǵ0(τmr) = { −n1(τmr) −k1(τ́mr) , (17) kz = mπ l , τm = √ εr ω2 c2 −k2z > 0, τ́m = √ k2z −εr ω2 c2 > 0. (18) as we assumed k0d < 2π, it is enough to keep just one mode (m = 0) in the slots, so the standing waves will be the fields in this part of the system. also, there is no field in the ideal metal of region 5. after applying the continuity conditions for the fields in the border of regions 1–2, 2–3 and 3–4, the dispersion relation will be as below r(ω,kn,ε π) = 0. (19) in which, r(ω,kn,ε π) = ∞∑ n=−∞ { −1 k2nβnd [2 − 2 cos(knl)] × [ h0(τ0r2) − h0(τ0r1) g0(τ0r1) g0(τ0r2) ] × αai1(βnr2) + αbk1(βnr2) αai0(βnr2) −αbk0(βnr2) (20) − εrl τ0 [ h́0(τ0r2) − h0(τ0r1) g0(τ0r1) ǵ0(τ0r2) ]} , αa = −α1α6 + α2α5, αb = −α3α6 + α4α5, α1 = k1(βna1) βnεπa1 [a1κ1ni1(κ1na1) + i0(κ1na1)] + k0(βna1)i0(κ1na1), (21) α2 = k1(βna1) βnεπa1 [−a1κ1nk1(κ1na1) + k0(κ1na1)] + k0(βna1)k0(κ1na1), (22) α3 = − i1(βna1) βnεπa1 [a1κ1ni1(κ1na1) + i0(κ1na1)] + i0(βna1)i0(κ1na1), (23) α4 = − i1(βna1) βnεπa1 [−a1κ1nk1(κ1na1) + k0(κ1na1)] + i0(βna1)k0(κ1na1), (24) α5 = k1(βna2) βnεπa2 [a2κ1ni1(κ1na2) + i0(κ1na2)] + k0(βna2)i0(κ1na2), (25) α6 = k1(βna2) βnεπa2 [−a2κ1nk1(κ1na2) + k0(κ1na2)] + k0(βna2)k0(κ1na2). (26) if there is no beam (α1 = α5,α2 = α6), the dispersion relation will become as r(ω0,k0, 1) = ∞∑ n=−∞ { 1 k2nβnd [2 − 2 cos(knl)] × [ h0(τ0r2) − h0(τ0r1) g0(τ0r1) g0(τ0r2) ] k1(βnr2) k0(βnr2) − εrl τ0 [ h́0(τ0r2) − h0(τ0r1) g0(τ0r1) ǵ0(τ0r2) ]} = 0. (27) this is similar to the dispersion relation in [14] and [16] in the limit of εr = 1. 110007-4 papers in physics, vol. 11, art. 110007 (2019) / z. rezaei et al. figure 2: (a) comparison of dispersion relations for different εrs (solid curves). the beam line with voltage 20 kev is also plotted for reference. the growth rate (δ) for each curve, near the intersection (the circles), is indicated by a dash line. (b) the effect of relative dielectric εr on the growth rate. ii. growth rate so far, the dispersion relation of the modes in this configuration has been derived. one of these modes can grow if it is in resonance with the electron beam. so, we assume that the frequency of this mode is ω = ωr + δ. then the taylor expansion of the dispersion relation about the synchronous point (ωr,kr) will become r(ω,kn,ε π) = r(ωr,kr, 1) (28) + (ω −ωr) ∂r(ω,kn,ε π) ∂ω ∣∣∣∣ (ωr,kr,1) + (επ − 1) ∂r(ω,kn,ε π) ∂επ ∣∣∣∣ (ωr,kr,1) . by assuming that δ is small, the equation below will be found ( r0x 2 − ω2p γ3 ŕεπ ) + (2xr0 + x 2ŕω)δ + 2xŕωδ 2 = 0. (29) in which, x = ωr −krv, ŕεπ = ∂r(ω,kn,ε π) ∂επ ∣∣∣∣ (ωr,kr,1) , ŕω = ∂r(ω,kn,ε π) ∂ω ∣∣∣∣ (ωr,kr,1) , r0 = r(ωr,kr, 1). (30) the growth will occur if δ, the solution of eq. (29) is imaginary and positive. iii. results and discussion by assuming no beam in the system, the dispersion relation is calculated by solving eq. (27) numerically. the grating parameters are as follows: r1 = 240 µm, r2 = 400 µm, l = 80 µm, d = 160 µm, a1 = 400 µm, a2 = 480 µm and the beam energy is 20 kev, corresponding to the parameters chosen by y. zhou et al. [16]. the effect of εr on the dispersion relation has been shown in fig. 2(a). it is clear that increasing the εr results in smaller height of the dispersion relation. this means that the modes are propagating with smaller velocities in the system. the intersection points of beamwave also move down. in this figure, the corresponding growth rate for each curve is indicated by dash lines (of the same color). 
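since eq. (29) is simply a quadratic in δ, the growth criterion (a root with positive imaginary part) is easy to test once the coefficients are known; the sketch below uses purely illustrative placeholder numbers, as the actual values of r0, ŕω and ŕεπ require evaluating and differentiating the full dispersion function of eq. (20).

```python
import numpy as np

def growth_rate(x, R0, dR_dw, dR_deps, wp2_over_g3):
    """solve eq. (29), a quadratic in delta, and return the root with the largest imaginary part."""
    a = 2.0 * x * dR_dw
    b = 2.0 * x * R0 + x**2 * dR_dw
    c = R0 * x**2 - wp2_over_g3 * dR_deps
    roots = np.roots([a, b, c])
    return roots[np.argmax(roots.imag)]

# placeholder coefficients, chosen only to show the sign test on im(delta)
delta = growth_rate(x=0.05, R0=1.0, dR_dw=0.8, dR_deps=-2.0, wp2_over_g3=0.5)
print("delta =", delta, "-> growing mode" if delta.imag > 0 else "-> no growth")
```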
maximum growth 110007-5 papers in physics, vol. 11, art. 110007 (2019) / z. rezaei et al. rates occur in the vicinity of the synchronous points (ωr = krv0), and have the values: 1.279, 1.129 and 1.065 for εr = 1, 1.5 and 2.7, respectively. the influence of dielectric on the growth rate is clearer in fig. 2(b), which indicates that dielectrics with higher relative permittivities cause smaller values for the growth rate. the grating parameters are considered in fig. 3, fig. 4 and fig. 5, when εr = 2.7. in fig. 3 the slot depth has been changed. as depth increases, the dispersion relation becomes flatter, indicating that the effect of grating is increasing. the normalized maximum growth rate happens when resonance between the beam and the modes is possible (the circles). so, in these points δ = 1.192, 1.066 and 0.493 when h = 100 µm, 160 µm and 250 µm, respectively. figure 4 indicates how dielectric thickness has an effect on growth rate. again, lower frequency modes result from increasing dielectric thickness. however, this time the growth rate will increase by this effect: δ = 0.959, 1.232 and 1.295 for l = 30 µm, 80 µm and 110 µm, respectively. the effect of beam thickness (∆) on growth rate is depicted in fig. 5. first, increasing ∆ causes the growth rate to rise. its maximum value is 1.066 at ∆ = 80 µm. this happens because more electrons can participate in the beam wave interaction. then the growth rate falls. this can be justified by the figure 3: comparison of dispersion relation for different grating heights, when εr = 2.7. the growth rate corresponding to each frequency curve is plotted by dash lines, and maximum values are indicated at the circle points. figure 4: comparison of the dispersion relations (solid lines) and the growth rates (dash lines) for different thicknesses of the dielectric εr = 2.7. fact that although the thickness is increasing, the electrons which are far from the grating contribute less to the interaction. iv. conclusions in this paper, a metallic cylindrical grating filled with a dielectric is proposed. the dispersion relation of the modes propagating in this configuration with an annular electron beam is derived. it is shown that the dielectric causes modes with smaller frequencies, in comparison with results when it is absent. then, the growth rate of modes which figure 5: dependence of the growth rate on beam thickness. εr = 2.7, r1 = 240 µm, r2 = 400 µm, l = 80 µm, d = 160 µm, a1 = 400 µm. 110007-6 papers in physics, vol. 11, art. 110007 (2019) / z. rezaei et al. are in resonance with the beam is considered. it is found that the growth rate is under the influence of dielectric relative permittivity εr, the depth of the slots of the grating and the thickness of the dielectric (the width of the slots). a lower growth rate is the result of increasing the parameters of the dielectric relative permittivity and slot depth, and decreasing the thickness of the dielectric. also, beam thickness can increase and decrease the growth rate, depending on its amount. as we can see, by changing the grating parameters, as well as dielectric permittivity and thickness, the growth rate and operating frequencies of the device can be controlled. so, it is possible to make sp–fels with the desired frequencies and powers. these results can be of considerable interest for thz wave source research. [1] b maraghechi, b farokhi, j e willett, theory of high-frequency waves in a coaxial plasma wave guide, phys. plasmas 6, 3778 (1999). 
[2] s j smith, e m purcell, visible light from localized surface charges moving across a grating, phys. rev. 92, 1069 (1953). [3] l schachter, a ron, smith–purcell freeelectron laser, phys. rev. a 40, 876 (1989). [4] j urata, m goldstein, m f kimmitt, a naumov, c platt, j e walsh, superradiant smith– purcell emission, phys. rev. lett. 80, 516 (1998). [5] y neo, h shimawaki, t matsumoto, h mimura, smith–purcell radiation from ultraviolet to infrared using a si field emitter, j. vac. sci. technol. b 24, 924 (2006). [6] k j kim, s b song, self-amplified spontaneous emission in smith–purcell free-electron lasers, nucl. instrum. methods phys. res. sect. a, 475, 158 (2001). [7] h l andrews, c a brau, gain of a smith– purcell free-electron laser, phys. rev. spec. top. accel. beams 7, 70701 (2004). [8] h p freund, t m abu-elfadl, linearized field theory of a smith–purcell traveling wave tube, ieee trans. plasma sci. 32, 1015 (2004). [9] d li, z yang, k imasaki, g-s park, particlein-cell simulation of coherent and superradiant smith–purcell radiation, phys. rev. spec. top. accel. beams 9, 040701 (2006). [10] d n klochkov, a i artemyev, k b oganesyan, y v rostovtsev, m o scully, c-k hu, the dispersion equation of the induced smith–purcell instability, phys. scr. t140, 014049 (2010). [11] m cao, w liu, y wang, k li, threedimensional theory of smith–purcell freeelectron laser with dielectric loaded grating, j. appl. phys. 116, 103104 (2014). [12] m cao, w liu, y wang, k li, dispersion characteristics of three dimensional dielectricloaded grating for terahertz smith–purcell radiation, phys. plasmas 21, 23116 (2014). [13] w liu, m cao, y wang, k li, start current of dielectric-loaded grating in smith–purcell radiation, phys. plasmas 23, 33104 (2016). [14] h p bluem, r jackson, j d jarvis, a m m todd, j gardelle, p modin, j t donohue, first lasing from a high-power cylindrical grating smith–purcell device, ieee trans. plasma sci. 43, 9 (2015). [15] s hasegawa, k ogura, t lwasaki, k yambe, smith–purcell radiation based on cylindrical surface waves, fusion sci. technol. 63, 259 (2017). [16] y zhou, y zhang, s liu, electron-beam-driven enhanced terahertz coherent smith–purcell radiation within a cylindrical quasi-optical cavity, ieee trans. thz sci. technol. 6, 262 (2016). 110007-7 papers in physics, vol. 3, art. 030002 (2011) received: 8 march 2011, accepted: 30 may 2011 edited by: l. viña reviewed by: r. gordon, department of electrical and computer engineering, university of victoria, british columbia, canada. licence: creative commons attribution 3.0 doi: 10.4279/pip.030002 www.papersinphysics.org issn 1852-4249 plasmon-enhanced second harmonic generation in semiconductor quantum dots close to metal nanoparticles pablo m. jais,1, 2∗ catalina von bilderling,1, 3† andrea v. bragas1, 2‡ we report the enhancement of the optical second harmonic signal in non-centrosymmetric semiconductor cds quantum dots, when they are placed in close contact with isolated silver nanoparticles. the intensity enhancement is about 1000. we also show that the enhancement increases when the incoming laser frequency ω is tuned toward the spectral position of the silver plasmon at 2ω, proving that the silver nanoparticle modifies the nonlinear emission. i. introduction second harmonic generation (shg) is the second order nonlinear process for which two photons with the same frequency ω interact simultaneously with matter to generate a photon of frequency 2ω. 
the nonlinear response of a material to an applied field becomes evident in the presence of intense fields, as the one given by a focused pulsed laser. due to symmetry requirements for second order nonlinear processes, the leading dipolar term of the shg is forbidden for centrosymmetric materials. however, this condition is relaxed at surfaces, ∗e-mail: jaisp@df.uba.ar †e-mail: catalina@df.uba.ar ‡e-mail: bragas@df.uba.ar 1 instituto de f́ısica de buenos aires, conicet, argentina. 2 laboratorio de electrónica cuántica, departamento de f́ısica, facultad de ciencias exactas y naturales, universidad de buenos aires, intendente güiraldes 2160, pabellón i ciudad universitaria, buenos aires, c1428eha, argentina. 3 centro de microscoṕıas avanzadas, facultad de ciencias exactas y naturales, universidad de buenos aires, argentina. where the symmetry is broken. on the other hand, for a small and perfectly spherical particle of any type of material under homogeneous illumination, the shg from the surface is also zero since the geometrical shape recovers the symmetry [1]. the presence of inhomogeneities in the electromagnetic field recovers the shg even for symmetric materials, both from bulk and surface, as pointed out by brudny et al. [1]. those inhomogeneities may be, for instance, the consequence of a strong focusing of the field, the proximity of a surface or the presence of another particle. besides, electronic resonances and enhancement of the field around the particle will increase the shg and, in many cases, would make it detectable. it is very well-known, as one of the main results of nano-optics, that the field around a metal nanoparticle is confined and enhanced when the incoming wavelength is resonant with the surface plasmons sustained by the nanoparticle. this effect is exploited in many different applications of metal nanoparticles: volume reducers for diffusion measurements [2], optical probes for high resolution imaging [3–5], biosensors [6], nanoheaters [7] and surface enhanced raman scattering (sers) [8, 9], among others. in the present paper, we show the enhancement 030002-1 papers in physics, vol. 3, art. 030002 (2011) / p. m. jais et al. figure 1: experimental setup. pmt: photomultiplier tube; m1 to m3: mirrors, l1 to l3: lenses. of the shg when non-centrosymmetric cds quantum dots are placed in close contact with silver nanoparticles. the plasmon of the nanoparticles is resonant at the second harmonic (sh) frequency and enhances the emission. as far as we know, there are few works on the enhancement of shg (either in semiconductor qds or in molecules) produced by the resonant excitation of metal nanoparticles [10–13]. ii. materials and methods the experiment was performed in transmission, by performing spectrally-resolved photon counting (fig. 1). the laser was a kmlabs tunable modelocked ti:sapphire with 50 fs pulse width, 400 mw average power, 80 mhz repetition rate and tunable in the range 770–805 nm. the resolution of the monochromator was 2 nm in all experiments. silver nanoparticles (nps), of average diameter 20 nm, were synthesized as in marchi et al. [14], while the cds quantum dots (qds), 3 nm in diameter, as in frattini et al. [15]. the absorption spectra for both nps and qds in solution are shown in fig. 2. the arrows mark the first and second excitonic transitions in the qds, at (362 ± 4) nm and (425 ± 15) nm, obtained by the second derivative method [16]. 
figure 2 also shows the two-photon photoluminescence (tppl) spectrum of the qds in solution, excited at λ = 780 nm and measured with the setup shown in fig. 1. a stokes shift was observed in the tppl, in accordance with the same effect reported for the (one 350 400 450 500 550 600 0.0 0.2 0.4 0.6 0.8 1.0 qd abs. qd tppl agnp ext. in te ns ity ( a. u. ) wavelength (nm) figure 2: absorption (red line) and two-photon photoluminescence, tppl (red squares), of the cds qds in ethanol solution. the dotted blue line is the extinction of the ag np in aqueous solution. photon) photoluminescence of cds qds [17]. np samples were prepared by placing a drop of the np solution on a glass substrate until the solvent evaporated. the average np concentration was ≈ 10 particles per µm2. however, optical and sem images revealed that the metal nps are inhomogeneously distributed on the substrate, with a high concentration of particles in the fringe (drop boundary). a cds qd sample was prepared according to the same protocol. however, the qd solution has a very high concentration of aminosilanes. they form a thick layer when dried, so the qds are immersed in an aminosilane matrix. the estimated height of this layer was ≈ 5 µm, and the average qd concentration was ≈ 105 particles per µm2 (inferred from afm images taken at a 1:1000 diluted concentration). for the mixed np-qd sample, we first dried a drop of the metal np solution, and then we deposited a drop of cds qd solution on top of it. the resulting structure is schematically shown in fig. 3. 030002-2 papers in physics, vol. 3, art. 030002 (2011) / p. m. jais et al. figure 3: diagram of the np-qd sample. iii. results and discussion for the silver np samples we did not detect sh signal above the noise level of the experiment (of about 200 counts per second). this result agrees with the fact that silver is a centrosymmetric material and the nps are almost spherical, even considering that their interaction should produce a weak sh signal. on the other hand, fig. 4a shows that the cds qd sample presents a small sh peak (as expected for a non-centrosymmetric material) and a strong tppl signal. the central wavelength of the excitation laser for this measurement was 784 nm, and its fwhm was 26 nm. however, a strong sh signal was measured in the mixed np-qd sample, as shown in fig. 4b. the shg was 20 times larger in this sample, and the sh/tppl ratio increased significantly. in addition, in the mixed sample, the tppl decayed after a few seconds of laser irradiation while the shg remained stable over time. in this case, the central wavelength was 799 nm, and the fwhm was also 26 nm. the observations can be attributed to the enhancement of the nonlinear field around the qds due to silver np plasmon resonance at 2ω.the incoming light at ω is not enhanced by the silver np, but creates a weak nonlinear signal at 2ω in the qds. this outgoing signal is enhanced by the field enhancement factor (η), so that the shg intensity is enhanced by η2. note that in the analogous case of sers, the incoming and outgoing signals are enhanced giving an intensity enhancement of η4 [18]. 350 400 450 500 0 10 20 30 40 x 5 tppl sh (a)cds qds in te ns ity ( 10 3 co un ts /s ) wavelength (nm) 400 450 500 550 tppl sh (b)ag np-cds qds figure 4: sh and tppl spectra for (a) the cds qd sample (multiplied by 5) and (b) the mixed ag np-cds qd sample. note that the sh peak was 20 times higher in the mixed sample, and the ratio between the sh and the tppl changes drastically. 
the magnitude of this enhancement is not easy to calculate since the mixed np-qd sample contains several hot-spots (usually in the concentrated np fringe), and their intensities were wildly different. nevertheless, the highest intensities recorded for the np-qd sample were ≈ 104 cts/s, while the highest ones for the qd sample were ≈ 1700 cts/s. to estimate the enhancement, we used the analytical enhancement factor (aef), defined as [19] aef = inp−qd/ρnp−qd iqd/ρqd , (1) where i and ρ are the shg intensity and the surface concentration of qds in the corresponding samples. we must take into consideration that there is a thick layer of qds that are not in the enhancement region of the nps, i.e., the measured signal in the np-qd sample is the sum of the intensity coming from qds close to the nps that are enhanced, and the intensity from qds far from the nps that are not enhanced. if we keep only the enhanced fraction, aef ≈ inp−qd − iqd iqd ρqd ρ̃np−qd , (2) 030002-3 papers in physics, vol. 3, art. 030002 (2011) / p. m. jais et al. where ρ̃np−qd is the concentration of qds near the nps. since the np enhancement decays one order of magnitude in 5 nm [20], we can take ρ̃np−qd ≈ ρqd 5 nm 5 µm . this gives aef ≈ 5 103. this is just an estimation of the order of magnitude of the enhancement, since the uncertainties in the thicknesses and maximum intensities are very high, but this estimation is consistent with the enhancements usually found in sers experiments [18]. to study the spectral dependence of the enhancement, and gain a deeper insight into the behavior of the system, the incoming laser light was tuned while keeping the laser always on the same spot. this measurement is shown in fig. 5, where the intensity is the photon flux at half the laser wavelength, normalized to the signal provided by a 40 fs pulse with 275 mw average power, according to the following formula: inorm = i δt 40 fs ( 275 mw p )2 . (3) the signal at the sh wavelength was, for each point of fig. 5, a local maximum in the spectrum. figure 5 shows that the sh from the qd sample increases toward the excitonic resonance in about a factor of 2 for the whole tuning range, a value similar to the one reported by baranov et al. [21]. it is worth noting that the contribution of the tppl to this increase is much smaller than the sh signal throughout the measured range (see fig. 4). showing a very different spectral response, the sh for the mixed np-qd system sharply increases in more than a factor of 7 close to the resonance of the silver plasmon. this measurement reinforces the hypothesis that the silver np is modifying the nonlinear response of the qds thanks to the resonant excitation of the silver np plasmons. however, it must be noted that the enhancement does not exactly follow the plasmon resonance spectrum shown in fig. 2. the reason for this is still unknown, but it is possible that the plasmon resonance was shifted due to the interaction among nps. unfortunately, the tuning range of the laser was much smaller than the spectral width of the plasmon resonance. 386 388 390 392 394 396 398 400 402 0 2 4 6 8 10 12 14 16 cds qds ag np-cds qds s h in te ns ity ( 10 3 co un ts /s ) 2 wavelength (nm) figure 5: shg as a function of the laser excitation wavelength. the horizontal scale is the second harmonic of the incident photon. red symbols show the shg for the qd sample, while the black symbols show the shg for the mixed np-qd sample. the dotted line marks the spectral position of the maximum of the plasmon resonance. iv. 
conclusions we have observed a strong enhancement (≈ 103) of the shg when cds qds are mixed with silver nps, compared with a cds qd sample. the spectral dependence of the enhancement shows that the np plasmons are resonantly excited by the sh emission of the qds. these measurements are evidence that the sh enhancement is mediated by nanoparticle plasmons. this effect can be applied to significantly improve traditional applications of sh measurements such as the study of surface deposition and orientation of molecules, among others. acknowledgements we thank claudia marchi for her help with the np synthesis and nora pellegri for providing the qds. we also thank alejandro fainstein for the insightful discussions. this work was supported by the university of buenos aires under grant x010 and by anpcyt under grant pict 14209. 030002-4 papers in physics, vol. 3, art. 030002 (2011) / p. m. jais et al. [1] v l brudny, b s mendoza, w l mochan, second-harmonic generation from spherical particles, phys. rev. b 62, 11152 (2000). [2] l c estrada, p f aramend́ıa, o e mart́ınez, 10000 times volume reduction for fluorescence correlation spectroscopy using nano-antennas, opt. express 16, 20597 (2008). [3] a f scarpettini, n pellegri, a v bragas, optical imaging with subnanometric vertical resolution using nanoparticle-based plasmonic probes, opt. commun. 282, 1032 (2009). [4] t kalkbrenner, m ramstein, j mlynek, v sandoghdar, a single gold particle as a probe for apertureless scanning near-field optical microscopy, j. microsc. 202, 72 (2001). [5] p anger, p bharadwaj, l novotny, enhancement and quenching of single-molecule fluorescence, phys. rev. lett. 96, 113002 (2006). [6] a j haes, r p van duyne, a nanoscale optical biosensor: sensitivity and selectivity of an approach based on the localized surface plasmon resonance spectroscopy of triangular silver nanoparticles, j. am. chem. soc. 124, 10596 (2002). [7] a g skirtach, et al., laser-induced release of encapsulated materials inside living cells, angew. chem. int. ed. 45, 4612 (2006). [8] p etchegoin, et al., new limits in ultrasensitive trace detection by surface enhanced raman scattering (sers), chem. phys. lett. 375, 84 (2003). [9] p g etchegoin, p d lacharmoise, e c le ru, influence of photostability on single-molecule surface enhanced raman scattering enhancement factors, anal. chem. 81, 682 (2009). [10] m ishifuji, m mitsuishi, t miyashita, enhanced optical second harmonic generation in hybrid polymer nanoassemblies based on coupled surface plasmon resonance of a gold nanoparticle array, appl. phys. lett. 89, 011903 (2006). [11] h a clark, et al., second harmonic generation properties of fluorescent polymer-encapsulated gold nanoparticles, j. am. chem. soc. 122, 10234 (2000). [12] s baldelli, et al., surface enhanced sum frequency generation of carbon monoxide adsorbed on platinum nanoparticle arrays, j. chem. phys 113, 5432 (2000). [13] i barsegova, et al., controlled fabrication of silver or gold nanoparticle near-field optical atomic force probes: enhancement of secondharmonic generation, appl. phys. lett. 81, 3461 (2002). [14] m c marchi, s a bilmes, g m bilmes, photophysics of rhodamine b interacting with silver spheroids, j. colloid interface sci. 218, 112 (1999). [15] a frattini, n pellegri, d nicastro, o de sanctis, effect of amine groups in the synthesis of ag nanoparticles using aminosilanes, mater. chem. phys. 94, 148 (2005). 
[16] a i ekimov, et al., absorption and intensitydependent photoluminescence measurements on cdse quantum dots: assignment of the first electronic transitions, j. opt. soc. am. b 10, 100 (1993). [17] z yu, et al., large resonant stokes shift in cds nanocrystals, j. phys. chem. b 107, 5670 (2003). [18] e c le ru, p g etchegoin, principles of surface-enhanced raman spectroscopy, elsevier science ltd, amsterdam (2008). [19] e c le ru, e blackie, m meyer, p g etchegoin, surface enhanced raman scattering enhancement factors: a comprehensive study, j. phys. chem. c 111, 13794 (2007). [20] s a maier, plasmonics: fundamentals and applications, chap. 5.1, springer science, new york (2007). [21] a v baranov, et al., resonant hyper-raman and second-harmonic scattering in a cds quantum-dot system, phys. rev. b 53, r1721 (1996). 030002-5 papers in physics, vol. 4, art. 040002 (2012) received: 29 august 2011, accepted: 29 february 2012 edited by: j-p. hulin licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.040002 www.papersinphysics.org issn 1852-4249 beltrami flow structure in a diffuser. quasi-cylindrical approximation rafael gonzález,1, 2∗ ricardo page,3 andrés s. sartarelli1 we determine the flow structure in an axisymmetric diffuser or expansion region connecting two cylindrical pipes when the inlet flow is a solid body rotation with a uniform axial flow of speeds ω and u, respectively. a quasi-cylindrical approximation is made in order to solve the steady euler equation, mainly the bragg–hawthorne equation. as in our previous work on the cylindrical region downstream [r gonzález et al., phys. fluids 20, 24106 (2008); r. gonzález et al., phys. fluids 22, 74102 (2010), r gonzález et al., j. phys.: conf. ser. 296, 012024 (2011)], the steady flow in the transition region shows a beltrami flow structure. the beltrami flow is defined as a field vb that satisfies ωb = ∇× vb = γvb, with γ = constant. we say that the flow has a beltrami flow structure when it can be put in the form v = uez + ωreθ + vb, being u and ω constants, i.e it is the superposition of a solid body rotation and translation with a beltrami one. therefore, those findings about flow stability hold. the quasi-cylindrical solutions do not branch off and the results do not depend on the chosen transition profile in view of the boundary conditions considered. by comparing this with our earliest work, we relate the critical rossby number ϑcs (stagnation) to the corresponding one at the fold ϑcf [j. d. buntine et al., proc. r. soc. lond. a 449, 139 (1995)]. i. introduction we have recently conducted studies on the formation of kelvin waves and some of their features when an axisymmetric rankine flow experiences a soft expansion between two cylindrical pipes [1, 2]. one of the significant characteristics of this phenomenon is that the downstream flow ∗e-mail: rgonzale@ungs.edu.ar 1 instituto de desarrollo humano, universidad nacional de general sarmiento, gutierrez 1150, 1613 los polvorines, pcia de buenos aires, argentina. 2 departamento de f́ısica fceyn, universidad de buenos aires, pabellón i, ciudad universitaria, 1428 buenos aires, argentina . 3 instituto de ciencias, universidad nacional de general sarmiento, gutierrez 1150, 1613 los polvorines, pcia de buenos aires, argentina. shows a rankine flow superposing a beltrami flow (beltrami flow structure [4])). yet, upstream and downstream cylindrical geometries were considered without taking into account the flow in the expansion. 
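the beltrami property invoked here, ∇ × vb = γvb, can be checked numerically for the z-independent, axisymmetric bessel-type field used in the cylindrical solutions of this paper; the amplitude and eigenvalue below are arbitrary, and the two non-trivial curl components are evaluated by finite differences.

```python
import numpy as np
from scipy.special import j0, j1

gamma, A = 2.0, 0.3              # beltrami eigenvalue and amplitude (illustrative values)
r = np.linspace(0.05, 1.0, 400)

v_theta = A * j1(gamma * r)      # azimuthal component
v_z     = A * j0(gamma * r)      # axial component (v_r = 0, no z-dependence)

# axisymmetric curl: (curl v)_theta = -d v_z / dr,  (curl v)_z = (1/r) d(r v_theta) / dr
curl_theta = -np.gradient(v_z, r)
curl_z = np.gradient(r * v_theta, r) / r

print("max |curl_theta - gamma v_theta| =", np.max(np.abs(curl_theta - gamma * v_theta)))
print("max |curl_z - gamma v_z|         =", np.max(np.abs(curl_z - gamma * v_z)))
```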
this work considered that the base upstream flow, formed by a vortex core surrounded by a potential flow, would have the same beltrami structure at the expansion and downstream. nevertheless, the flow at the expansion was not determined. however, it has been seen that this flow is only possible when no reversed flow is present and if its parameters do not take the values where a vortex breakdown appears [6–8]. the starting point in the study of the expansion flow is an axysimmetric steady state resulting from the bragg–hawthorne equation [7, 9–11] for both the vortex breakdown and the formation of waves. therefore, the solution behavior, whether it branches off or shows a possible stagnation point on the axis, will be deter040002-1 papers in physics, vol. 4, art. 040002 (2012) / r gonzález et al. minant to delimit both phenomena. our previous research focused on the formation of kelvin waves with a beltrami flow structure downstream [1–3], when the upstream flow was a rankine one. this present investigation considers only a solid body rotation flow with uniform axial flow at the inlet. as a first step in the study of the flow at the expansion, we only study the rotational flow. however, comparisons with our previous work [1] will be drawn. the aim of this present work is to obtain the steady flow structure at the expansion, considering a quasi-cylindrical approximation when the inlet flow is a solid body rotation with uniform axial flow of speeds ω and u, respectively. if a is the radius of the cylindrical region upstream, a relevant parameter is the rossby number ϑ = u ωa . thus, we would like to determine how this flow depends on the rossby number, on the geometrical parameters of the expansion and on the critical values of the parameters. we focus on finding the parameter values for which a stagnation point emerges on the axis, or for which the solution of the bragg– hawthorne equation branches off. we take them as the conditions for the vortex breakdown to develop. first, this paper presents the inlet flow and the corresponding bragg–hawthorne equation written for the transition together with the boundary conditions in section ii. second, it works on the quasi-cylindrical approximation for the bragg– hawthorne equation and its solution is developed in section iii. third, results and discussions are offered in section iv together with a comparison with our previous work [1]. finally, conclusions are presented in section v. ii. the bragg–hawthorne equation we assume an upstream flow in a pipe of radius a as an inlet flow in an axisymmetric expansion of length l connecting to another pipe with radius b, b > a. the inlet flow filling the pipe consists of a solid body rotation of speed ω with a uniform axial flow of speed u: v = uez + ωreθ, (1) u and ω being constants. the equilibrium flow in the whole region is determined by the steady euler equation which can be written as the bragg– hawthorne equation [10] ∂2ψ ∂z2 + r ∂ ∂r ( 1 r ∂ψ ∂r ) + r2 ∂h ∂ψ + c ∂c ∂ψ = 0, (2) where ψ is the defined stream function vr = − 1 r ∂ψ ∂z , vz = 1 r ∂ψ ∂r , (3) and h(ψ),c(ψ) are the total head and the circulation, respectively h(ψ) = 1 2 (v2r + v 2 θ + v 2 z) + p ρ , c(ψ) = rvθ. (4) to solve eq. (2), the boundary conditions must be established. these consist of giving the inlet flow, of being both the centerline and the boundary wall, streamlines, and of being the axial velocity positive (vz > 0). 
for the upstream flow, the stream function is ψ = 1 2 ur2, and h(ψ),c(ψ) are given by h(ψ) = 1 2 u2 + ωγψ, c(ψ) = γψ, (5) γ = 2u ω being the eigenvalue of the flow with beltrami structure [3]. thus, by considering the inlet flow, eqs. (5) are valid for the whole region. the second condition regarding the streamlines implies the following relations ψ(r = 0,z) = 0, ψ(r = σ(z),z) = 1 2 ua2, 0 ≤ z ≤ l (6) where r = σ(z) gives the axisymmetric profile of the pipe expansion. deducing from eq. (6), the boundary conditions are determined by the inlet flow. additionally, curved profiles are considered, so ∂ψ ∂z (r,z = l) = 0, 0 ≤ r ≤ b. (7) 040002-2 papers in physics, vol. 4, art. 040002 (2012) / r gonzález et al. iii. quasi-cylindrical approximation if we consider that ∂ 2ψ ∂z2 = 0, the solutions to eqs. (2) and (5) for the cylindrical regions are given by [10] ψ = 1 2 ur2 + arj1[γr], (8) where a is a constant. the quasi-cylindrical approximation consists of taking the dependence of a(z) on z but with the condition ∂ 2ψ ∂z2 ≈ 0 compared with the remaining terms of (2). the amplitude a(z) is then obtained by imposing the boundary conditions (6) which depend on the wall profile r = σ(z), giving a(z) = 1 2 u ( a2 −σ2(z) ) σ(z)j1[γσ(z)] . (9) by using the dimensionless quantities r̃ = r a , z̃ = z a , ṽ = v u the stream function in the quasicylindrical approximation can be written as ψ̃ = 1 2 r̃2 + ã(z̃)r̃j1[ 2 ϑ r̃], ã(z̃) = 1 2 ( 1 − σ̃2(z̃) ) σ̃(z̃)j1[ 2 ϑ σ̃(z̃)] , (10) where ϑ = u ωa is the rossby number. hence the velocity field becomes ṽr(r̃, z̃) = −ã ′ (z̃)j1[ 2 ϑ r̃] (11) ṽθ(r̃, z̃) = 1 ϑ r̃ + 2 ϑ ã(z̃)j1[ 2 ϑ r̃] (12) ṽz(r̃, z̃) = 1 + 2 ϑ ã(z̃)j0[ 2 ϑ r̃], (13) where ã ′ (z̃) = dã(z̃)/dz̃. finally, it is necessary to give the wall profile σ̃(z) to completely determine the flow. two kinds of profiles were seen: iconical profile σ̃(z̃) = 1 + ( η − 1 l̃ ) z̃, 0 ≤ z̃ ≤ l̃ and η = b a . (14) iicurved profile σ̃(z̃) = 1 + η 2 − ( η − 1 2 ) cos ( πz̃ l̃ ) , 0 ≤ z̃ ≤ l̃. (15) the latter meets the boundary condition (7) as well. therefore, eqs. (11-15) together with the boundary conditions (6,7) allow to determine the flow structure for both the conical and curved wall profile. iv. results and discussion we note that the flow keeps a beltrami flow structure in the quasi-cylindrical approximation. effectively, giving (11-13) ṽr(r̃, z̃) = ṽbr(r̃, z̃) (16) ṽθ(r̃, z̃) = 1 ϑ r̃ + ṽbθ(r̃, z̃) (17) ṽz(r̃, z̃) = 1 + ṽbz(r̃, z̃), (18) it is easy to see that under this approximation ∇× vb(r̃, z̃) = 2 ϑ vb(r̃, z̃) and so, the whole flow is the sum of a solid body rotation flow with a uniform axial flow plus a beltrami flow, given the latter in a system with uniform translation velocity u = 1.ẑ and uniform rigid rotation velocity v = 1 ϑ r̃θ̂. given the flow field and its structure, the parameters are considered by evaluating the behavior of ṽz(r̃, z̃0) with z̃0 = l̃ i.e., taken at outlet, and with l̃ = 1. in order to do so, a wall profile is selected (14 or 15) and three different values of the expansion parameter are taken, mainly η1 = 1.1, η2 = 1.2 and η3 = 1.3. the first step is to analyze the flow dependence on the rossby number. in fig. 1, the contour flows corresponding to the conical and curved profiles for η1 = 1.1, ϑ1 = 0.695 are shown. graphics in fig. 2 represent the same configuration but for ϑ = 0.68 < ϑ1. the broken lines represent points for which ṽz = 0. inflow and recirculation are present but it is not a real flow because the model fails when considering inflow. 
figure 1: contour flow in the transition region for conical and curved profiles for $\eta_1 = 1.1$, $\vartheta_1 = 0.695$.

figure 2: contour flow in the transition region for conical and curved profiles for $\eta_1 = 1.1$, $\vartheta = 0.68$. the broken lines represent points with $\tilde v_z = 0$.

it can be seen that for $\vartheta_1 = 0.695$, $\tilde v_z = 0$ at the outlet, on the axis. for rossby numbers with $\vartheta \ge \vartheta_c$, the azimuthal flow vorticity is negative ($\omega_\phi < 0$), resulting in an increase of the axial velocity with the radius, so that it has a minimum on the axis, where the stagnation point appears [6]. therefore, the critical rossby number $\vartheta_c$ can be defined as the value for which $\tilde v_z$ is zero at the outlet on the axis, i.e., where the flow shows a stagnation point. this is the necessary condition to produce a vortex breakdown [6]. we find the same critical rossby number for both wall profiles, and so we will not treat them separately from now on. the critical rossby values for $\eta_2 = 1.2$ and $\eta_3 = 1.3$ are $\vartheta_2 = 0.869$ and $\vartheta_3 = 1.052$, respectively. given the previous analysis, the second step is to show the behavior of $\tilde v_z$ on the axis at the outlet as a function of $\vartheta$ for each $\eta$, in order to study the existence of folds in the rossby number-continuation parameter (equivalent to the swirl parameter in [5, 7, 11]); indeed, we have seen that $\tilde v_z$ has its minimum on the axis. besides, using eq. (13) at $r = 0$, it is easy to see that $\tilde v_z$ decreases with $z$, so that it reaches its minimum at the outlet while remaining $\tilde v_z \ge 0$. in fig. 3, the radial dependence of $\tilde v_z$ at the outlet is plotted for $\eta_1$, $\eta_2$, $\eta_3$, together with its variation when $\vartheta$ is slightly shifted from $\vartheta_1$. in fig. 4, it can be seen that the minimum of $\tilde v_z$ on the axis increases with $\vartheta$, so there is no fold of $\tilde v_{z\min}$ as defined by buntine and saffman in a similar approximation [5].

figure 3: (a) $\tilde v_z$ at the outlet as a function of $r$ for $\eta_1$, $\eta_2$, $\eta_3$ and the corresponding critical rossby numbers $\vartheta_1$, $\vartheta_2$, $\vartheta_3$. (b) $\tilde v_z$ at the outlet as a function of $r$ for $\vartheta_1$ and for values of $\vartheta$ slightly shifted from $\vartheta_1$. in each case, the minimum of $\tilde v_z$ is reached on the axis.

the dependence of the results on $L$ is analyzed next. when $z = L$ in eqs. (14) and (15), $\tilde\sigma(\tilde L) = \eta$ is obtained. replacing this in eq. (13) for $z = L$ and $r = 0$ gives

$$\tilde v_{z\min} = 1 + \frac{1 - \eta^2}{\vartheta\,\eta\,J_1\!\left[\frac{2}{\vartheta}\eta\right]}, \qquad (19)$$

and so $\vartheta_c$ is obtained as a function of $\eta$ by solving this equation for $\tilde v_{z\min} = 0$, as shown in fig. 5. this result may seem surprising, but it is not so if it is considered as derived from the quasi-cylindrical approximation: the dependence of the flow on $z$ is obtained through the boundary conditions expressed by eq. (6), and, at the same time, these boundary conditions depend on the inlet flow and on the parameter $\eta$.

figure 4: $\tilde v_z$ at the outlet on the axis as a function of the rossby number $\vartheta$ for $\eta_1 = 1.1$, $\eta_2 = 1.2$, $\eta_3 = 1.3$. here $\vartheta_1 = 0.695$, $\vartheta_2 = 0.869$ and $\vartheta_3 = 1.052$ correspond to stagnation points.
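since eq. (19) depends only on $\vartheta$ and $\eta$, the critical rossby number can be obtained by a one-dimensional root search. the sketch below (again ours, with illustrative names; it assumes scipy's brentq and restricts the search to the branch on which the argument of $J_1$ stays below its first zero) should reproduce, approximately, the values $\vartheta_1 = 0.695$, $\vartheta_2 = 0.869$ and $\vartheta_3 = 1.052$ quoted above.

```python
from scipy.special import j1
from scipy.optimize import brentq

J1_FIRST_ZERO = 3.8317059702  # first positive zero of the Bessel function J1

def v_z_min(theta, eta):
    """Minimum of the axial velocity (on the axis, at the outlet), Eq. (19)."""
    return 1.0 + (1.0 - eta**2) / (theta * eta * j1(2.0 * eta / theta))

def critical_rossby(eta, theta_max=2.0):
    """Root of v_z_min(theta) = 0 on the branch 2*eta/theta < J1_FIRST_ZERO,
    where v_z_min increases with theta (cf. fig. 4)."""
    theta_min = 2.0 * eta / J1_FIRST_ZERO + 1e-3  # just above the pole of Eq. (19)
    return brentq(lambda th: v_z_min(th, eta), theta_min, theta_max)

for eta in (1.1, 1.2, 1.3):
    print(eta, round(critical_rossby(eta), 3))  # expected: ~0.695, ~0.869, ~1.052
```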
that the flow depends only on the inlet flow and on the parameter $\eta$ explains the fact that the same results have been obtained for both conical and curved profiles, and that the condition given by eq. (7) at the outlet has not influenced them.

figure 5: critical rossby number $\vartheta_c$ as a function of $\eta$.

differences with batchelor's seminal work should be noted [10]. mainly, he works in cylindrical geometry and does not consider the dependence of the flow on $z$; we introduce this $z$ dependence through the quasi-cylindrical approximation. this, therefore, allows us to find the structure of the flow in the transition, together with the critical rossby number defined by considering this structure and by showing that the minimum of $\tilde v_z$ is reached at the outlet on the axis. nevertheless, once the flow reaches the pipe downstream, the analysis coincides because, as shown, the problem depends on the inlet flow and on the expansion parameter $\eta$. this allows us to consider the issue of the vortex core, which we have not considered at the inlet flow. since we know the structure of the flow in the downstream cylindrical region [1], and by assuming a quasi-cylindrical approximation for the vortex core in the transition region, the minimum of $v^{\rm core}_z$ at the outlet on the axis is given by

$$v^{\rm core}_{z\min} = 1 + \frac{1 - \hat\eta^2}{\hat\vartheta\,\hat\eta\,J_1\!\left[\frac{2}{\hat\vartheta}\hat\eta\right]}, \qquad (20)$$

where $\hat\vartheta = \vartheta/\iota$, $\hat\eta = \xi/\iota$, and $\xi$ and $\iota$ are the dimensionless radii of the core downstream and upstream, respectively. we note that $\hat\eta$ is the expansion parameter of the core. hence eqs. (19) and (20) have the same structure. in the present work, we have not found any fold in the rossby number-continuation parameter of $\tilde v_z$, as found in our previous work [1], where the fold was associated with a critical rossby number called $\vartheta_{cf}$ by buntine and saffman [5]. as we have already done, we define the critical rossby number for which $v^{\rm core}_{z\min} = 0$, where there is a stagnation point, and we will call it $\vartheta_{cs}$. in [1], for $\iota = 0.272$ and pipe expansion parameters $\eta_1$, $\eta_2$, $\eta_3$, the values of $\vartheta_{cf}$ were 0.35, 0.44 and 0.53, respectively, while the core expansion parameters $\hat\eta$ were 1.25, 1.47 and 1.65, respectively. replacing these values in eq. (20) with $v^{\rm core}_{z\min}$ set to zero, we get the corresponding $\hat\vartheta_{cs}$ and then $\vartheta_{cs}$ for the vortex core: these are, respectively, 0.26, 0.38 and 0.49. that is to say, in all the cases we have $\vartheta_{cs} < \vartheta_{cf}$; therefore, at the fold, $\tilde v_z > 0$. this coincides with the results found by buntine and saffman [5] in their analysis using a three-parameter family of inlet flows.

v. conclusions

the main conclusions drawn from the previous sections are:

1. in the quasi-cylindrical approximation, the steady flow in the transition (expansion) region corresponding to a solid body rotation with uniform axial flow as inlet flow has the same beltrami flow structure as in the pipe downstream, which is compatible with the boundary conditions. therefore, findings from our previous work on stability [1-3] can hold.

2. for fixed values of $\eta$ and $\vartheta \ge \vartheta_c$, $\omega_\phi < 0$, and then $\tilde v_z$ in the transition region is an increasing function of $r$ and a decreasing function of $z$, reaching its minimum on the axis at the outlet.

3. for fixed values of $\eta$, the minimum of $\tilde v_z$ on the axis is an increasing function of $\vartheta$ (fig. 4), where the stagnation point corresponds to $\vartheta_c$.

4. as a consequence, no branching off takes place for the solutions of the bragg–hawthorne equation.

5.
the critical rossby number ϑc corresponding to stagnation is an increasing function of η (fig. 5). 6. the whole picture can be reached by putting together these results with those obtained in [1], where there is a branching owing to the boundary conditions at the frontier between the vortex and the irrotational flow. moreover, since the results in [1] for the rotational flow depend on the inlet flow as well as on the rotational expansion parameter η̂ defined in eq. (20), given a quasi-cylindrical approximation, it can be concluded that this expression is the minimum of vz in the core. therefore, we can get the critical rossby number ϑcs and compare it with that corresponding to the fold ϑcf . this present work verifies that ϑcs < ϑcf , in accordance with buntine and saffman’s results [5]. 7. in the quasi-cylindrical approximation, previous results do not depend on the chosen profile. this can be explained by the boundary conditions chosen depending on the inlet flow and on the parameter expansion. acknowledgements we would like to thank unversidad nacional de general sarmiento for its support for this work, and our colleague gabriela di 040002-6 papers in physics, vol. 4, art. 040002 (2012) / r gonzález et al. gesú for her advice on the english version of this paper. [1] r gonzález, g sarasúa, a costa, kelvin waves with helical beltrami flow structure, phys. fluids 20, 24106 (2008). [2] r gonzález, a costa, e s santini, on a variational principle for beltrami flows, phys. fluids 22, 74102 (2010). [3] r gonzález, e s santini, the dynamics of beltramized flows and its relation with the kelvin waves, j. phys.: conf. ser. 296, 012024 (2011). [4] the beltrami flow is defined as a field vb that satisfies ωb = ∇ × vb = γvb, with γ = constant. we say that the flow has a beltrami flow structure when it can be put in the form v = uez + ωreθ + vb, being u and ω constants, i.e it is the superposition of a solid body rotation and translation with a beltrami one. for a potential flow γ = 0. [5] j d buntine, p g saffman, inviscid swirling flows and vortex breakdown, proc. r. soc. lond. a 449, 139 (1995). [6] g l brown, j m lopez, axisymmetric vortex breakdown part 2. physical mechanisms, j. fluid mech. 221, 573 (1990). [7] b benjamin, theory of the vortex breakdown phenomenon, j. fluid mech. 14, 593 (1962). [8] r guarga, j cataldo, a theoretical analysis of symmetry loss in high reynolds swirling flows, j. hydraulic res. 31, 35 (1993). [9] s l bragg, w r hawthorne, some exact solutions of the flow through annular cascade actuator discs, j. aero. sci. 17, 243 (1950). [10] g k batchelor, an introduction to fluids dynamics, cambridge university press, cambridge (1967). [11] s v alekseenko, p a kuibin, v l okulov, theory of concentrated vortices. an introduction, springer-verlag, berlin heidelberg (2007). 040002-7 papers in physics, vol. 1, art. 010002 (2009) received: 6 july 2009, accepted: 2 september 2009 edited by: m. c. barbosa reviewed by: h. fort (universidad de la república, uruguay) licence: creative commons attribution 3.0 doi: 10.4279/pip.010002 www.papersinphysics.org issn 1852-4249 a note on the consensus time of mean-field majority-rule dynamics damián h. zanette1∗ in this work, it is pointed out that in the mean-field version of majority-rule opinion dynamics, the dependence of the consensus time on the population size exhibits two regimes. this is determined by the size distribution of the groups that, at each evolution step, gather to reach agreement. 
when the group size distribution has a finite mean value, the previously known logarithmic dependence on the population size holds. on the other hand, when the mean group size diverges, the consensus time and the population size are related through a power law. numerical simulations validate this semi-quantitative analytical prediction. much attention has been recently paid, in the context of statistical physics, to models of social processes where ordered states emerge spontaneously out of disordered initial conditions (homogeneity from heterogeneity, dominance from diversity, consensus from disagreement, etc.) [1]. not unexpectedly, many of them are adaptations of well-known models for coarsening in interacting spin systems, whose dynamical rules are reinterpreted in the framework of social-like phenomena. the voter model [2, 3] and the majority rule model [4, 5] are paradigmatic examples. in the latter, consensus in a large population is reached by accumulative agreement events, each of them involving just a group of agents. the present note is aimed at briefly revisiting previous results on the time needed to reach consensus in majority-rule dynamics, stressing the role of the size distribution of the involved groups. it is found that the growth of the consensus time with the population size shows ∗e-mail: zanette@cab.cnea.gov.ar 1 consejo nacional de investigaciones cient́ıficas y técnicas, centro atómico bariloche and instituto balseiro, 8400 san carlos de bariloche, ŕıo negro, argentina. distinct behaviors depending on whether the mean value of the group size distribution is finite or not. consider a population of n agents where, at any given time, each agent has one of two possible opinions, labeled +1 and −1. at each evolution step, a group of g agents (g odd) is selected from the population, and all of them adopt the opinion of the majority. namely, if i is one of the agents in the selected group, its opinion si changes as si → sign ∑ j sj, (1) where the sum runs over the agents in the group. of course, only the agents, not the majority, effectively change their opinion. in the mean-field version of this model, the g agents selected at each step are drawn at random from the entire population. it is not difficult to realize that the mean-field majority-rule (mfmr) dynamics is equivalent to a random walk under the action of a force field. for a finite-size population, this random walk is moreover subject to absorbing boundary conditions. think, for instance, of the number n+ of agents with opinion +1. as time elapses, n+ changes randomly, with transition probabilities that depend on n+ itself, until it reaches one of the extreme values, 010002-1 papers in physics, vol. 1, art. 010002 (2009) / d. h. zanette n+ = 0 or n. at this point, all the agents have the same opinion, the population has reached full consensus, and the dynamics freezes. in view of this overall behavior, a relevant quantity to characterize mfmr dynamics in finite populations is the consensus time, i.e. the time needed to reach full consensus from a given initial condition. in particular, one is interested in determining how the consensus time depends on the population size n. the exact solution for three-agent groups (g = 3) [5] shows that the average number of steps needed to reach consensus, sc, depends on n as sc ∝ n log n, (2) for large n. the proportionality factor depends in turn on the initial unbalance between the two opinions all over the population. 
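a minimal python sketch of this dynamics (ours, not the code used in the paper; the function name, the fixed group size $g = 3$ and the population sizes are illustrative) applies the update of eq. (1) to groups drawn at random from the whole population and counts the steps to full consensus; for large $n$, the printed ratio $s/(n\ln n)$ should be roughly constant, in line with eq. (2).

```python
import numpy as np

rng = np.random.default_rng(0)

def consensus_steps(n, g=3):
    """One realization of mean-field majority-rule dynamics: at each step,
    g randomly chosen agents adopt the sign of their group sum (Eq. 1)."""
    s = rng.permutation(np.repeat([1, -1], n // 2))  # balanced initial condition (n even)
    steps = 0
    while abs(s.sum()) != n:                         # loop until full consensus
        group = rng.choice(n, size=g, replace=False)
        s[group] = np.sign(s[group].sum())           # g is odd, so the sum is never zero
        steps += 1
    return steps

for n in (100, 200, 400, 800):
    mean_steps = np.mean([consensus_steps(n) for _ in range(100)])
    print(n, mean_steps, mean_steps / (n * np.log(n)))  # last column roughly constant
```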
the analogy of mfmr dynamics with random walks suggests that this result should also hold for other values of the group size $g$, as long as $g$ is smaller than $n$. this can be easily verified by solving a rate equation for the evolution of $n_+$ [1]. numerical results and semi-quantitative arguments [6] show that eq. (2) is still valid if, instead of being constant, the value of $g$ is uniformly distributed over a finite interval. what would happen, however, if, at each step, $g$ is drawn from a probability distribution $p_g$ that allows for values larger than the population size? if, at a given step, the chosen group size $g$ is equal to or larger than $n$, full consensus will be instantly attained and the evolution will cease. in the random-walk analogy, this step would correspond to a single long jump taking the walker to one of the boundaries. is it possible that, for certain forms of the distribution $p_g$, these single large-$g$ events could dominate the attainment of consensus? if so, how is the $n$-dependence of the consensus time modified? to answer these questions, assume that $g$ is drawn from a distribution which, for large $g$, decays as

$$p_g \sim g^{-\gamma}, \qquad (3)$$

with $\gamma > 1$. tuning the exponent $\gamma$ of this power-law distribution, large values of $g$ may become sufficiently frequent as to control the consensus dynamics. the probability that at the $s$-th step the selected group size is $g \ge n$, while in all preceding steps $g < n$, reads

$$P_s = \left(\sum_{g=g_{\min}}^{n-1} p_g\right)^{s-1} \sum_{g=n}^{\infty} p_g, \qquad (4)$$

where $g_{\min}$ is the minimal value of $g$ allowed for by the distribution $p_g$. the average waiting time (in evolution steps) for an event with $g \ge n$ is thus

$$s_w = \sum_{s=1}^{\infty} s P_s = \left(\sum_{g=n}^{\infty} p_g\right)^{-1} \propto n^{\gamma-1}, \qquad (5)$$

where the last relation holds for large $n$ when $p_g$ verifies eq. (3). compare now eqs. (2) and (5). for $\gamma > 2$ (respectively, $\gamma \le 2$) and asymptotically large population sizes, one has $s_w \gg s_c$ (respectively, $s_w \ll s_c$). this suggests that above the critical exponent $\gamma_{\rm crit} = 2$, the attainment of consensus will be driven by the asymptotic random-walk features that lead to eq. (2). for smaller exponents, on the other hand, consensus will be reached by the occurrence of a large-$g$ event, in which the whole population is entrained at a single evolution step. note that $\gamma_{\rm crit}$ stands at the boundary between the domain for which the mean group size is finite ($\gamma > \gamma_{\rm crit}$) and the domain where it diverges ($\gamma < \gamma_{\rm crit}$). in order to validate this analysis, numerical simulations of mfmr dynamics have been performed for population sizes ranging from $10^2$ to $10^5$. the probability distribution for the group size $g$ has been introduced as follows. first, define $g = 2G + 1$; choosing $G = 1, 2, 3, \ldots$ ensures that the group size is odd and $g \ge 3$. then, take for $G$ the probability distribution

$$p_G = \frac{1}{\zeta(\gamma)}\,G^{-\gamma}, \qquad (6)$$

where $\zeta(z)$ is the riemann zeta function. with this choice, $p_g$ satisfies eq. (3). the average waiting time for a large-$g$ event, given by eq. (5), can then be written exactly as

$$s_w = \frac{\zeta(\gamma)}{\zeta(\gamma, 1 + n/2)}, \qquad (7)$$

where $\zeta(z, a)$ is the generalized riemann (or hurwitz [7]) zeta function. in the numerical simulations, both opinions were equally represented in the initial condition. the total number of steps needed to reach full consensus, $s$, was recorded and averaged over series of $10^2$ to $10^6$ realizations (depending on the population size $n$).

figure 1: numerical results for the number of steps needed to reach consensus, $s$, normalized by the population size $n$, as a function of $n$, for three values of the exponent $\gamma$.
the straight dotted lines emphasize the validity of eq. (2) for γ = 2.5 and 3. for γ = 2 the line is horizontal, suggesting s ∝ n. the two upper data sets in fig. 1 show the ratio s/n for two values of the exponent γ > γcrit. since the horizontal scale is logarithmic, a linear dependence in this graph corresponds to the proportionality given by eq. (2). dotted straight lines illustrate this dependence. for these values of γ, therefore, the relation between the consensus time and the population size coincides with that of the case of constant g. for the lowest data set, which corresponds to γ = γcrit, the relation ceases to hold. the horizontal dotted line suggests that now s ∝ n, as predicted for γ = 2 by eq. (5). the log-log plot of fig. 2 shows the number of steps to full consensus as a function of the population size for three exponents γ ≤ γcrit. the dotted straight line has unitary slope, representing the proportionality between s and n for γ = 2. for lower exponents, the full curves are the graphic representation of sw as given by eq. (5). the excellent figure 2: number of steps needed to reach consensus as a function of the population size, for three values of the exponent γ. the slope of the straight dotted line equals one. full curves correspond to the function sw given in eq. (7). agreement between sw and the numerical results for s demonstrates that, for these values of γ, the consensus time in actual realizations of the mfmr process is in fact dominated by large-g events. figure 3: fraction of realizations where consensus is attained through a large-g event as a function of the population size, for several values of the exponent γ. a further characterization of the two regimes of consensus attainment is given by the fraction of realizations where consensus is reached through a 010002-3 papers in physics, vol. 1, art. 010002 (2009) / d. h. zanette large-g event. this is shown in fig. 3 as a function of the population size. for γ < γcrit, consensus is the result of a step involving the whole population in practically all realizations. as n grows, the frequency of such realizations increases as well. the opposite behavior is observed for γ > γcrit. for the critical exponent, meanwhile, the fraction of largeg realizations is practically independent of n, and fluctuates slightly around 0.57. in summary, it has been shown here that in majority-rule opinion dynamics, the dependence of the consensus time on the population size exhibits two distinct regimes. if the size distribution of the groups of agents selected at each evolution step decays fast enough, one reobtains the logarithmic analytical result for constant group sizes. if, on the other hand, the distribution of group sizes decays slowly, as a power law with a sufficiently small exponent, the dependence of the consensus time on the population size is also given by a power law. the two regimes are related to two different mechanisms of consensus attainment: in the second case, in particular, consensus is reached during events which involve the whole population at a single evolution step. the logarithmic regime occurs when the mean group size is finite, while in the power-law regime the mean value of the distribution of group sizes diverges. in connection with the random-walk analogy of majority-rule dynamics, this is reminiscent of the contrasting features of standard and anomalous diffusion [8]. [1] c castellano, s fortunato, v loreto, statistical physics of social dynamics, rev. mod. phys. 81, 591 (2009). 
[2] m scheucher, h spohn, a soluble kinetic model for spinodal decomposition, j. stat. phys. 53, 279 (1988). [3] p l krapivsky, kinetics of a monomermonomer model of heterogeneous catalysis, phys. rev. a 45, 1067 (1992). [4] s galam, minority opinion spreading in random geometry, eur. phys. j. b 25, 403 (2002). [5] p l krapivsky, s redner, dynamics of majority rule in two-state interacting spin systems, phys. rev. lett. 90, 238701 (2003). [6] c j tessone, r toral, p amengual, h s wio, m san miguel, neighborhood models of minority opinion spreading, eur. phys. j. b 39, 535 (2004). [7] j spanier, k b oldham, the hurwitz function ζ(ν; u), in: an atlas of functions, pag. 653 hemisphere, washington, dc (1987). [8] u frisch, m f shlesinger, g zaslavsky, eds. lévy flights and related phenomena in physics, springer, berlin (1995). 010002-4 papers in physics, vol. 5, art. 050002 (2013) received: 9 october 2012, accepted: 1 march 2013 edited by: g. mindlin licence: creative commons attribution 3.0 doi: http://dx.doi.org/10.4279/pip.050002 www.papersinphysics.org issn 1852-4249 a mathematically assisted reconstruction of the initial focus of the yellow fever outbreak in buenos aires (1871) m l fernández,1 m otero,2 n schweigmann,3 h g solari2∗ we discuss the historic mortality record corresponding to the initial focus of the yellow fever epidemic outbreak registered in buenos aires during the year 1871 as compared to simulations of a stochastic population dynamics model. this model incorporates the biology of the urban vector of yellow fever, the mosquito aedes aegypti, the stages of the disease in the human being as well as the spatial extension of the epidemic outbreak. after introducing the historical context and the restrictions it puts on initial conditions and ecological parameters, we discuss the general features of the simulation and the dependence on initial conditions and available sites for breeding the vector. we discuss the sensitivity, to the free parameters, of statistical estimators such as: �nal death toll, day of the year when the outbreak reached half the total mortality and the normalized daily mortality, showing some striking regularities. the model is precise and accurate enough to discuss the truthfulness of the presently accepted historic discussions of the epidemic causes, showing that there are more likely scenarios for the historic facts. i. introduction yellow fever (yf) is a disease produced by an arthropod borne virus (arbovirus) of the family �aviviridae and genus flavivirus. the arthropod vector can be one of several mosquitoes and the usual hosts are monkeys and/or people. wild mosquitoes of genus haemagogus, sabetes and aedes are responsible for the transmission of the ∗e-mail: solari@df.uba.ar 1 departamento de computación, facultad de ciencias exactas y naturales (fcen), universidad de buenos aires (uba) and conicet. intendente güiraldes 2160, ciudad universitaria, c1428ega buenos aires, argentina. 2 departamento de física, fcen�uba and ifiba� conicet. 1428 buenos aires, argentina. 3 departamento de ecología, genética y evolución, fcen�uba and iegeba�conicet. 1428 buenos aires, argentina. virus among wild monkeys, such as the brown howler monkey (alouatta guariba) associated to recent outbreaks of yf in brazil, paraguay and argentina [1]). in contrast, urban yf is transmitted by a domestic and anthropophilic mosquito, aedes aegypti, human beings being the host [2]. 
aedes aegypti is a tree hole mosquito, with origins in africa, that has been dispersed through the world thanks to its association with people. during the end of the xviii and the xix centuries, yf caused large urban outbreaks in the americas from boston (1798), new york (1798) and philadelphia (1793, 1797, 1798, 1799) in the north [3] to montevideo (1857) and buenos aires (1858, 1870, 1871) [4] in the south. these historical episodes arise as ideal cases for testing the capabilities of yf models in urban settings. is it possible to reconstruct the evolution of one of these epidemic outbreaks? can enough information be recovered to produce a thorough test on the models? this 050002-1 papers in physics, vol. 5, art. 050002 (2013) / m l fernández et al. is seldom the case, for example, for the study of the memphis (1878) epidemic, with over 10000 casualties, only 1965 were considered potentially usable [5]. in contrast, the records of the outbreak in buenos aires 1871, unearthed and digitized for this work, left us with an amount of 1274 death cases located in time and space for the initial focus in the quarter of san telmo, about 78% of the total mortality in the quarter [6]. according to the 1869 national census [7] buenos aires had 177787 inhabitants, 12329 of them living in san telmo, about half of them just immigrated into the country mostly from europe. in this work, we will compare the initial development of the epidemic outbreak (buenos aires, 1871) with the simulations resulting from an ecoepidemiological model developed in refs. [8�10], testing the worth of the predictive model. the simulations were performed under a number of assumptions, most of them essentially forced by the lack of better information. we will assume that: 1. now and before, yf is the same illness, i.e., we can use current information on yf development such as: the average extent of the incubation, infection, recovery, and toxic periods, as well as the mortality level in 1871. in other words, the virus presents no substantial changes since 1871 to present days. we do not expect this hypothesis to be completely correct: the yf virus is an rna-virus as opposed to the stable dna-viruses, as such, mutations in about 140 years of continuous replications in mosquitoes and primates can hardly be ruled out. furthermore, present-day yf has been subject to di�erent evolutionary pressures than the yf in the xix century. while in the xix century yellow fever circulated continuously in human populations, today the wild part of the cycle involving wild populations of monkeys plays a substantial role. 2. the epidemic was transmitted by aedes aegypti. there is no evidence of this fact since the scienti�c society and medical doctors in general were not aware of the role played by the mosquito until the con�rmation given by reed [3] of finlay's ideas [11]. 1 1according to other sources, it was beauperthuy [12] the �rst one to accurately describe the transmission of yf we assume that aedes aegypti has not changed since then, and/or there are no substantial changes in the life cycle, vector capabilities and adaptation between the (assumed) population in 1871 and present-day populations in buenos aires city. after the eradication campaign (1958�1965) [14], aedes aegypti was eradicated from buenos aires [15]. hence, the present populations result from a re-infestation and they are not the direct descendants of the mosquitoes population of 1871. 3. 
lacking time statistics for the duration of the di�erent stages in the development of the illness, reproduction of the virus and life cycle of the mosquito, we use, as distribution for such events, a maximum likelihood distribution subject to the constrain of the average value for the cycle. in short, we use exponentially distributed times for the next event for all type of events. 4. finally, and most importantly, we assume that the human population mobility is not a factor in the local spread of the disease. we anticipate one of our conclusions: this assumption is likely to be false for the full development of the epidemic outbreak but seems reasonable for the early (silent) development. the study of the secondary foci of the epidemic outbreak merits a detailed analysis of the social and political circumstances related to human mobility and it is beyond the possibilities of this study. since we want the test to be as demanding as possible, more information is needed to simulate the outbreak eliminating sources of ambiguity and parameters to be �tted using the same test data. we recovered the following information: 1. estimations of daily temperatures. they are relevant since the temperature regulates the developmental rates of the mosquitoes. 2. a very rough, anecdotal, estimation of the availability of breeding sites (bs) that, ultimately, control the carrying capacity of the by mosquitoes, as observed in the epidemic outbreak at cumaná, venezuela (1853), as well as the e�cient measures of protection taken by native americans, the use of nets to prevent the spread of the epidemic [13]. 050002-2 papers in physics, vol. 5, art. 050002 (2013) / m l fernández et al. environment, the number of vectors and the infection rate. 3. human populations discriminated by block in the city. 4. estimations of the date of arrival of the virus to the city, putting bounds to the reasonable initial conditions for the simulation. this climatological, social and historical information represents a determining part of the reconstruction as it is integrated into the model jointly with the entomological and medical information to produce stochastic simulations of possible outbreaks to be compared with the historic records of casualties. we will show that the model predicts large probabilities for the occurrence of yf in the given historical circumstances and it is also able to answer why a minor outbreak in 1870 did not progress towards a large epidemic. the total number of deaths and the time-evolution of the death record will be shown to agree between the historical record and the simulated episodes as well, within the original focus. the rest of the manuscript will be organized as follows: we will begin with the description of yf in section ii, including the eco-epidemiological model. in section iii, we will address the relevant climatological, social and historical aspects. in section iv, we will explore the sensitivity of the model to initial conditions and the number of available breeding sites, discussing the statistics more clearly in�uenced by vector abundance. the historic mortality records and the simulated records are compared in section v.. we will �nally discuss the performance of the model in section vi. ii. the disease we will simply quote the fact sheet provided by the world health organization [16] as the standardized description: �yf is a viral disease, found in tropical regions of africa and the americas. 
it principally a�ects humans and monkeys, and is transmitted via the bite of aedes mosquitoes. it can produce devastating outbreaks, which can be prevented and controlled by mass vaccination campaigns. the �rst symptoms of the disease usually appear 3�6 days after infection. the �rst, or acute, phase is characterized by fever, muscle pain, headache, shivers, loss of appetite, nausea and vomiting. after 3�4 days, most patients improve and symptoms disappear. however, in a few cases, the disease enters a toxicphase: fever reappears, and the patient develops jaundice and sometimes bleeding, with blood appearing in the vomit (the typical vomito negro). about 50% of patients who enter the toxic phase die within 10�14 days�. we add that the remission period lasts between 2 and 48 hours [17], and as it was mentioned in the introduction, not only aedes mosquitoes transmit the disease. i. the model the yellow fever model is rather similar to the already presented dengue model [10], the similarity corresponds to the fact that dengue is produced by a flavivirus as well, it is transmitted by the same vector and follows the same clinical sequence in the human being, although with substantially lesser mortality. the model describes the life cycle of the mosquito [8] and its dispersalafter a blood meal, seeking oviposition sites [9]. the mosquito goes through several stages: egg, larva, pupa, adult (non parous), �yer (i.e., the mosquito dispersing) and adult (parous). in each stage, the mosquito can die or continue the cycle with a transition rate between the subpopulations that depends on the temperature. the mortality in the larva stage is nonlinear and it regulates the population as a function of the availability of breeding sites. thus, the transitions from adult to �yer are associated with blood meals, the event that can transmit the virus from human to mosquito and vice-versa. from the epidemiological point of view, the mosquito follows a sei sequence (susceptible, exposed �extrinsic period�, infective). correspondingly, the adult populations are subdivided according to their status with respect to the virus. we assume that there is no vertical transmission of the virus and eggs, larvae, 050002-3 papers in physics, vol. 5, art. 050002 (2013) / m l fernández et al. pupae and non parous adults are always susceptible. the humans are subdivided in subpopulations according to their status with respect to the illness as: susceptible (s), exposed (e), infective (i), in remission (r), toxic (t) and recovered (r). the temporary remission period is followed by recovery with a probability between 0.75 and 0.85 or a toxic period (probability 0.25 to 0.15) which ends half of the times in death and half of the times in recovery. the yellow fever model di�ers in the structure from the dengue model in ref. [10], as the human part of the dengue model is seir and the yellow fever model is seirrtd. however, the additional stages do not alter the evolution of the epidemic since the �in remission�, toxic and dead stages do not participate in the transmission of the virus. the yf parameters are presented in table 1. 
period                          value     range
intrinsic incubation (iip)      4 days    3-6 days
extrinsic incubation (eip)      10 days   9-12 days
human viremic (vp)              4 days    3-4 days
remission (rp)                  1 day     0-2 days
toxic (tp)                      8 days    7-10 days

probability                           value   range
recovery after remission (rar)        0.75    0.75-0.85
mortality for toxic patients (mt)     0.5
transmission host to vector (ahv)     0.75
transmission vector to host (avh)     0.75

table 1: parameters (mean value of state) adopted for yf. the range indicated is taken from paho [17].

the model is compartmental: all populations are counted as non-negative integer numbers and evolve by a stochastic process in which the time to the next event is exponentially distributed and the events compete with probabilities proportional to their rates, in a process known as a density-dependent poisson process [18]. the model can be understood qualitatively with the scheme of fig. 1. the model equations are summarized in appendix a. the city is divided into blocks, roughly following the actual division (see fig. 2). the human populations are constrained to their block, while the mosquitoes can disperse from block to block.

figure 1: scheme of the yellow fever model. on the left side, the evolution of the mosquito and, on the right side, the evolution of the human subpopulations. hollow arrows indicate the progression through the life cycle of the mosquito, following the sequence: egg, larva, pupa, adult (non-parous), flyer, adult (parous), and the repetition of the two last steps. the mortality events are not shown, to lighten the scheme. eggs are laid in the transition from flyer to adult. the adult mosquito populations are subdivided according to their status with respect to the virus as: susceptible (s), exposed (e) and infective (i). the virus is transmitted from mosquitoes to humans and vice-versa in the transition from adult to flyer (blood meal), when either the mosquito or the human is infective and the other susceptible (red arrows). the red bold arrows indicate the progression of the disease, from exposed to infective in the mosquito and, in humans, following the sequence: exposed (e), infective (i), in remission (r), toxic, recovered (r) or dead.

iii. historical, social and climatological information

i. when and how the epidemic started

the yf outbreak in buenos aires (1871) was one of a series of large epidemic outbreaks associated with the end of the war of the triple alliance, or paraguayan war. the war confronted argentina, brazil and uruguay (the three allies) on one side and paraguay on the other side, and ended by march, 1870. by the end of 1870, asunción, paraguay's capital city, was under the rule of the triple alliance. the return of paraguay's war prisoners from brazil (where yf was almost endemic at that time) to asunción triggered a large epidemic outbreak [4]. the allied troops received their main logistic support from corrientes (argentina), a city with 11218 inhabitants according to the 1869 census [7], located about 300 km south of asunción (following the waterway) and 1000 km north of buenos aires along the paraná river (see fig. 3). on december 14, 1870, the first case of yf was diagnosed in corrientes [19], and a focus developed around this case imported from asunción. according to some sources, the epidemic produced panic, resulting in about half the population leaving the city between december 15 and january 15 [20].
however, other historical reasons might have played a relevant role since the city of corrientes was under the in�uence and ruling of buenos aires, while in the farmlands, the general ricardo lópez jordan was commanding a rebel army (a sequel of argentina civil wars and the war of the triple alliance). the subversion ended with the battle of ñaembé, about 200 km east of corrientes, on january 26, 1871. putting things in perspective, we must realize that in those times, yf was recognized only in its toxic stage associated to the black vomit, it is then perfectly plausible that recently infected individuals would have left corrientes and asunción reaching buenos aires, despite quarantine measures that were late and leaky [4,22].2 the death toll in corrientes was of 1289 people in the city (and about 700 more in places around the city) [20], representing a 11,5% of the population (notice that this number is not consistent with current numbers in use by who [17] which indicate a 7.5% of mortality in diagnosed cases of yf but is well in line with historical reports [23] of 20% to 70% mortality in diagnosed cases �the statistical basis has changed with the improved knowledge of early, not toxic, yf cases. according to this historical view, the initial arrival of infectious people to buenos aires happened, more likely, during december 1870 and january 1871. in his study of the yf epidemic, written twenty three years after the epidemic outbreak, josé penna (md) [4] quotes the issue of the journal 2on december 16, a sanitary o�cial from buenos aires was commissioned to corrientes to organize the quarantine, a measure that was applied to ships coming from paraguay but not to those with corrientes as departing port. figure 2: police districts from a map of the time and computer representation. the red colored area in district 14 (san telmo) are the two blocks where the 1871 epidemic started. the green colored area in district 3 is the block where the hotel roma was and where the 1870 focus began (see section v.iii.). the red and green lines sourround the region simulated for the 1871 epidemics and the 1870 focus, respectively. notice that districts 15 and 13 disagree in the maps. the computer representation follows the information in ref. [21] from where the population information was obtained. �revista médico quirúrgica�, published in buenos aires on december 23, 1870 [24], which presents a report regarding the sanitary situation during the last �fteen days, indicating the emergence of a �bilious fever� and a general tendency of other fevers to produce icterus or jaundice. in the next issue, dated january 8, 1871, the �revista� indicates an important increase in the number of bilious fever cases reported [22]. in a separate article, the doctors call the attention on how easily and how of050002-5 papers in physics, vol. 5, art. 050002 (2013) / m l fernández et al. ten the quarantine to ships coming from paraguay is avoided, and calls for strengthening the measures. penna indicates that the �bilious fever� (not a standard term in medicine) likely corresponded to milder cases of yf. we will term this idea �penna's conjecture� and will come back to it later. for our initial guess, we considered this information as evidence that the epidemic outbreak started during december, 1870. exploring the model, and arbitrarily, we took december 16, 1870, as the time to introduce two infectious people with yf in the simulations, at the blocks where the mortality started. 
yet, an educated guess for penna's conjecture is to consider the 3-6 days needed from infection to clinical manifestation and the 9-12 days of the extrinsic cycle. hence, since the first clinical manifestations of transmitted yf happened between december 11-23, we would guess that the infected people arrived somewhere between november 21 and december 11.

figure 3: an 1870 map [25] showing asunción next to the paraguay river, corrientes and rosario next to the paraná river, and buenos aires (spelled buenos ayres) next to the rio de la plata.

yet, we must take into account that penna's conjecture contrasts with the conjectures presented by md wilde and md mallo, members of the sanitary committee in charge during the yf epidemic. wilde and mallo advocated for the spontaneous origin of the disease, very much in line with the theories of miasmas in use in those times, theories that guided the sanitary measures taken [19]. wilde and mallo also argued that asunción could not be the origin of the epidemic, because of their belief that the ten- or fifteen-day quarantine (counted since the last port touched) was enough to avoid the propagation of the disease. this belief contrasts with the experience of 1870 (in buenos aires), where a ten-day quarantine was not enough to prevent a minor epidemic [4]. nevertheless, the quarantine measures were fully implemented in corrientes by december 31, 1870. the measures were later lifted because of the epidemic in corrientes and implemented at ports down the paraná river, being completed near buenos aires (the ports of la conchas, tigre, san fernando and "la boca" within buenos aires city) by mid-february, when the epidemic was in full development in buenos aires, according to the port sanitary authorities, wilde and mallo [19]. that corrientes, a city where yf was developing, was the source of infected people can hardly be disregarded, with about 5000 people leaving the city between december 15 and january 15 [19]. according to wilde and mallo [19], there were (non-fatal?) yf cases in buenos aires as early as january 6, 1871, reported by mds argerich and gallarani, as well as documented cases of yf death after disembarking in rosario (200 km north of buenos aires along the paraná river), having boarded in corrientes.

ii. breeding sites

one of the key elements in the reconstruction and simulation of an epidemic transmitted by mosquitoes is to have an estimation of their numbers, which is reflected directly in the propagation of the epidemic. in the mosquito model [8], this number is regulated by the quality and abundance of breeding sites. the production of a single breeding site, normalized to be a flower pot in a local cemetery, was taken as the unit in ref. [8]; the number of breeding sites measured in this unit roughly corresponds to half a liter of water. the aedes aegypti population in monitored areas of buenos aires, today, is compatible with about 20 to 30 breeding sites per block [9]. estimating the number of sites available for breeding today is already a difficult task; the estimation of breeding sites available in 1871 is a nearly impossible one. in what remains of this subsection, we will try to get a very rough a priori estimate.

district   population/(100 m)²   bs/(100 m)²
1          339                   391.0
2          279                   300.0
3          428                   522.0
4          353                   443.0
5          330                   430.0
6          259                   365.0
13         160                   196.0
14         224                   300.0
15         90                    157.0
16         165                   316.0
18         23                    52.0
19         13                    26.0
20         30                    52.0

table 2: population data. buenos aires, 1869 [7].
population density by police district (see fig. 2) and equivalent breeding sites, bs, originally estimated as proportional to the house density in the police district.

a very important difference between those days and the present concerns the supply of fresh water, which today is taken from the river, processed and distributed through pipes; in those days, it was an expensive commodity taken from the river by the "waterman" and sold to the customers, who, in turn, had to let it rest so that the clay in suspension decanted to the bottom of the vessel (a process that takes at least 3 days). additionally, there were some wells available, but the water was (is) of low quality (salty). the last, and rather common, resource [19, 26] was the collection of rain water in cisterns.

iii. temperature reconstruction

aedes aegypti developmental times depend on temperature. although it would seem reasonable to use, as a substitute for the real data, the average temperatures registered since systematic data collection began, records of temperature in those times were kept privately [27] and are available. the data set consists of three daily measurements made from january 1866 until december 1871, at 7 am, 2 pm and 9 pm. when averaged, these records allow a better estimation of the average temperature of the day than the usual procedure of adding maximum and minimum and dividing by two. unfortunately, the register has some important missing points during the epidemic outbreak. because of this problem, the data in ref. [27] were used to fit an approximation of the form

$$T = 7.22\,^{\circ}\mathrm{C} \times \cos\!\left(\frac{2\pi t}{365.25\ \mathrm{days}} + 5.9484\right) + 17.21\,^{\circ}\mathrm{C}, \qquad (1)$$

following ref. [28], and then extrapolating to the epidemic period. in fig. 4, the data and the fit are displayed. the residuals of the fit do not present seasonality or systematic deviations, as we can see in the inset of fig. 4. it is worth noticing that a similar fit on temperature data from the period 1980-1990 presents a mean temperature of 18.0 °c, an amplitude of 6.7 °c and a phase shift of 6.058 [8] (notice that t = 0 in that reference corresponds to july 1, while in this work it corresponds to january 1). according to the threshold computations in ref. [8], the climatic situation was less favorable for the mosquito in 1866-1871 than in the 1960-1991 period.

figure 4: average daily temperature and periodic approximation fitted according to eq. (1); t = 0 corresponds to january 1, 1866. the inset shows the difference between the measured temperatures and the fit (residuals) during 1871.

the reconstruction of temperatures needs to be performed at least from the 1868 winter, since a relatively arbitrary initial condition, in the form of eggs on july 1, 1868, is used to initialize the code, which is then run over a transient of two years. such a procedure has been found to give reliable results [8]. there are several factors in the biology of ae. ae. indicating that the biological response to air-temperature fluctuations is reflected in attenuated fluctuations of the biological variables. first, the larvae and pupae develop in water containers; thus, what matters is the water temperature. this fact represents a first smoothing of air-temperature fluctuations.
second, insects developmental rates for �uctuating temperature environments correspond to averages in time of rates obtained in constant temperature environments [29], an alternative view is that development depends on accumulated heat [30]. such averages occur over a period of about 6 days at 30◦c and longer times for other temperatures and non-optimal food conditions [31]. third, the biting rate (completion of the gonotrophic cycle) depends as well on temperatures averaged over a period of a few days. last, mosquitoes actively seek the conditions that �t best to them and more often than not, they are found resting inside the houses. iv. mortality data for this work, the daily mortality data recorded during the 1871 epidemic outbreak [6] is key. this statistical work has received no attention in the past, and no study of the yf outbreak in buenos aires made reference to this information. we have cross-checked the information with data in the 1869 national census [7], as well as with data in published works [4], and the details are consistent among these sources. the data set is presented here closing the historic research part. v. revising the clinical development of yellow fever. a yf epidemic outbreak happened in buenos aires, in 1870, developing about 200 cases [4] (the text is ambiguous on whether the cases are toxic or fatal). the epidemic outbreak was noticed by february 22 (�rst death), a sailor who left rio de janeiro (brazil) on february 7, and presumably landed on february 17 (no cases of yf were reported on board of the poitou �the boat). this well documented case allows us to see the margins of tolerance that have to be exercised in 0 30 60 90 120 150 time / days 0 10 20 30 40 50 60 70 d a il y m o rt a li ty ( h is to ri c ) january 1st, 1871 figure 5: daily mortality cases in the police district 14 (see fig. 2) corresponding to the quarter san telmo [6], t = 0 corresponds to january 1, 1871. taking medical information prepared for clinical use as statistical information. assume, following penna, that the sailor was exposed to yf before boarding in rio de janeiro, according to information in table 1 collected from the pan american health organization [17], adding incubation and viremic period, we have a range of 6�10 days, hence the sailor was close to the limit of his infectious period. he was not toxic, according to the md on board who signed a certi�cate accepted by the sanitary authority. yet, �ve days later, he was dying, making the remission plus toxic period of 5 days, shorter than the range of 7�12 days listed in table 1 and substantially shorter than the 10�15 days (remission plus toxic) communicated in ref. [16]. shall we assume as precise the values reported in ref. [16, 17] we would have to conclude that the disease was substantially di�erent, at least in its clinical evolution, in 1871 as compared with present days. in clinical studies performed during a yf epidemic on the jos plateau, nigeria, jones and wilson [32] report a 45.6% overall mortality and a signi�cant di�erence in the duration of the illness for fatal and non-fatal cases with averages of 6.4 and 17.8 days. série et al. [33] reports for the 1960�1962 epidemic in ethiopia a mortality ranging from 43% (kouré) to 100% (boloso) and 50% (menéra) with a total duration of the clinic phase 050002-8 papers in physics, vol. 5, art. 050002 (2013) / m l fernández et al. of the illness of 7.14, 2.14 and 4.5 days, respectively (weighted average of 4.6 days in 18 cases). 
we must conclude that the extension of the toxic period preceding the death presents high variability. this variability may represent variability in the illness or in medical criteria. for example, série [33] indicates that the 100% mortality found at the boloso hospital is associated to the admission criteria giving priority to the most severe cases. in correspondence with this extremely high mortality level, the survival period is the shortest registered. the minimum length of the clinical phase is of 10� 14 days or 13�18 days, depending of the source (adding viremic, remission and toxic periods). we note that not only the toxic period of fatal cases must be shorter than the same period for non-fatal cases, but also the viremic period must be shorter in average, if all the pieces of data are consistent. the time elapsed between the �rst symptoms and death is probably longer today than in 1871, since it, in part, re�ects the evolution of medical knowledge. the hospitalization time is also rather arbitrary and changes with medical practices which do not re�ect changes in the disease. a rudimentary procedure to correct for this differences is to shift the simulated mortality some �xed time between 5 and 8 days (the di�erence between our 13 days guessed (table 1) and the 4.5� 6.4 reported for africa [32,33]). such a procedure is not conceptually optimal, but it is as much as it can be done within present knowledge. we certainly do not know whether just the toxic period must be shortened or the viremic period must be shortened as well, and in the latter case, how this would a�ect the spreading of the disease. a second source of discrepancies between recorded data and simulations are the inaccuracies in the historic record. can we consider the daily mortality record as a perfect account? which was the protocol used to produce it? we can hardly expect it to be perfect, although we will not make any provision for this potential source of error. iv. simulation results the simulations were performed using a one-block spatial resolution, with the division in square blocks of the police districts 14 (san telmo), 16, 2 and 4, corresponding to concepción, catedral sur and montserrat; and part of the districts 6, 18, 19, 19a and 20 (see fig. 2), and totalized for each police district to obtain daily mortality comparable to those reported in ref. [6] and picture in fig. 5. numerical mosquitoes were not allowed to �y over the river. at the remaining borders of the simulated region, a stochastic newmann boundary condition was used, meaning that the mosquito population of the next block across the boundary was considered equal to the block inside the region; but the number of mosquito dispersion events associated to the outside block was drawn randomly, independently of the events in the corresponding inner block. larger regions for the simulations were tested producing no visible di�erences. the time step was set to the small value of 30 s, avoiding the introduction of further complications in the program related to fast event rates for tiny populations [34], although an implementation of the method in ref. [34], not relying on the smallness of the time step so heavily, is desirable for a production phase of the program. before we proceed to the comparison between the historic mortality records of the epidemic and the simulated results, we need to gain some understanding regarding the sensitivity of the simulations to the parameters guessed and the best forms of presenting these results. 
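to make the magnitude of this correction concrete, the following sketch (ours, not the authors' code; the names and the sample size are illustrative) draws the stage durations as exponential variables with the table 1 means, as in assumption 3, and samples the clinical phase (viremic + remission + toxic) of fatal cases; its mean is the roughly 13-day value that the text contrasts with the 4.5-6.4 days reported in refs. [32, 33].

```python
import numpy as np

rng = np.random.default_rng(1)

# mean stage durations (days) and branching probabilities taken from table 1
VP, RP, TP = 4.0, 1.0, 8.0          # viremic, remission and toxic periods
P_RECOVER_AFTER_REMISSION = 0.75
P_DEATH_IF_TOXIC = 0.5

def fatal_clinical_phase():
    """Sample the clinical phase (viremic + remission + toxic) of one fatal case,
    with exponentially distributed stage durations (assumption 3)."""
    while True:
        t = rng.exponential(VP) + rng.exponential(RP)
        if rng.random() < P_RECOVER_AFTER_REMISSION:
            continue                      # recovered after remission: not a fatal case
        t += rng.exponential(TP)
        if rng.random() < P_DEATH_IF_TOXIC:
            return t                      # death at the end of the toxic stage
        # recovered after the toxic period: not a fatal case, draw again

samples = np.array([fatal_clinical_phase() for _ in range(20000)])
print(samples.mean())   # about 13 days = VP + RP + TP
```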
we performed a moderate set of computations, since the code has not been optimized for speed and it is highly demanding for the personal computers where it runs for several days. here, we illustrate the main lessons learned in our explorations. poor people, unable to buy large quantities of water, had to rely mostly on the cisterns and other forms of keeping rain water. since 1852, when the population of buenos aires was about 76000 people, there was an important immigration �ow, increasing the population to about 178000 people by 1869 [7]. the immigrants occupied large houses where they rented a room, usually for an entire family, a housing that was known as �conventillo� and was the dominant form of housing in some districts such as san telmo, where the epidemic started [20]. in some police chronicles of the time, houses with as many as 300 residents are mentioned [35]. under such di�cult social circumstances, we can only imagine that the number of breeding sites available to mosquitoes has to be counted as orders of magnitude larger than present-day available sites. 050002-9 papers in physics, vol. 5, art. 050002 (2013) / m l fernández et al. an a-priori and conservative estimation is to consider about ten times the number of breeding sites estimated today. thus, we assume, as a �rst guess, 300 breeding sites per block in san telmo. we will have to tune this number later as it regulates mosquito populations and the development of the epidemic focus. the number of breeding sites is the only parameter tuned to the results in this work. more precisely, the criteria adopted was to make the number of (normalized) bs proportional to the number of houses per block taken from historic records [21], adjusting the proportionality factor to the observed dynamics. we introduce the notation bsxy to indicate a multiplicative factor of y. the population of each police district was set to the density values reported in the 1869 census [7,21] and the police districts geography was taken from police records [36] and referenced according to maps of the city at the time [37�40]. table 2 shows the average population per block, initially estimated number of breeding sites and number of houses for the district of the initial focus, san telmo #14 and nearby districts (# 16, 4, 2). a sketch of buenos aires police districts according to a 1887 map [37] is displayed alongside with the computer representation in fig. 2. it is a known feature of stochastic epidemic models [41] that the distribution of totals of infected people has two main contributions. one is that the small epidemic outbreaks when none or a few secondary cases are produced and the extinction time of the outbreak comes quickly. the otheris that the large epidemic outbreaks which, if the basic reproductive number is large enough, present a gaussian shape separated by a valley of improbable epidemic sizes from the small outbreaks. while the present model does not fall within the class of models discussed in ref. [41], the general considerations applied to stochastic sir models qualitatively apply to the present study. yet, simulations started early during the summer season follow the pattern just described in ref. [41], but simulations started later do not present the probability valley between large and small epidemics. we have found useful to present the results disaggregated in the form: epidemic size, daily percentage of mortality relative to the total mortality and time to achieve half of the �nal mortality. 
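as a concrete illustration of these three statistics, the sketch below (ours; the function name and the toy record are purely illustrative) computes the total mortality, the normalized daily mortality and $t_{1/2}$, the day on which the accumulated mortality reaches half of its final value, from a daily mortality series such as the one in fig. 5.

```python
import numpy as np

def outbreak_summaries(daily_deaths):
    """Total mortality, normalized daily mortality and t_1/2 (first day, counted
    from 1, on which the accumulated mortality reaches half of the final toll)."""
    daily = np.asarray(daily_deaths, dtype=float)
    total = daily.sum()
    normalized = daily / total
    cumulative = np.cumsum(daily)
    t_half = int(np.searchsorted(cumulative, 0.5 * total)) + 1
    return total, normalized, t_half

# toy daily-mortality record peaking around day 75 (illustrative only)
days = np.arange(1, 151)
toy = np.exp(-0.5 * ((days - 75.0) / 20.0) ** 2)
total, normalized, t_half = outbreak_summaries(toy)
print(round(total, 1), t_half)   # t_1/2 close to 75 for this symmetric record
```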
This presentation will let us see that most of the fluctuation is concentrated in the total epidemic size, while the daily evolution is relatively regular, except, perhaps, in the time taken to reach 50% of the mortality (which depends on the abundance of vectors, the initial number of infected humans and chance).

i. Total mortality (epidemic size)

Since the historical records include mostly the number of casualties, it appears sensible for the purposes of this study to use the total number of deaths as a proxy statistic for the epidemic size.

Figure 6: San Telmo. Total mortality histograms for different numbers of breeding sites, computed from 100 simulations with the same initial condition, corresponding to 2 infectious people located in San Telmo on January 1, 1871, at the location where the first death occurred in the historical event. From top to bottom, multiplication factors (bin width : frequency of no epidemic): BSx1 (126.2 : 0.23), BSx2 (209.8 : 0.09), BSx3 (128.2 : 0.06) and BSx4 (50.6 : 0.02). The y-axis indicates frequency in a set of 100 simulations.

The total mortality depends strongly on the stochastic nature of the simulations, the initial conditions and the guessed ecological parameters. Qualitatively, the results agree with intuition, although this is an a-posteriori statement, i.e., only after seeing the results can we find intuitive interpretations for them. The discussion assumes that the development of the epidemic outbreak was regulated either by the availability of vectors (mosquitoes) or by the exhaustion of susceptible people; the first situation represents a striking difference with standard SIR models without seasonal dependence of the biological parameters. In Fig. 6 we compare frequencies of epidemics, binned into five bins by final epidemic size, for different sets of 100 simulations with different numbers of breeding sites. The number of breeding sites is varied in the same way all over the city, keeping the proportionality with housing, and it is expressed as a multiplicative factor (BSx1 = 1, BSx2 = 2, BSx3 = 3, BSx4 = 4) of the values presented in Table 2. Notice also that the width of the bins progresses as 126.2, 209.8, 128.2 and 50.6, indicating that the dispersion of final epidemic sizes first increases with the number of breeding sites and then decreases for larger numbers. We can see that for a factor 2 (BSx2) and larger, the mortality saturates, indicating that the epidemic outbreaks are limited by the number of susceptible people. For our original guess, factor 1, the epidemic is limited by the seasonal presence/absence of vectors. For higher factors there is a substantial increase in large epidemic outbreaks, with larger probabilities for larger epidemics. Only for the factor 4 (BSx4) does the most likely bin include the historical value of 1274 deaths.

A second feature, already shown in the dengue model [10], is that outbreaks starting with the arrival of infectious people in late spring have a smaller chance of evolving into a major epidemic. Yet, those that by chance do develop are likely to become large epidemic outbreaks, since they have more time to evolve. On the contrary, outbreaks started in autumn have low chances of evolving and do not produce a large number of casualties. The corresponding histograms can be seen in Fig. 7.
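Histograms like those of Fig. 6 can be built directly from the final mortality of a batch of runs. The sketch below is illustrative only; the five-bin choice follows the text, while the threshold used to count a run as "no epidemic" is an assumption.

```python
import numpy as np

def mortality_histogram(total_deaths, n_bins=5, no_epidemic_below=3):
    """Bin the final mortality of a batch of runs, as in Fig. 6.
    Runs with fewer than `no_epidemic_below` deaths are counted apart
    as 'no epidemic' (threshold assumed here, not taken from the paper)."""
    deaths = np.asarray(total_deaths)
    epidemics = deaths[deaths >= no_epidemic_below]
    counts, edges = np.histogram(epidemics, bins=n_bins)
    no_epidemic_fraction = np.mean(deaths < no_epidemic_below)
    return counts, edges, no_epidemic_fraction
```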
Figure 7 also shows that outbreaks beginning on December 16, as well as simulations starting on January 1, present higher probabilities of large epidemics than of small ones, but this tendency is reversed in simulations of outbreaks starting by February 16. This transition is, again, the transition between outbreaks regulated by the number of available susceptible humans and those regulated by the presence or absence of vectors.

Figure 7: San Telmo. Total mortality histograms for different dates of arrival of infectious people, computed from 100 simulations with the same initial condition, corresponding to 2 infectious people located in San Telmo and a density factor BSx2.5. From top to bottom: December 16, 1870; January 1, January 16 and February 15, 1871.

ii. Mortality progression

One of the most remarkable facts unveiled by the simulations is that when the time evolution of the mortality is studied as a fraction of the total mortality, much of the stochastic fluctuation is eliminated and the curves present only small differences (see Fig. 8). The similarity of the normalized evolution allows us to focus on the time taken to produce half of the mortality (labelled t1/2). We notice, in Fig. 8, that t1/2 in these runs lies between 69 and 110 days, compared to the historic value t1/2 = 73. The latter observation brings attention to a remarkable fact of the simulations: not only is the normalized progression of the outbreaks rather similar, but there is also a correspondence between early development and large mortality. Plotting t1/2 against the total mortality (Fig. 9), we see that even for different numbers of breeding sites, all the simulations indicate that the final size is a noisy function of the day when the mortality reaches half its final value. This function is almost constant for small t1/2 and becomes linear, with increasing dispersion, when t1/2 is relatively large; once again, these are the two different forms in which the outbreak is controlled.

Figure 8: Evolution of the mortality level for all the epidemics with three or more fatal cases in a batch of 100 simulations (96 runs), differing only in the pseudo-random series. The breeding site number has a factor 4, and the initial contagious people were placed on January 1, 1871, in San Telmo. Curves: average, average minus variance, average plus variance.

Figure 9: San Telmo. Total mortality versus the day when the mortality reached half its final value, for different numbers of breeding sites (BSx1, BSx1.5, BSx2, BSx3, BSx4). The mortality appears to be mainly a function of t1/2 and roughly independent of the number of breeding sites.

V. Real against simulated epidemic

We would like to establish the credibility of the statement: the historic mortality record for the San Telmo focus belongs to the statistics generated by the simulations. To achieve this goal, we need to compare the daily mortality in the historical record and in the simulations.
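The quantity t1/2 used in Figs. 8 and 9 can be extracted from a simulated daily-mortality series as sketched below; the helper is hypothetical, not the authors' code, and it assumes day 1 corresponds to January 1, 1871 as in the figures.

```python
import numpy as np

def half_mortality_day(daily_deaths):
    """Day (1 = January 1, 1871) at which the accumulated mortality
    first reaches half of the final mortality of the run."""
    cumulative = np.cumsum(daily_deaths)
    total = cumulative[-1]
    if total == 0:
        return None                                   # no epidemic in this run
    return int(np.searchsorted(cumulative, 0.5 * total)) + 1  # days are 1-based
```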
Since, in the model, the mortality proceeds day after day with independent random increments (a consequence of the Poisson character of the model), it is reasonable to consider the statistic

$$\chi^2 = \sum_i \left( \frac{HM(i) - MM(i)}{D(i)} \right)^2, \qquad (2)$$

where i runs over the days of the year, HM(i) is the fraction of the total death toll in the historic record for day i, MM(i) is the average of the same fraction obtained in the simulations, and D(i) is the corresponding standard deviation over the simulations. The sum runs over the days in which the variance is not zero; for BSx3 and BSx4 there is no day with D(i) = 0 and HM(i) − MM(i) ≠ 0. The number of degrees of freedom corresponds to the number of days with non-zero mortality in the simulations minus one; the discounted degree accounts for the fact that Σ_i HM(i) = Σ_i MM(i) = 1.

i. Tuning of the simulations

Before we proceed, we have to find an acceptable number of breeding sites, a reasonable day for the arrival of the infected individuals (arbitrarily assumed to be 2 individuals) and adjust for the uncertainty in the survival time. Actually, moving the day of arrival d days earlier and shortening the survival time by d days have essentially the same effect on the simulated mortality (provided d is small), namely shifting the full series by d days, i.e., assigning to day i the simulated mortality SM of day i + d, SM(i + d). Hence, only two of the parameters can be obtained from these data.

As we have previously observed, the total mortality presents a large variance in the simulations. Moreover, in medical accounts of modern times [32, 33] the mortality ranges between 46% and 100%, while in historical accounts the percentage goes from 20% to 70% [23]. Hence, a simple adjustment of the mortality coefficient from our arbitrary 50%, within such a wide range, would suffice to eliminate the contribution of the total epidemic size. The average simulated epidemic for BSx4 is of ≈ 1248 deaths, while the historic record is of 1274 deaths; to match the mean with the historic record it would therefore suffice to correct the mortality from 50% to 51%.

Table 3: χ² calculations according to Eq. (2). We indicate the multiplicative factor applied to the breeding sites described in Table 2, the shift d applied to the statistics produced with the parameters of Table 1, the value of the statistic χ², the number of degrees of freedom, and the probability P(X > χ²) for a random variable X distributed as a χ²-distribution with the indicated degrees of freedom.

BSxY   d   χ²      degrees   probability
2      8   490.2   177       0.00
3      7   191.7   173       0.15
4      3   151.9   168       0.80
4      4   143.8   168       0.91
4      5   145.6   168       0.89

The standard reading of the χ² test is that the statement "the deviations of the historic record from the simulation mean (deviations assumed to be distributed as χ² with the indicated degrees of freedom) belong to the simulated set" is not found likely to be false in the cases BSx3 and BSx4.

We can disregard the idea that the epidemic started before December 20, 1870 (Penna's conjecture), since it is not possible to simultaneously obtain an acceptable final mortality and an acceptable evolution of the outbreak. Our best attempt corresponds to an epidemic starting by December 14, 1870, which averages ≈ 1212 deaths with a mortality deviation of χ² = 219.1 with 175 degrees of freedom, giving a probability P(X ≥ 219.1) = 0.01. We therefore focus on arrival dates around January 1, 1871.
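The statistic of Eq. (2) and the p-value P(X > χ²) reported in Table 3 can be evaluated as in the following sketch, under the assumptions stated in the text: days with zero simulated variance are excluded, and one degree of freedom is subtracted for the normalisation constraint. The SciPy call is standard; the function itself is illustrative, not the authors' code.

```python
import numpy as np
from scipy.stats import chi2

def chi2_test(hm, mm, d):
    """hm: historic daily mortality fractions; mm, d: mean and standard
    deviation of the same fractions over the simulation batch.
    Returns (chi-square statistic, degrees of freedom, p-value)."""
    hm, mm, d = map(np.asarray, (hm, mm, d))
    mask = d > 0                            # days with non-zero simulated variance
    stat = np.sum(((hm[mask] - mm[mask]) / d[mask]) ** 2)
    dof = int(mask.sum()) - 1               # sum(HM) = sum(MM) = 1 removes one degree
    return stat, dof, chi2.sf(stat, dof)    # p-value P(X > chi^2)
```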
We can also disregard, for this initial condition, the original guess BSx1 corresponding to 300 breeding sites per block in San Telmo, since it produces epidemics that are too small. We present results for the epidemics corresponding to BSx2, BSx3 and BSx4 in Table 3, for different numbers of BS and shifts d.

ii. Comparison

The conclusion of the χ² tests is that the simulations performed with BSx3 and beginning between December 24, 1870 and January 1, 1871 (with the survival period shortened by between 0 and 8 days) are compatible with the historical record. However, the compatibility is larger when BSx4 is considered and the beginning of the epidemic is placed between December 28, 1870 and January 5, 1871 (with the survival period shortened by between 0 and 8 days, respectively). We illustrate this comparison with Fig. 10.

Figure 10: The historic record of accumulated mortality as a function of the day of the year 1871. The averaged accumulated mortality, as well as curves shifted by one standard deviation, are shown for comparison. An acceptable look-alike individual simulation, chosen by visual inspection from a set of 100 simulations, is included as well. Two exposed individuals (not yet contagious) were introduced on January 1, 1871, in the same blocks where the historic epidemic started, with BSx4. The simulated mortality is anticipated by 4 days. Statistical estimates were taken as averages over the 96 runs which resulted in secondary mortality, out of a set of 100 runs.

iii. The 1870 outbreak

The records for the 1870 outbreak are scarce. Of the recognized cases, only 32 entered the lazareto (hospital), and 19 of them originated in the same block as the first case. Secondary cases are registered in the lazareto's books starting on March 30 (two cases) and continuing with daily cases. The final outcome of these cases and of the cases not entered at the lazareto [4] is not clear.

Except for the precise initial condition, corresponding to one viremic (infective) person located at the Hotel Roma (district 4 in Fig. 2), the information is too imprecise to produce a demanding test for the model. We performed a set of 100 simulations introducing one viremic (infective) person, on February 17, at the precise block where the Hotel Roma was. The number of breeding sites was kept at the same factor 4, with respect to the values tabulated in Table 2, that gave the best results in the study of the 1871 outbreak. Needless to say, this does not need to be true, as the number of breeding sites may change from season to season. The distribution of the final mortality is shown in Fig. 11. As we can see, relatively small epidemics of fewer than 200 deaths cannot be ruled out, although much larger epidemics are also likely.

Figure 11: Mortality distribution for the simulations beginning on February 17 with the incorporation of one infective (viremic) person in the block of the Hotel Roma and BSx4. The histogram is the result of the 87 runs which produced epidemics (13 runs did not). The width of the bins is 322.2, and the first epidemic bin goes from 17 to 339 deaths with a frequency of 15/100.

Actually, a slow start of the epidemic outbreak favors a small final mortality, as can be seen in Fig. 12.
Not only is there a relation between a low mortality early in the outbreak (such as by April 15) and the final mortality, but we also see that the sharp division between small and large epidemics is not present in this family of epidemic outbreaks differing only in the pseudo-random number sequence.

Figure 12: Final mortality against early mortality for two dates: March 30 and April 15. A low early mortality "predicts" a low final mortality, as the outbreak does not have enough time to develop. Simulations correspond to the conditions of the small 1870 epidemic: BSx4, one infected person arriving at the Hotel Roma on February 17.

The historic information indicates that secondary cases were recorded by March 30, corresponding to a slow start. The final mortality is not known, but it is believed to have been in the 100-200 range. For the smallest epidemics simulated, the secondary mortality starts after March 30. Hence, the 1870 focus can be understood as a case of relatively good luck and a late start, within more or less the same conditions as the 1871 outbreak.

VI. Conclusions and final discussion

In this work, we have studied the development of the initial focus of the YF epidemic that devastated Buenos Aires (Argentina) in 1871, using methods that belong to the epistemology of complex systems [42]. The core of the research has been the development of a model (theory) for an epidemic outbreak spread only by the mosquito Aedes aegypti, represented according to the current biological literature such as Christophers [2] and others. The translation of the mosquito's biology into computer code was performed earlier [8, 9], and the basis for the spreading of a disease by this vector was elaborated previously for the case of dengue [10]. The present model for YF is then an adaptation of the dengue model to the particularities of YF, and the present attempt at validation (failed falsification) reflects also on the validity of this earlier work.

The present study owes its existence to the work of anonymous police officers [6] who gathered and recorded epidemic statistics during the outbreak, in a city that was not only devastated by the epidemic but whose political authorities left in the middle of the drama as well [20]. We have gathered (and implemented in a model) entomological, ecological and medical information, as well as geographic, climatological and social information. After establishing the historical constraints restricting our attempts to simulate the historical event, we adjusted the density of breeding sites to the equivalent of 1200 half-litre pots such as those encountered today in Buenos Aires cemeteries (the number corresponds to the San Telmo quarter). Perhaps a better idea of the number of mosquitoes present is given by the maximum of the average number of bites per person per day estimated by the model, which results in 5 bites/(person day) (to be precise: the ratio between the maximum number of bites in a block during a week and the population of the block, divided by seven). The population of the domestic mosquito Aedes aegypti in Buenos Aires in 1870-1871 was large enough to almost assure the propagation of YF during the summer season.
The only effective measures preventing the epidemic were the natural quarantine resulting from the distance to the tropical cities where YF was endemic (such as Rio de Janeiro) and the relatively small window for large epidemics, since the extinction of the adult form of the mosquito during the winter months prevents the overwintering of YF. In this sense, the relatively small outbreak of 1870 is an example of how a late arrival of the infected individual, combined with a touch of luck, produced only a minor sanitary catastrophe. By 1871, as a consequence of the end of the Paraguayan war and the emergence of YF in Asunción, the conditions for an almost unavoidable epidemic in Buenos Aires were given. The intermediate step taken in Corrientes, with the panic and partial evacuation of the city, added to the lack of quarantine measures, was more than enough to make the epidemic in Buenos Aires a certainty. On the contrary, Penna's conjecture of an earlier start during December 1870 is inconsistent with the biological and medical times as implemented in the model; we can disregard this conjecture as highly improbable. The historic mortality record is consistent with an epidemic starting between December 28 and January 5, with a symptomatic period (viremic plus remission plus toxic) of the illness between 13 and 5 days, respectively. Furthermore, the existence of non-fatal cases of YF by January 6, mentioned by some sources [19], would be consistent provided the cases were imported.

In retrospect, the present research began as an attempt to validate/falsify the YF model and, in more general terms, the model for the transmission of viral diseases by Aedes aegypti, using the historic data of this large YF epidemic. As the research progressed, it became increasingly evident that the model was robust. In successive attempts, every time the model failed to produce a reasonable result, it forced us to revise the epidemiological and historical hypotheses. In these revisions, we ended up realizing that the accepted origin of the epidemic in cases imported from Brazil actually hides the central role played by the epidemic in Corrientes, and the gruesome failure of not quarantining Corrientes once the mortality started there by December 16, 1870, about two weeks before the deduced beginning of the outbreak in Buenos Aires. The same study of inconsistencies between the data and the reconstruction made us focus on the survival time of those clinically diagnosed with YF who finally die. The form in which the illness evolves anticipates the final result: Jones and Wilson [32] indicate the symptoms of cases with a bad prognosis, including the rapidity and degree of jaundice. In terms of modeling, this information suggests that death is not one of two possible outcomes decided at the end of the "toxic period", as we had first thought. The separation into to-recover and to-die subpopulations could (should?) be performed earlier in the development of the illness, each subpopulation having its own parameters. Yet, while in theory this would be desirable, in practice it would have, for the time being, no effect, since the characteristic periods of the illness have not been measured in these terms.
Epidemics transmitted by vectors come to an end either when the susceptible population has been sufficiently exposed, so that the replication of the virus slows down (the classical consideration in SIR models), or when the vector population is decimated for other (for example, climatic) reasons. The model shows that both situations can be distinguished in terms of the mortality statistics. We have also shown that the total mortality of the epidemic is not difficult to adjust by changing the death probability of the toxic phase and, as such, is not a demanding test for a model. The daily mortality, when normalized, shows sensitivity to the mosquito abundance, especially in the evolution times involved, since the general qualitative shape appears to be fixed; in particular, the date at which the epidemic reaches half the total mortality is advanced by larger mosquito populations. However, only the comparison of the simulated and historical daily mortality puts enough constraints on the free data of the model (date of arrival of infected people and mosquito population) to allow for a selection of possible combinations of their values.

As successful as the model appears to be, it is completely unable to reproduce the total mortality in the city, or the spatial extension of the full epidemic. The simulations with BSx4 produce fewer than 4500 deaths, while in the historic record the total mortality in the city is above 13000. The historical account, and the recorded data, show that after the initial San Telmo focus developed, a second focus developed in police district 13 (see Fig. 2), and shortly afterwards several other foci developed that could not be tracked [4]. Unless the spreading of the illness by infected humans is introduced (or some other mechanism allowing long jumps of the illness), such events cannot be described. It is worth noticing that the mobility patterns of 1871 are expected to be drastically different from present patterns; as such, the application of models with human mobility [43] is not straightforward and requires a historical study.

One of the most important conclusions of this work is that the logical consistency of mathematical modeling puts a limit on ad-hoc hypotheses, so often used in a-posteriori explanations, as it forces us to accept not just the desired consequence of the hypotheses, but all other consequences as well. Last, eco-epidemiological models are adjusted to vector populations pre-existing the actual epidemics and can therefore be used in prevention, to determine epidemic risk and to monitor eradication campaigns. In the present work, the tuning was performed on epidemic data only because it is actually impossible to know the environmental conditions of more than one hundred years ago. Yet, our wild initial guess for the density of breeding sites was sufficiently close to allow further tuning.

Acknowledgments

We want to thank Professor Guillermo Marshall, who has been very kind in allowing MLF to take time off her duties to complete this work. We acknowledge grant PICTR0087/2002 from the ANPCyT (Argentina) and grants X308 and X210 from the Universidad de Buenos Aires. Special thanks are given to the librarians and personnel of the Instituto Histórico de la Ciudad de Buenos Aires, Biblioteca Nacional del Maestro, Museo Mitre and the library of the School of Medicine, UBA.

Appendix A
i. Populations and events of the stochastic transmission model

We consider a two-dimensional space as a mesh of square patches where the dynamics of vectors, hosts and the disease take place. Only adult mosquitoes, the flyers, can fly from one patch to the next according to a diffusion-like process. The coordinates of a patch are given by two indices, i and j, corresponding to the row and column of the mesh. If Xk is a subpopulation in stage k, then Xk(i,j) is the Xk subpopulation in the patch of coordinates (i,j). The populations of both hosts (humans) and vectors (Aedes aegypti) were divided into subpopulations representing disease status: SEI for the vectors and SEIRRTD for the human population. Ten different subpopulations were taken into account for the mosquito, three immature subpopulations: eggs E(i,j), larvae L(i,j) and pupae P(i,j), and seven adult subpopulations: non-parous adults A1(i,j), susceptible flyers FS(i,j), exposed flyers FE(i,j), infectious flyers FI(i,j) and parous adults in the three disease statuses: susceptible A2S(i,j), exposed A2E(i,j) and infectious A2I(i,j). The A1(i,j) are always susceptible; after a blood meal they become flyers, susceptible FS(i,j) or exposed FE(i,j), depending on the disease status of the host: if the host is infectious, A1(i,j) becomes an exposed flyer FE(i,j), while if the host is not infectious, A1(i,j) becomes a susceptible flyer FS(i,j). The transmission of the virus depends not only on the contact between vector and host but also on the transmission probability of the virus. In this case, we have two transmission probabilities: the transmission probability from host to vector, ahv, and the transmission probability from vector to host, avh. The human population NH(i,j) was split into seven different subpopulations according to disease status: susceptible humans HS(i,j), exposed humans HE(i,j), infectious humans HI(i,j), humans in remission Hrem(i,j) (written Hrem here to distinguish it from the removed subpopulation), toxic humans HT(i,j), removed humans HR(i,j) and humans dead because of the disease HD(i,j). The evolution of the seventeen subpopulations is driven by events that occur at rates that depend on the subpopulation values; some of them depend also on temperature, which is a function of time since it changes seasonally over the course of the year [8, 9].

ii. Events related to immature stages

Table 4 summarizes the events and rates related to the immature stages of the mosquito during their first gonotrophic cycle. The construction of the transition rates and the choice of the model parameters related to the mosquito biology, namely ME, the mortality of eggs; ELR, the hatching rate; ML, the mortality of larvae; α, the density-dependent mortality of larvae; LPR, the pupation rate; MP, the mortality of pupae; PAR, the pupae-into-adults development coefficient; and EF, the emergence factor, were described in detail previously [8, 9]. The natural regulation of Aedes aegypti populations is due to intra-specific competition for food and other resources in the larval stage. This regulation was incorporated into the model as a density-dependent transition probability, which introduces the nonlinearities that prevent a Malthusian growth of the population. The effect was incorporated as a nonlinear correction to the temperature-dependent larval mortality.
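A minimal container for the seventeen subpopulations of one patch, following the notation just introduced, could look as follows; the class and attribute names are illustrative, not the authors' data structure, and the exact bookkeeping of NH in the original code is not specified here.

```python
from dataclasses import dataclass

@dataclass
class Patch:
    # Immature mosquito stages
    E: int = 0      # eggs
    L: int = 0      # larvae
    P: int = 0      # pupae
    # Adult mosquito stages (SEI)
    A1: int = 0     # non-parous adults (always susceptible)
    FS: int = 0     # susceptible flyers
    FE: int = 0     # exposed flyers
    FI: int = 0     # infectious flyers
    A2S: int = 0    # parous adults, susceptible
    A2E: int = 0    # parous adults, exposed
    A2I: int = 0    # parous adults, infectious
    # Human stages (SEIRRTD)
    HS: int = 0     # susceptible
    HE: int = 0     # exposed
    HI: int = 0     # infectious (viremic)
    Hrem: int = 0   # in remission
    HT: int = 0     # toxic
    HR: int = 0     # removed (recovered)
    HD: int = 0     # dead from the disease

    @property
    def NH(self):
        """Living human population of the patch (assumed definition)."""
        return self.HS + self.HE + self.HI + self.Hrem + self.HT + self.HR
```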
The larval mortality rate can then be written as ML · L(i,j) + α · L(i,j) · (L(i,j) − 1), where α can be further decomposed as α = α0/BS(i,j), with α0 associated with the carrying capacity of one (standardised) breeding site and BS(i,j) the density of breeding sites in the (i,j) patch [8, 9].

iii. Events related to the adult stage

Aedes aegypti females (A1 and A2) require blood to complete their gonotrophic cycles. In this process, the female may ingest viruses with a blood meal taken from an infectious human during the human viremic period VP. The viruses develop within the mosquito during the extrinsic incubation period EIP and are then re-injected, with the saliva of the mosquito, into the blood stream of a new susceptible human in later blood meals. The virus in the exposed human develops during the intrinsic incubation period IIP and then begins to circulate in the blood stream (viremic period), the human becoming infectious. The flow from susceptible to exposed subpopulations (in the vector and in the host) depends not only on the contact between vector and host but also on the transmission probabilities of the virus: from host to vector, ahv, and from vector to host, avh. The events related to the adult stage are shown in Tables 5 to 8. Table 5 summarizes the events and rates related to adults during their first gonotrophic cycle and to oviposition by flyers according to their disease status. Tables 6 and 7 summarize the events and rates related to the adult-2 gonotrophic cycles, to exposed adults 2 and exposed flyers becoming infectious, and to human contagion. Table 8 summarizes the events and rates related to the death of parous adults (adults 2) and flyers.

iv. Events related to flyer dispersal

Experimental results and observational studies show that Aedes aegypti dispersal is driven by the availability of oviposition sites [44-46]. Following these observations, we considered that only the flyers F(i,j) can fly from patch to patch in search of oviposition sites. The implementation of flyer dispersal has been described elsewhere [9].

Table 4: Event type, effects on the populations and transition rates for the developmental model. The coefficients are ME: mortality of eggs; ELR: hatching rate; ML: mortality of larvae; α: density-dependent mortality of larvae; LPR: pupation rate; MP: mortality of pupae; PAR: pupae-into-adults development coefficient; EF: emergence factor. The values of the coefficients are given in subsections vi. and vii.

Event: effect; transition rate
- Egg death: E(i,j) → E(i,j) − 1; ME × E(i,j)
- Egg hatching: E(i,j) → E(i,j) − 1, L(i,j) → L(i,j) + 1; ELR × E(i,j)
- Larval death: L(i,j) → L(i,j) − 1; ML × L(i,j) + α × L(i,j) × (L(i,j) − 1)
- Pupation: L(i,j) → L(i,j) − 1, P(i,j) → P(i,j) + 1; LPR × L(i,j)
- Pupal death: P(i,j) → P(i,j) − 1; (MP + PAR × (1 − EF/2)) × P(i,j)
- Adult emergence: P(i,j) → P(i,j) − 1, A1(i,j) → A1(i,j) + 1; PAR × (EF/2) × P(i,j)
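The rates of Table 4 can be turned into one fixed-step stochastic update in the spirit of the Poisson approximation described later in this appendix. The sketch below covers only the immature-stage events for a single patch (an object with integer attributes E, L, P, A1, such as the Patch sketch above); it assumes the coefficients are supplied already evaluated at the current temperature, and it replaces the truncated-Poisson rule of Eq. (10) by a crude clipping at zero.

```python
import numpy as np

rng = np.random.default_rng(1)

def immature_step(p, c, alpha, dt, rng=rng):
    """One Poisson-approximation step (time step dt, in days) over the
    immature-stage events of Table 4.  `c` holds the temperature-
    dependent coefficients ME, ELR, ML, LPR, MP, PAR, EF."""
    rates = {
        "egg_death":   c["ME"] * p.E,
        "egg_hatch":   c["ELR"] * p.E,
        "larva_death": c["ML"] * p.L + alpha * p.L * (p.L - 1),
        "pupation":    c["LPR"] * p.L,
        "pupa_death":  (c["MP"] + c["PAR"] * (1 - c["EF"] / 2)) * p.P,
        "emergence":   c["PAR"] * (c["EF"] / 2) * p.P,
    }
    n = {ev: rng.poisson(r * dt) for ev, r in rates.items()}
    # Apply the population changes; clipping at zero is a simplification
    # of the truncated distribution of Eq. (10)
    p.E = max(p.E - n["egg_death"] - n["egg_hatch"], 0)
    p.L = max(p.L + n["egg_hatch"] - n["larva_death"] - n["pupation"], 0)
    p.P = max(p.P + n["pupation"] - n["pupa_death"] - n["emergence"], 0)
    p.A1 = p.A1 + n["emergence"]
    return p
```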
Table 5: Event type, effects on the subpopulations and transition rates for the developmental model. The coefficients are MA: mortality of adults; cycle1: gonotrophic cycle coefficient (number of daily cycles) for adult females in stage A1; ahv: transmission probability from host to vector; ovr(i,j): oviposition rate of flyers in the (i,j) patch; EGN: average number of eggs laid in one oviposition. The values of the coefficients are given in Table 1 and in subsections vi., vii., viii. and ix.

Event: effect; transition rate
- Adult 1 death: A1(i,j) → A1(i,j) − 1; MA × A1(i,j)
- First gonotrophic cycle with virus exposure: A1(i,j) → A1(i,j) − 1, FE(i,j) → FE(i,j) + 1; cycle1 × A1(i,j) × (HI(i,j)/NH(i,j)) × ahv
- First gonotrophic cycle without virus exposure: A1(i,j) → A1(i,j) − 1, FS(i,j) → FS(i,j) + 1; cycle1 × A1(i,j) × [(NH(i,j) − HI(i,j))/NH(i,j) + (1 − ahv) × HI(i,j)/NH(i,j)]
- Oviposition of susceptible flyers: E(i,j) → E(i,j) + EGN, FS(i,j) → FS(i,j) − 1, A2S(i,j) → A2S(i,j) + 1; ovr(i,j) × FS(i,j)
- Oviposition of exposed flyers: E(i,j) → E(i,j) + EGN, FE(i,j) → FE(i,j) − 1, A2E(i,j) → A2E(i,j) + 1; ovr(i,j) × FE(i,j)
- Oviposition of infected flyers: E(i,j) → E(i,j) + EGN, FI(i,j) → FI(i,j) − 1, A2I(i,j) → A2I(i,j) + 1; ovr(i,j) × FI(i,j)

The general rate of the dispersal event is given by β × F(i,j), where β is the dispersal coefficient and F(i,j) is the flyer population, which can be susceptible FS(i,j), exposed FE(i,j) or infectious FI(i,j) depending on the disease status. The dispersal coefficient β can be written as

$$\beta = \begin{cases} 0 & \text{if the patches are disjoint,} \\ \mathrm{diff}/d_{ij}^{2} & \text{if the patches have at least a common point,} \end{cases} \qquad (3)$$

where d_ij is the distance between the centres of the patches and diff is a diffusion-like coefficient, so that dispersal is compatible with a diffusion-like process [9].

v. Events related to the human population

Human contagion has already been described in Table 7. Table 9 summarizes the events and rates in which humans are involved. The human population was fluctuating but balanced, meaning that the birth coefficient was considered equal to the mortality coefficient MH.

Table 6: Event type, effects on the subpopulations and transition rates for the developmental model. The coefficients are cycle2: gonotrophic cycle coefficient (number of daily cycles) for adult females in stage A2; ahv: transmission probability from host to vector. The values of the coefficients are given in Table 1 and in subsections vi., vii., viii. and ix.

Event: effect; transition rate
- Second gonotrophic cycle of susceptible adults 2 with virus exposure: A2S(i,j) → A2S(i,j) − 1, FE(i,j) → FE(i,j) + 1; cycle2 × A2S(i,j) × (HI(i,j)/NH(i,j)) × ahv
- Second gonotrophic cycle of susceptible adults 2 without virus exposure: A2S(i,j) → A2S(i,j) − 1, FS(i,j) → FS(i,j) + 1; cycle2 × A2S(i,j) × [(NH(i,j) − HI(i,j))/NH(i,j) + (1 − ahv) × HI(i,j)/NH(i,j)]
- Second gonotrophic cycle of exposed adults 2: A2E(i,j) → A2E(i,j) − 1, FE(i,j) → FE(i,j) + 1; cycle2 × A2E(i,j)
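Referring back to the dispersal rule of Eq. (3), the rate at which flyers leave a patch towards a neighbouring patch can be computed as in the following sketch; the diff value is the one quoted in subsection ix., while the block size in the example and the adjacency flag are illustrative assumptions.

```python
def dispersal_rate(flyers, centre_a, centre_b, adjacent, diff=830.0):
    """Rate beta * F of the dispersal event between two patches, Eq. (3):
    beta = diff / d_ij^2 for patches sharing at least one point, zero
    otherwise.  diff is in m^2/day and distances are in metres."""
    if not adjacent:
        return 0.0
    dx = centre_a[0] - centre_b[0]
    dy = centre_a[1] - centre_b[1]
    d2 = dx * dx + dy * dy
    return (diff / d2) * flyers

# Two adjacent blocks (assumed 100 m on a side) whose centres are 100 m apart
rate = dispersal_rate(flyers=50, centre_a=(0.0, 0.0),
                      centre_b=(100.0, 0.0), adjacent=True)
```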
Table 7: Event type, effects on the subpopulations and transition rates for the developmental model. The coefficients are cycle2: gonotrophic cycle coefficient (number of daily cycles) for adult females in stage A2; ovr(i,j): oviposition rate of flyers in the (i,j) patch; avh: transmission probability from vector to host; EIP: extrinsic incubation period. The values of the coefficients are given in Table 1 and in subsections vi., vii., viii. and ix.

Event: effect; transition rate
- Exposed adults 2 becoming infectious: A2E(i,j) → A2E(i,j) − 1, A2I(i,j) → A2I(i,j) + 1; [1/(EIP − 1/ovr(i,j))] × A2E(i,j)
- Exposed flyers becoming infectious: FE(i,j) → FE(i,j) − 1, FI(i,j) → FI(i,j) + 1; [1/(EIP − 1/ovr(i,j))] × FE(i,j)
- Second gonotrophic cycle of infectious adults 2 with human contagion: A2I(i,j) → A2I(i,j) − 1, FI(i,j) → FI(i,j) + 1, HS(i,j) → HS(i,j) − 1, HE(i,j) → HE(i,j) + 1; cycle2 × A2I(i,j) × (HS(i,j)/NH(i,j)) × avh
- Second gonotrophic cycle of infectious adults 2 without human contagion: A2I(i,j) → A2I(i,j) − 1, FI(i,j) → FI(i,j) + 1; cycle2 × A2I(i,j) × [(NH(i,j) − HS(i,j))/NH(i,j) + (1 − avh) × HS(i,j)/NH(i,j)]

vi. Developmental rate coefficients

The developmental rates that correspond to egg hatching, pupation, adult emergence and the gonotrophic cycles were evaluated using the results of the thermodynamic model developed by Sharpe and DeMichele [47] and simplified by Schoolfield et al. [48]. According to this model, the maturation process is controlled by one enzyme which is active in a given temperature range and is deactivated only at high temperatures. The development is stochastic in nature and is controlled by a Poisson process with rate R_D(T). In general terms, R_D(T) takes the form

$$R_D(T) = R_D(298\,\mathrm{K})\, \frac{(T/298\,\mathrm{K}) \exp\!\left[ \frac{\Delta H_A}{R} \left( \frac{1}{298\,\mathrm{K}} - \frac{1}{T} \right) \right]}{1 + \exp\!\left[ \frac{\Delta H_H}{R} \left( \frac{1}{T_{1/2}} - \frac{1}{T} \right) \right]}, \qquad (4)$$

where T is the absolute temperature, ΔH_A and ΔH_H are thermodynamic enthalpies characteristic of the organism, R is the universal gas constant, and T_{1/2} is the temperature at which half of the enzyme is deactivated because of high temperature.

Table 8: Event type, effects on the subpopulations and transition rates for the developmental model. The coefficient is MA: adult mortality. Its value is given in subsection vii.

Event: effect; transition rate
- Susceptible flyer death: FS(i,j) → FS(i,j) − 1; MA × FS(i,j)
- Exposed flyer death: FE(i,j) → FE(i,j) − 1; MA × FE(i,j)
- Infectious flyer death: FI(i,j) → FI(i,j) − 1; MA × FI(i,j)
- Susceptible adult 2 death: A2S(i,j) → A2S(i,j) − 1; MA × A2S(i,j)
- Exposed adult 2 death: A2E(i,j) → A2E(i,j) − 1; MA × A2E(i,j)
- Infectious adult 2 death: A2I(i,j) → A2I(i,j) − 1; MA × A2I(i,j)
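Equation (4), together with the coefficients listed in Table 10 below, gives the temperature-dependent developmental rates. A direct transcription follows; it is a sketch, with R taken as the gas constant in cal/(mol K) to match the enthalpy units of Table 10.

```python
import math

R_GAS = 1.9872  # universal gas constant in cal / (mol K)

def r_d(T, rd298, dHA, dHH, T_half):
    """Developmental rate of Eq. (4) (Sharpe-DeMichele / Schoolfield
    enzymatic model).  T in kelvin, rd298 in 1/day, enthalpies in
    cal/mol, T_half in kelvin."""
    num = (T / 298.0) * math.exp((dHA / R_GAS) * (1.0 / 298.0 - 1.0 / T))
    den = 1.0 + math.exp((dHH / R_GAS) * (1.0 / T_half - 1.0 / T))
    return rd298 * num / den

# Egg hatching rate ELR at 25 C, using the Table 10 coefficients
elr_25C = r_d(298.15, rd298=0.24, dHA=10798.0, dHH=100000.0, T_half=14184.0)
```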
Table 9: Event type, effects on the subpopulations and transition rates for the developmental model. The coefficients are MH: human mortality coefficient; VP: human viremic period; IIP: intrinsic incubation period; RP: remission period; TP: toxic period; RAR: recovery-after-remission probability; MT: mortality probability for toxic patients. The values of the coefficients are given in Table 1.

Event: effect; transition rate
- Birth of susceptible humans: HS(i,j) → HS(i,j) + 1; MH × NH(i,j)
- Death of susceptible humans: HS(i,j) → HS(i,j) − 1; MH × HS(i,j)
- Death of exposed humans: HE(i,j) → HE(i,j) − 1; MH × HE(i,j)
- Transition from exposed to viraemic: HE(i,j) → HE(i,j) − 1, HI(i,j) → HI(i,j) + 1; (1/IIP) × HE(i,j)
- Death of infectious humans: HI(i,j) → HI(i,j) − 1; MH × HI(i,j)
- Transition from infectious humans to humans in remission: HI(i,j) → HI(i,j) − 1, Hrem(i,j) → Hrem(i,j) + 1; (1/VP) × HI(i,j)
- Death of humans in remission: Hrem(i,j) → Hrem(i,j) − 1; MH × Hrem(i,j)
- Transition from humans in remission to toxic humans: Hrem(i,j) → Hrem(i,j) − 1, HT(i,j) → HT(i,j) + 1; [(1 − RAR)/RP] × Hrem(i,j)
- Recovery of humans in remission: Hrem(i,j) → Hrem(i,j) − 1, HR(i,j) → HR(i,j) + 1; (RAR/RP) × Hrem(i,j)
- Death of removed humans: HR(i,j) → HR(i,j) − 1; MH × HR(i,j)
- Death of toxic humans: HT(i,j) → HT(i,j) − 1, HD(i,j) → HD(i,j) + 1; (MT/TP) × HT(i,j)
- Recovery of toxic humans: HT(i,j) → HT(i,j) − 1, HR(i,j) → HR(i,j) + 1; [(1 − MT)/TP] × HT(i,j)

Table 10 presents the values of the different coefficients involved in the events: egg hatching, pupation, adult emergence and gonotrophic cycles. The values are taken from Ref. [30] and are discussed in Ref. [8].

Table 10: Coefficients for the enzymatic model of maturation [Eq. (4)]. R_D is measured in day⁻¹, enthalpies are measured in cal/mol and the temperature T is measured in kelvin.

Development cycle [Eq. (4)]   R_D(T)   R_D(298 K)   ΔH_A     ΔH_H       T_1/2
Egg hatching                  ELR      0.24         10798    100000     14184
Larval development            LPR      0.2088       26018    55990      304.6
Pupal development             PAR      0.384        14931    −472379    148
Gonotrophic cycle (A1)        cycle1   0.216        15725    1756481    447.2
Gonotrophic cycle (A2)        cycle2   0.372        15725    1756481    447.2

vii. Mortality coefficients

Egg mortality. The mortality coefficient of eggs is ME = 0.01 day⁻¹, independent of temperature in the range 278 K ≤ T ≤ 303 K [49].

Larval mortality. The value of α0 (associated with the carrying capacity of a single breeding site) is α0 = 1.5, assigned by fitting the model to observed numbers of immatures in the cemeteries of Buenos Aires [8]. The temperature-dependent larval death coefficient is approximated by ML = 0.01 + 0.9725 exp(−(T − 278)/2.7035) and is valid in the range 278 K ≤ T ≤ 303 K [50-52].

Pupal mortality. The intrinsic mortality of a pupa has been considered as MP = 0.01 + 0.9725 exp(−(T − 278)/2.7035) [50-52]. Besides the daily mortality in the pupal stage, there is an additional mortality contribution associated with the emergence of the adults. We considered a mortality of 17% of the pupae at this event, which is added to the mortality rate of pupae; hence, the emergence factor is EF = 0.83 [53].

Adult mortality. The adult mortality coefficient is MA = 0.09 day⁻¹ and is considered independent of temperature in the range 278 K ≤ T ≤ 303 K [2, 50, 54].

viii. Fecundity and oviposition coefficient

Females lay a number of eggs that is roughly proportional to their body weight (46.5 eggs/mg) [55, 56].
Considering that the mean weight of a three-day-old female is 1.35 mg [2], we estimated the average number of eggs laid in one oviposition as EGN = 63. The oviposition coefficient ovr(i,j) depends on the breeding site density BS(i,j) and is defined as

$$ovr(i,j) = \begin{cases} \theta / t_{dep} & \text{if } BS(i,j) \le 150, \\ 1 / t_{dep} & \text{if } BS(i,j) > 150, \end{cases} \qquad (5)$$

where θ = BS(i,j)/150 is a linear function of the density of breeding sites [9].

ix. Dispersal coefficient

We chose a diffusion-like coefficient diff = 830 m²/day, which corresponds to a short dispersal of approximately 30 m in one day on average, in agreement with the short-dispersal experiments and field studies analyzed in detail in our previous article [9].

x. Mathematical description of the stochastic model

The evolution of the subpopulations is modeled by a state-dependent Poisson process [41, 57], in which the probability of the state

(E, L, P, A1, A2S, A2E, A2I, FS, FE, FI, HS, HE, HI, Hrem, HT, HR, HD)(i,j)

evolves in time following a Kolmogorov forward equation that can be constructed directly from the information collected in Tables 4 to 9 and in Eq. (3).

xi. Deterministic rates approximation for the density-dependent Markov process

Let X be an integer vector having as entries the populations under consideration, and e_α, α = 1 ... κ, the events at which the populations change by a fixed amount Δ_α in a Poisson process with density-dependent rates. Then, a theorem by Kurtz [57] allows us to rewrite the stochastic process as

$$X(t) = X(0) + \sum_{\alpha=1}^{\kappa} \Delta_\alpha\, Y\!\left( \int_0^t \omega_\alpha(X(s))\, ds \right), \qquad (6)$$

where ω_α(X(s)) is the transition rate associated with the event α and Y(x) is a random Poisson process of rate x. The deterministic rates approximation to the stochastic process represented by Eq. (6) consists of introducing a deterministic approximation for the arguments of the Poisson variables Y(x) in Eq. (6) [34, 58]. The reason for such a proposal is that the transition rates change at a slower rate than the populations. The number of events of each kind is then approximated by independent Poisson processes with deterministic arguments satisfying a differential equation. The probability of n_α events of type α having occurred after a time dt is approximated by a Poisson distribution with parameter λ_α. Hence, the probability of the population taking the value

$$X = X_0 + \sum_{\alpha=1}^{\kappa} \Delta_\alpha n_\alpha \qquad (7)$$

a time interval dt after being in the state X_0 is approximated by a product of independent Poisson distributions of the form

$$\mathrm{Probability}(n_1 \ldots n_\kappa, dt \,|\, X_0) = \prod_{\alpha=1}^{\kappa} P_\alpha(\lambda_\alpha), \qquad (8)$$

with

$$P_\alpha(\lambda_\alpha) = \exp(-\lambda_\alpha)\, \frac{\lambda_\alpha^{n_\alpha}}{n_\alpha!} \qquad (9)$$

whenever X = X_0 + Σ_α Δ_α n_α has no negative entries, and

$$P_\alpha(\lambda_\alpha) = \exp(-\lambda_\alpha) \sum_{i=n_\alpha}^{\infty} \frac{\lambda_\alpha^i}{i!} = 1 - \exp(-\lambda_\alpha) \sum_{i=0}^{n_\alpha - 1} \frac{\lambda_\alpha^i}{i!} \qquad (10)$$

if {n_α} makes a component of X zero (see Ref. [34]). Finally,

$$\frac{d\lambda_\alpha}{dt} = \langle \omega_\alpha(X) \rangle, \qquad (11)$$

where the averages are taken self-consistently with the proposed distribution (λ_α(0) = 0). The use of the Poisson approximation represents a substantial saving of computer time compared to direct (Monte Carlo) implementations of the stochastic process.

References

[1] Yellow fever, PAHO Technical Report 2-7, 11, Pan American Health Organization (2008).

[2] R Christophers, Aedes aegypti (L.), the yellow fever mosquito, Cambridge Univ. Press, Cambridge (1960).

[3] H R Carter, Yellow fever: An epidemiological and historical study of its place of origin, The Williams & Wilkins Company, Baltimore (1931).
[4] J Penna, Estudio sobre las epidemias de fiebre amarilla en el Río de la Plata, Anales del Departamento Nacional de Higiene 1, 430 (1895).

[5] W B Arden, Urban yellow fever diffusion patterns and the role of microenvironmental factors in disease dissemination: A temporal-spatial analysis of the Memphis epidemic of 1878, PhD thesis, Department of Geography and Anthropology (2005).

[6] I Acevedo, Estadística de la mortalidad ocasionada por la epidemia de fiebre amarilla durante los meses de enero, febrero, marzo, abril, mayo y junio de 1871, Imprenta del Siglo (y de la Verdad), Buenos Aires (1873). (Apparently built from police sources; the author's name is not printed in the volume, but is kept in the records of the library.)

[7] D de la Fuente, Primer censo de la República Argentina. Verificado en los días 15, 16 y 17 de setiembre de 1869, Imprenta del Porvenir, Buenos Aires, Argentina (1872).

[8] M Otero, H G Solari, N Schweigmann, A stochastic population dynamic model for Aedes aegypti: Formulation and application to a city with temperate climate, Bull. Math. Biol. 68, 1945 (2006).

[9] M Otero, N Schweigmann, H G Solari, A stochastic spatial dynamical model for Aedes aegypti, Bull. Math. Biol. 70, 1297 (2008).

[10] M Otero, H G Solari, Mathematical model of dengue disease transmission by Aedes aegypti mosquito, Math. Biosci. 223, 32 (2010).

[11] C J Finlay, Mosquitoes considered as transmitters of yellow fever and malaria, Psyche 8, 379 (1899). Available from the Philip S. Hench Walter Reed Yellow Fever Collection.

[12] A Agramonte, An account of Dr. Louis Daniel Beauperthuy: A pioneer in yellow fever research, Boston Med. Surg. J. CLVIII, 928 (1908). Extracts available at the Philip S. Hench Walter Reed Yellow Fever Collection. Agramonte translates paragraphs from Beauperthuy's communications; the first and second are dated 1854 (Spanish) and 1856 (French).

[13] J J TePaske, Beauperthuy: De Cumaná a la Academia de Ciencias de París by Walewska, Isis 81, 372 (1990).

[14] F L Soper, The 1964 status of Aedes aegypti eradication and yellow fever in the Americas, Am. J. Trop. Med. Hyg. 14, 887 (1965).

[15] Campaña de erradicación del Aedes aegypti en la República Argentina. Informe final, Technical report, Ministerio de Asistencia Social y Salud Pública, Buenos Aires, Argentina (1964).

[16] Yellow fever, World Health Organization, Geneva (2008). http://www.who.int/topics/yellow_fever/en/.

[17] Control de la fiebre amarilla, Technical Report 603, Organización Panamericana de la Salud, Washington DC (2005).

[18] R Durrett, Essentials of stochastic processes, Springer Verlag, New York (2001).

[19] L Ruiz Moreno, La peste histórica de 1871. Fiebre amarilla en Buenos Aires y Corrientes, Nueva Impresora, Paraná, Argentina (1949).

[20] M A Scenna, Cuando murió Buenos Aires 1871, Ediciones La Bastilla, Editorial Astrea de Rodolfo Depalma Hnos., Buenos Aires (1974).

[21] F Latzina, M Chueco, A Martínez, N Pérez, Censo general de población, edificación, comercio e industrias de la ciudad de Buenos Aires 1887, Municipalidad de Buenos Aires, Buenos Aires, Argentina (1889).

[22] Estado sanitario de Buenos Aires, Revista Médico Quirúrgica 7, 296 (1871).

[23] D B Cooper, K F Kiple, Yellow fever, In: The Cambridge world history of human disease, Ed. K F Kiple, Chap. VIII, Cambridge Univ. Press, New York (1993).

[24] Estado sanitario de Buenos Aires, Revista Médico Quirúrgica 7, 281 (1870).
[25] S A Mitchell, Map of Brazil, Bolivia, Paraguay, and Uruguay. (With) Map of Chili. (With) Harbor of Bahia. (With) Harbor of Rio Janeiro. (With) Island of Juan Fernández, R A Campbell and S A Mitchell Jr., Chicago and Philadelphia (1870). David Rumsey Map Collection, http://www.davidrumsey.com/luna/servlet/detail/rumsey 8 1 30422 1140461: map-of-brazil,-bolivia,-paraguay,-a.

[26] E J Herz, Historia del agua en Buenos Aires, Municipalidad de la Ciudad de Buenos Aires, Buenos Aires, Argentina (1979). Colección Cuadernos de Buenos Aires.

[27] B A Gould, Anales de la Oficina Meteorológica Argentina. Tomo I. Clima de Buenos Aires, Pablo E Coni, Buenos Aires, Argentina (1878).

[28] A Király, I M Jánosi, Stochastic modelling of daily temperature fluctuations, Phys. Rev. E 65, 051102 (2002).

[29] S S Liu, G M Zhang, J Zhu, Influence of temperature variations on rate of development in insects: Analysis of case studies from entomological literature, Ann. Entomol. Soc. Am. 88, 107 (1995).

[30] D A Focks, D C Haile, E Daniels, G A Mount, Dynamic life table model for Aedes aegypti: Analysis of the literature and model development, J. Med. Entomol. 30, 1003 (1993).

[31] L E Gilpin, G A H McClelland, Systems analysis of the yellow fever mosquito Aedes aegypti, Forts. Zool. 25, 355 (1979).

[32] E M M Jones, D C Wilson, Clinical features of yellow fever cases at Vom Christian Hospital during the 1969 epidemic on the Jos Plateau, Nigeria, B. World Health Organ. 46, 653 (1972).

[33] C Sérié, A Lindrec, A Poirier, L Andral, P Neri, Études sur la fièvre jaune en Éthiopie, B. World Health Organ. 38, 835 (1968).

[34] H G Solari, M A Natiello, Stochastic population dynamics: The Poisson approximation, Phys. Rev. E 67, 031918 (2003).

[35] L Barela, J Villagrán Padilla, Notas sobre la epidemia de fiebre amarilla, Revista Histórica 7, 125 (1980).

[36] F L Romay, Historia de la Policía Federal Argentina. Tomo V, 1868-1880, Biblioteca Policial, Buenos Aires, Argentina (1966).

[37] Plano topográfico de Buenos Aires, 1887, Oliveira y Cía, Instituto Histórico Ciudad de Buenos Aires, http://www.acceder.gov.ar/es/td:planos_y_mapas.4/1403392.

[38] Kratzenstein, Gran mapa mercantil de la ciudad de Buenos Ayres, Lit. Rod. Kratzenstein y Cía., Buenos Aires (1880). Mapoteca del Museo Mitre No 00548.

[39] W R Solveyra, Plano de la ciudad de Buenos Ayres con la división civil de los 12 juzgados de paz, Lit. J. Pelvilain, Buenos Aires (1863). Mapoteca del Museo Mitre No 00583.

[40] Vidiella, Plano comercial y estadístico de la ciudad de Buenos Aires, 2a edición, Imprenta de la Revista de Ramón Vidiella, Buenos Aires (1862). Mapoteca del Museo Mitre No 00611.

[41] H Andersson, T Britton, Stochastic epidemic models and their statistical analysis, Lecture Notes in Statistics, Vol. 151, Springer-Verlag, Berlin (2000).

[42] R García, Sistemas complejos. Conceptos, método y fundamentación epistemológica de la investigación interdisciplinaria, Gedisa, Barcelona, Spain (2006).

[43] D H Barmak, C O Dorso, M Otero, H G Solari, Dengue epidemics and human mobility, Phys. Rev. E 84, 011901 (2011).

[44] M Wolfinsohn, R Galun, A method for determining the flight range of Aedes aegypti (Linn.), Bull. Res. Council of Israel 2, 433 (1953).

[45] P Reiter, M A Amador, R A Anderson, G G Clark, Short report: Dispersal of Aedes aegypti in an urban area after blood feeding as demonstrated by rubidium-marked eggs, Am. J. Trop. Med. Hyg. 52, 177 (1995).
[46] J D Edman, T W Scott, A Costero, A C Morrison, L C Harrington, G G Clark, Aedes aegypti (Diptera: Culicidae) movement influenced by availability of oviposition sites, J. Med. Entomol. 35, 578 (1998).

[47] P J H Sharpe, D W DeMichele, Reaction kinetics of poikilotherm development, J. Theor. Biol. 64, 649 (1977).

[48] R M Schoolfield, P J H Sharpe, C E Magnuson, Non-linear regression of biological temperature-dependent rate models based on absolute reaction-rate theory, J. Theor. Biol. 88, 719 (1981).

[49] M Trpis, Dry season survival of Aedes aegypti eggs in various breeding sites in the Dar es Salaam area, Tanzania, B. World Health Organ. 47, 433 (1972).

[50] W R Horsfall, Mosquitoes: Their bionomics and relation to disease, Ronald, New York, USA (1955).

[51] M Bar-Zeev, The effect of temperature on the growth rate and survival of the immature stages of Aedes aegypti, Bull. Entomol. Res. 49, 157 (1958).

[52] L M Rueda, K J Patel, R C Axtell, R E Stinner, Temperature-dependent development and survival rates of Culex quinquefasciatus and Aedes aegypti (Diptera: Culicidae), J. Med. Entomol. 27, 892 (1990).

[53] T R E Southwood, G Murdie, M Yasuno, R J Tonn, P M Reader, Studies on the life budget of Aedes aegypti in Wat Samphaya, Bangkok, Thailand, B. World Health Organ. 46, 211 (1972).

[54] R W Fay, The biology and bionomics of Aedes aegypti in the laboratory, Mosq. News 24, 300 (1964).

[55] M Bar-Zeev, The effect of density on the larvae of a mosquito and its influence on fecundity, Bull. Res. Council Israel 6B, 220 (1957).

[56] J K Nayar, D M Sauerman, The effects of nutrition on survival and fecundity in Florida mosquitoes. Part 3. Utilization of blood and sugar for fecundity, J. Med. Entomol. 12, 220 (1975).

[57] S N Ethier, T G Kurtz, Markov processes, John Wiley and Sons, New York (1986).

[58] J P Aparicio, H G Solari, Population dynamics: A Poissonian approximation and its relation to the Langevin process, Phys. Rev. Lett. 86, 4183 (2001).

Papers in Physics, vol. 4, art. 040001 (2012)
Received: 7 July 2011, Accepted: 1 February 2012
Edited by: A. Goñi
Reviewed by: J. Chavez Boggio, Leibniz Institut für Astrophysik Potsdam, Germany
Licence: Creative Commons Attribution 3.0
DOI: http://dx.doi.org/10.4279/pip.040001
www.papersinphysics.org
ISSN 1852-4249

High-speed tunable photonic crystal fiber-based femtosecond soliton source without dispersion pre-compensation

Martín Caldarola,1∗ Víctor A. Bettachini,2 Andrés A. Rieznik,2 Pablo G. König,2 Martín E. Masip,1 Diego F. Grosz,2,3 Andrea V. Bragas1,4

We present a high-speed, wavelength-tunable, photonic crystal fiber-based source capable of generating tunable femtosecond solitons in the infrared region. Through measurements and numerical simulation, we show that both the pulsewidth and the spectral width of the output pulses remain nearly constant over the entire tuning range from 860 to 1160 nm. This remarkable behavior is observed even when pump pulses are heavily chirped (7400 fs²), which allows the bulky compensation optics, or the use of another fiber for dispersion compensation, usually required by the tuning device, to be avoided.

I. Introduction

Light sources based on the propagation of solitons in optical fibers have emerged as a compact solution to the need for a benchtop source of ultra-short tunable pulses [1-3].
The soliton formation from femtosecond pulses launched into an optical fiber is explained in terms of the interplay between self-phase modulation (SPM) and group-velocity dispersion (GVD) in the anomalous dispersion regime [4]. The wavelength tunability is a consequence of the Raman-induced frequency shift (RIFS) produced on the pulse when traveling through the fiber [5]. The term soliton self-frequency shift (SSFS) [6] was coined to name this effect, which is widely used to produce tunable femtosecond pulses in different wavelength ranges, e.g., from 850 to 1050 nm [7], from 1050 to 1690 nm [8], and from 1566 to 1775 nm [1]. In most cases, photonic crystal fibers (PCF) are used to build these sources, since their GVD can be easily tailored to produce solitons in a desired tuning range [9, 10]. For a given choice of the PCF, full experimental characterization of the pump and output pulses, complemented with theoretical predictions, is necessary to understand how nonlinear effects modify the output soliton.

∗ E-mail: caldarola@df.uba.ar
1 Laboratorio de Electrónica Cuántica, Departamento de Física, Universidad de Buenos Aires, Pabellón I, Ciudad Universitaria, (C1428EHA) Buenos Aires, Argentina.
2 Instituto Tecnológico de Buenos Aires, Eduardo Madero 399, (C1106ACD) Buenos Aires, Argentina.
3 Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina.
4 IFIBA, Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina.

The wavelength tunability in a PCF-based light source is provided by the modulation of the pump power injected into the fiber [11-14]. It is worth noting that the wavelength choice of the output pulse is made without moving any mechanical part, which is clearly attractive for all the proposed and imaginable applications of these soliton sources. Moreover, the wavelength of the output pulse can be changed as fast as one can modulate the power of the pump pulse, as introduced in Refs. [15, 16]. By introducing an acousto-optic modulator (AOM) in the path of the pump pulse, the output wavelength can be changed at a speed which is ultimately limited only by the laser repetition rate. This kind of experimental setup has been presented in some previous reports [14, 17], with stunning applications such as the one presented in Ref. [18], where a pseudo-CW wideband source for optical coherence tomography is introduced. However, the need to pre-compress the pump pulse, to avoid the chirp produced by the AOM, works against the compact and mechanically robust design of the light source.

In this paper, we demonstrate that the PCF-based source presented here is robust against chirped pump pulses. We present a complete set of measurements showing that the temporal and spectral characteristics of the solitons generated in the PCF remain unaltered even when pump pulses are heavily chirped, up to ∼ 7400 fs². Results are presented for the whole tunability range (860 nm to 1160 nm). We also present numerical simulations which fit the experimental data remarkably well and help to understand the soliton behavior.

This paper is organized as follows: in Section II, we describe the experimental setup. The numerical simulations are described in Section III. In Section IV, we present experimental and numerical results, and in Section V we further analyze the results with numerical simulations. Finally, in Section VI, we present our conclusions.

II. Experimental setup

A scheme of the experimental setup is shown in Fig. 1.
A Ti:Sa laser (KMLabs) generates ultrashort transform-limited (TL) pulses of ∆t = 31 fs (FWHM, sech²), λpump = 830 nm, with a spectral width ∆λ = 23 nm and a repetition rate of 94 MHz. The AOM not only allows high-speed (up to MHz) and accurate control of the soliton wavelength, as previously discussed, but also prevents feedback into the Ti:Sa, replacing the optical isolator required in similar setups [19]. As the AOM introduces ∼ 56 mm of SF8 glass path, pump pulses gain a positive chirp of about ∼ 7400 fs², which leads to a time spread in them by a factor of ∼ 3. This can be pre-compensated, for example, by introducing an optical fiber in the anomalous dispersion regime [8, 20] or a prism compressor in the well-known configuration presented in Ref. [21].

Figure 1: Experimental setup. (a) Titanium-sapphire (Ti:Sa) laser, (b) prism compressor, (c) acousto-optic modulator (AOM), (d) coupling lens, (e) photonic crystal fiber (PCF), (f) collimator objective, (g) spatial filter, (h) flipper mirror, (i) fast-scan interferometric autocorrelator, (j) optical spectrum analyzer (OSA).

In this work, the chirp was compensated by a pair of SF18 prisms with an apex separation of 78 cm. Additionally, the prism compressor allowed us to up-chirp pump pulses in a controlled fashion, from TL to ∼ 1400 fs², by introducing an extra glass path at the second prism of the arrangement [22]. This full or partial compensation of the phase distortion introduced by the AOM allowed us to study the role of different chirp figures in the temporal and spectral characteristics of the solitons generated in the PCF.

Pump pulses were coupled into a non-polarization-maintaining microstructured fiber commercially used for supercontinuum generation (Thorlabs NL-2.3-790-02). Its main parameters are listed in Table 1, and the dispersion curve and SEM image are shown in Fig. 2 (datasheet available at http://www.thorlabs.com).

Figure 2: Dispersion curve of the PCF, showing the zero-dispersion wavelength (ZDW) at 790 nm. The inset is the scanning electron microscope image of the PCF core. The curve and image were provided by the manufacturer.

Table 1: PCF parameters relevant to the simulation. Further details can be found in Ref. [14].
L: 75 cm
ZDW: 790 nm
β2: −12.4 ps² km⁻¹
β3: 0.07 ps³ km⁻¹
γ(ω): γ0(ω0) + (ω − ω0) γ1
γ0(ω0): 78 W⁻¹ km⁻¹
γ1: γ0/ω0
ω0: 2271 THz

Upon propagation down the fiber, the spectrum is highly broadened, so a spatial band-pass filter made of a prism and razor blades, similar to the one presented in Ref. [23], allowed us to filter the spectral region of the solitonic branch (see Fig. 3) without adding any extra chirp to the solitons. Once the spectral selection was achieved, a flipper mirror directed the filtered beam for analysis either by the optical spectrum analyzer (OSA) or by the interferometric autocorrelator. A fast-scan system [24] allows fast interferometric auto-correlations to be performed. Briefly, a platform with a hollow retroreflector is moved sinusoidally back and forth, with a stepper motor at 11 Hz, to produce an optical delay in one of the arms of a Michelson interferometer. The autocorrelation signal is recorded by a PMT and averaged with an oscilloscope.
iii. numerical simulations

in order to further validate the experimental results, we simulated the propagation of femtosecond pulses in the pcf by numerically solving the generalized nonlinear schrödinger equation (gnlse), including dispersive, kerr, instantaneous and delayed raman response, and self-steepening effects [25], with a conservation quantity error (cqe) adaptive step-size algorithm [26]. the gnlse reads

$$\frac{\partial A}{\partial z} + \beta_1\frac{\partial A}{\partial t} + \frac{i\beta_2}{2}\frac{\partial^2 A}{\partial t^2} - \frac{\beta_3}{6}\frac{\partial^3 A}{\partial t^3} + \ldots = i\gamma(\omega)\left(1 + \frac{i}{\omega_0}\frac{\partial}{\partial t}\right)\left(A(z,t)\int_{-\infty}^{\infty} R(t')\,|A(z,t-t')|^2\,dt'\right), \qquad (1)$$

with

$$R(t) = (1-f_r)\,\delta(t) + f_r\,h_r(t),$$
$$h_r(t) = (f_a+f_c)\,h_a(t) + f_b\,h_b(t),$$
$$h_a(t) = \tau_1\left(\tau_1^{-2}+\tau_2^{-2}\right) e^{-t/\tau_2}\sin(t/\tau_1),$$
$$h_b(t) = \left[(2\tau_b - t)/\tau_b^2\right] e^{-t/\tau_b},$$

where a(z,t) is the complex envelope of the electric field, βn are the expansion terms of the propagation constant around the carrier frequency ω0, and γ is the nonlinear coefficient. f_r represents the fractional contribution of the delayed raman response h_r(t). note that eq. (1) adopts a more accurate description of this effect than the one usually used [4]. in our simulation, we adopted τ1 = 12.2 fs, τ2 = 32 fs, τb = 96 fs, fa = 0.75, fb = 0.21, fc = 0.04, and fr = 0.24 [27]. the dependence of the fiber nonlinear parameter γ on frequency was modeled as a linear function (see table 1).

iv. results

i. transform-limited pump pulses

first, we present the full characterization of the soliton source seeded by tl pump pulses, in a wavelength range extended with respect to the results presented in our previous paper [14]. in order to investigate the dependence of the output spectrum on the coupled power, managed by the aom, we initially skipped the spectral filtering. figure 3 shows the measured spectrum at the pcf output as a function of the coupled power. the infrared solitonic branch appears at ∼ 10 mw and undergoes a red-shift with increasing power. the maximum wavelength attained is 1130 nm at 55 mw. the spectra in fig. 3 also show that some of the input energy is converted to visible non-solitonic radiation.

figure 3: experimental spectra vs coupled power to the pcf with transform-limited (tl) pump pulses. the color map shows spectral intensity. the maximum achieved soliton shift, λs ≃ 1130 nm, was reached at 55 mw.

the pulsewidth of the filtered soliton as a function of its wavelength, λs, is shown in fig. 4. the pulsewidth remains constant at ∼ 45 fs over the entire tunability range. numerical simulations are also plotted in the same figure, showing excellent agreement with the experimental measurements.

figure 4: experimental pulsewidth of the soliton as a function of its wavelength, pumping the pcf with tl pulses. the results for the three lower wavelengths were already presented in ref. [14]. full line: numerical simulations.

ii. chirped pump pulses

the effect of the pump-pulse chirp on the soliton was studied systematically by introducing a known amount of extra glass path at the second prism of the compressor. this scheme allowed us to change the gvd of the pump pulses from 0 to 1400 fs². further chirping was achieved by the complete removal of the prism compressor, leading to a total positive chirp of ∼ 7400 fs².
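as a small consistency check of the raman response entering eq. (1) of section iii (a sketch added for illustration, not taken from the authors' code), one can verify numerically that h_r(t), built from the quoted τ1, τ2, τb and the fractions fa, fb and fc, integrates to unity, so that fr alone sets the weight of the delayed response in r(t).

```python
# numerical sanity check of the delayed raman response defined after eq. (1),
# using the parameter set quoted in section iii.
import numpy as np

tau1, tau2, taub = 12.2e-15, 32e-15, 96e-15    # s
fa, fb, fc, fr = 0.75, 0.21, 0.04, 0.24

t = np.linspace(0.0, 2e-12, 200001)            # 0 to 2 ps time grid

ha = tau1 * (tau1**-2 + tau2**-2) * np.exp(-t / tau2) * np.sin(t / tau1)
hb = (2 * taub - t) / taub**2 * np.exp(-t / taub)
hr = (fa + fc) * ha + fb * hb

# h_a and h_b each integrate to one, so h_r should integrate to fa+fc+fb = 1
dt = t[1] - t[0]
print("integral of h_r(t) dt =", round(float(np.sum(hr) * dt), 4))
```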
figure 5: soliton temporal (a) and spectral (b) width vs chirp of the input pump pulses. full line: numerical results. the soliton wavelength is λs = 1075 nm.

figure 5 (a) shows the pulsewidth of solitons with wavelength λs = 1075 nm upon variation of the pump-pulse chirp. even for a ∼ 7400 fs² chirp, the soliton output pulsewidth remained around 45 fs. numerical simulations show very good agreement with these observations, as they predict a nearly constant pulsewidth regardless of the input chirp (full line in fig. 5). measurements and numerical simulations in the spectral domain also indicate that the bandwidth of the output solitons is almost unaffected by the pump-pulse chirp [see fig. 5 (b)]. the product ∆t∆ν was found to be near 0.315, as expected for transform-limited sech² pulses. the effect of this heavy chirping was evident in the auto-correlation traces of the pump pulses, as can be seen by comparing figs. 6 (a) and (c). however, there is no clear difference between the traces of the output solitons for the tl (b) and the highly chirped (∼ 7400 fs²) case (d).

figure 6: interferometric auto-correlation traces of tl (a) and heavily chirped, ∼ 7400 fs², ti:sa pump pulses (c). interferometric auto-correlation traces of the output solitons are similar in both cases, unchirped (b) and heavily chirped (d). the soliton wavelength is λs ≃ 1075 nm.

a color map of the spectra as a function of the coupled power, for a highly chirped pump pulse (∼ 7400 fs²), is shown in fig. 7. as in the tl case, we observe that a solitonic branch is red-shifted by increasing the coupled power. however, in this case, 80 mw of coupled power is required to produce a 1160 nm soliton, which represents an increase of about 45% with respect to the tl case.

figure 7: spectra vs coupled power to the pcf with highly chirped (∼ 7400 fs²) input pulses.

a comparison of the soliton red-shift between the tl and chirped pump-pulse cases is presented in fig. 8. the figure shows that more power is always required to attain the same shift when pump pulses are heavily chirped.

figure 8: soliton wavelength shift for chirped (full squares) and tl pump pulses (empty squares) vs pump-pulse power. dashed and full lines correspond to numerical results for tl and highly chirped pump pulses, respectively.

figure 9 (a) shows the soliton pulsewidth as a function of its wavelength, λs, when the pump pulse is heavily chirped (∼ 7400 fs²). we observe an approximately constant output pulsewidth (∼ 45 fs) over the entire tuning range. furthermore, the ∆t∆ν product, shown in fig. 9 (b), indicates that the generated pulses can be identified as fundamental solitons (sech²-like), as in the case of tl pump pulses [14]. numerical simulations were also performed for this case (full lines in fig. 9), showing excellent agreement with the experimental results.

v. discussion

i. fiber soliton self-frequency shift effective length

in order to further analyze soliton formation, we studied the pulse evolution along the fiber by performing numerical simulations. the spectral evolution along the fiber, for a given coupled power, is shown for the tl and chirped cases in figs. 10 (a) and (b), respectively.
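for orientation, the sketch below estimates the ∆t∆ν product and the peak power implied by the fundamental-soliton condition n = 1 (eq. (2) of section v). it is an added illustration: the spectral width is an assumed example value rather than a number quoted in the text, and the table 1 parameters strictly refer to the pump carrier frequency, so the power figure is order-of-magnitude only.

```python
# added illustration: time-bandwidth product of a filtered soliton and the
# peak power implied by the fundamental-soliton condition n = 1 (eq. (2)).
c = 299792458.0          # m/s

lam_s = 1075e-9          # m, soliton wavelength
dt = 45e-15              # s, measured soliton pulsewidth (fwhm)
dlam = 25e-9             # m, assumed spectral width (example value only)
dnu = c * dlam / lam_s**2
print("dt*dnu =", round(dt * dnu, 3))   # ~0.3, close to 0.315 for sech^2

beta2 = 12.4e-27         # s^2/m, |beta2| from table 1
gamma = 78e-3            # 1/(W m), gamma0 from table 1
t0 = dt / 1.763          # sech pulse: t0 = fwhm / (2 ln(1 + sqrt(2)))
p0 = beta2 / (gamma * t0**2)
print("fundamental-soliton peak power p0 ~", round(p0), "W")
```

the detailed pulse evolution analyzed in the next paragraphs comes from the full gnlse simulations of section iii.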
these simulations show that, in the case of chirped pump pulses (∼ 7400 fs²), the spectral broadening and the soliton formation take place farther down the fiber (see fig. 10) than in the tl case. the delay in the formation of the soliton can be explained by an interplay of opposite chirping effects: the positive chirp acquired by traversing the aom is compensated as the pulse advances into the pcf, in anomalous propagation, leading to pulse compression. the pcf itself thus provides pulse compression in the first stretch of the fiber, prior to the branching-off of a soliton. therefore, the ssfs effective length, i.e., the fiber path over which nonlinearity broadens the spectrum, is longer in the tl case. if the chirp were overcompensated and a negatively chirped pulse fed into the fiber, such pulses would also be compressed within the first stretch of the fiber, in this case due to spm [28], leading to the same behavior as in the positively chirped case and resulting in a narrower tunability range. once the soliton is formed and its peak power is high enough, intrapulse raman scattering red-shifts the soliton as it propagates through the remainder of the fiber. this spectral shift increases with both fiber length and soliton peak power [4]. the fact that the soliton is formed at different lengths therefore explains the different red shifts observed for the same coupled power. however, as a larger wavelength shift can be achieved with a higher input power, the shortening of the effective length in the chirped case could be compensated by coupling more power into the pcf [1]. another possibility for compensating this effect on the ssfs is to use a longer pcf.

figure 10: simulated spectral evolution along the fiber length. pump pulses with identical peak power produce more soliton shifting with unchirped (a) than with heavily chirped (∼ 7400 fs²) (b) pump pulses.

ii. fiber power conversion efficiency

figure 11 shows a simulation where the same wavelength shift obtained for tl pump pulses is achieved by increasing the coupled power in the case of the shorter effective length (7400 fs² chirp). fission into more than one soliton branch is visible in this case, as compared to the case of tl pump pulses, for which only a single soliton branch appears (fig. 10). each soliton branch carries a fundamental soliton (n = 1) with a peak power p0 given by [4]

$$N^2 = 1 = \frac{\gamma P_0 T_0^2}{|\beta_2|}, \qquad (2)$$

as γ, β2 and t0 (see figs. 4 and 5) are the same in both the tl and chirped cases, the peak power
indeed, the soliton-pump power ratio is 0.2 in the chirped case and 0.44 in the tl case. this result reveals that the use of the pcf as a compressor decreases its power conversion efficiency. on the other hand, it is possible to achieve the same soliton shift as in the tl case by increasing the fiber length, and keeping the same pump power. in this case, the power conversion efficiency is even lower, 0.17, as predicted by simulations. vi. conclusions we have presented a high-speed tunable soliton infrared source capable of generating ∼ 45 fs transform-limited pulses in the range from 860 to 1160 nm. both the pulsewidth and the spectral width were shown to remain constant over the entire tuning range, even when pump pulses were heavily chirped up to 7400 fs2. insensitivity to the chirp of pump pulses points out to the feasibility of avoiding bulky compensation optics prior to the pcf, opening up the possibility to build reliable and compact high-speed tunable femtosecond sources in the near infrared region. a minor drawback of this source is that either more power needs to be coupled or a longer pcf needs to be used in order to achieve the same tuning range obtained with transform-limited pump pulses. acknowledgements this work was supported by anpcyt pict 2006-1594, anpcyt pict 2006-497 and uba programación cient́ıfica 20082010, proyecto n x022. [1] n nishizawa, t goto, compact system of wavelength-tunable femtosecond soliton pulse generation using optical fibers, ieee photon. technol. lett. 11, 325 (1999). [2] k abedin, f kubota, wavelength tunable high-repetition-rate picosecond and femtosecond pulse sources based on highly nonlinear photonic crystal fiber, ieee j. sel. topics quantum electron. 10, 1203 (2004). [3] j h lee, j van howe, c xu, x. liu, soliton self-frequency shift: experimental demonstrations and applications, ieee j. sel. topics quantum electron. 14, 713 (2008). [4] g p agrawal, nonlinear fiber optics, academic press, san diego (2007). [5] f m mitschke, l f mollenauer, discovery of the soliton self-frequency shift, opt. lett. 11, 659 (1986). [6] j p gordon, theory of the soliton selffrequency shift, opt. lett. 11, 662 (1986). [7] b washburn, s ralph, p lacourt, j dudley, tunable near-infrared femtosecond soliton generation in photonic crystal fibers, electronics lett. 37, 1510 (2001). [8] j takayanagi, t sugiura, m yoshida, n nishizawa, 1.0-1.7 µm wavelength-tunable ultrashort-pulse generation using femtosecond yb-doped fiber laser and photonic crystal fiber, ieee photon. technol. lett. 18, 659 (2006). [9] p russell, photonic crystal fibers, science 299, 358 (2003). 040001-7 papers in physics, vol. 4, art. 040001 (2012) / m caldarola et al. [10] d v skryabin, f luan, j c knight, p st j russell, soliton self-frequency shift cancellation in photonic crystal fibers, science 31, 1705 (2003). [11] n nishizawa, y ito, t goto, 0.78-0.90 wavelength-tunable femtosecond soliton pulse generation using photonic crystal fiber, ieee photon. technol. lett. 14, 986 (2002). [12] k s abedin, f kubota, widely tunable femtosecond soliton pulse generation at a 10ghz repetition rate by use of the soliton selffrequency shift in photonic crystal fiber, opt. lett. 28, 1760 (2003). [13] n ishii, c y teisset, e e serebryannikov, t fuji, t metzger, f krausz, a m zheltikov, widely tunable soliton frequency shifting of few-cycle laser pulses, phys. rev. e 74, 036617 (2006). 
[14] m e masip, a a rieznik, p g könig, d f grosz, a v bragas, o e mart́ınez, femtosecond soliton source with fast and broad spectral tunability, opt. lett. 34, 842 (2009). [15] s sanders, wavelength-agile fiber laser using group-velocity dispersion of pulsed supercontinua and application to broadband absorption spectroscopy, appl. phys. b lasers opt. 75, 799 (2002). [16] j walewski, m borden, s sanders, wavelength-agile laser system based on soliton self-shift and its application for broadband spectroscopy, appl. phys. b lasers opt. 79, 937 (2004). [17] k sumimura, t ohta, n nishizawa, quasisuper-continuum generation using ultrahighspeed wavelength-tunable soliton pulses, opt. lett.33, 2892 (2008). [18] k sumimura, y genda, t ohta, k itoh, n nishizawa, quasi-supercontinuum generation using 1.06 µm ultrashort-pulse laser system for ultrahigh-resolution optical-coherence tomography. opt. lett. 35, 3631 (2010). [19] m-c chan, s-h chia, t-m liu, t-h tsai, mc ho, a ivanov, a zheltikov, j-y liu, h-l liu, c-k sun, 1.2to 2.2m tunable raman soliton source based on a cr:forsterite laser and a photonic-crystal fiber, ieee photon. technol. lett. 20, 900 (2008). [20] j nicholson, a yablon, p westbrook, k feder, m yan, high power, single mode, all-fiber source of femtosecond pulses at 1550 nm and its use in supercontinuum generation, opt. express 12, 3025 (2004). [21] r l fork, o e mart́ınez, j p gordon, negative dispersion using pairs of prisms, opt. lett. 9, 150 (1984). [22] r l fork, c h b cruz, p c becker, c v shank, compression of optical pulses to six femtoseconds by using cubic phase compensation, opt. lett. 12, 483 (1987). [23] j l a chilla, o e martinez, direct determination of the amplitude and the phase of femtosecond light pulses, opt. lett. 16, 39 (1991). [24] s costantino, a r libertun, p d campo, j r torga, o e mart́ınez, fast scanner with position monitor for large optical delays, opt. comm. 198, 287 (2001). [25] j dudley, g genty, s coen, supercontinuum generation in photonic crystal fibers, rev. mod. phys. 78, 1135 (2006). [26] a heidt, efficient adaptive step size method for the simulation of supercontinuum generation in optical fibers, j. lightwave technol. 27, 3984 (2009). [27] q lin, g agrawal, raman response function for silica fibers, opt. lett. 31, 3086 (2006). [28] b r washburn, j a buck, s e ralph, transform-limited spectral compression due to self-phase modulation in fibers, opt. lett. 25, 445 (2000). 040001-8 papers in physics, vol. 11, art. 110005 (2019) received: 3 december 2018, accepted: 21 may 2019 edited by: j. s. reparaz reviewed by: a. san miguel, institut lumière matière, université de lyon, france licence: creative commons attribution 4.0 doi: http://dx.doi.org/10.4279/pip.110005 www.papersinphysics.org issn 1852-4249 on the impact of the stress situation on the optical properties of wse2 monolayers under high pressure a. francisco-lópez,1 b. han,2 d. lagarde,2 x. marie,2 b. urbaszek,2 c. robert,2 a. r. goñi1, 3∗ we have studied the optical properties of wse2 monolayers (ml) by means of photoluminescence (pl), pl excitation (ple) and raman scattering spectroscopy at room temperature and as a function of hydrostatic pressure up to ca. 12 gpa. for comparison, the study comprises two cases: a single wse2 ml directly transferred onto one of the diamonds of the diamond anvil cell and a wse2 ml encapsulated into hexagonal boron nitride (hbn) layers. 
the pressure dependence of the a and b exciton, as determined by pl and ple, respectively, is very different for the case of the bare wse2 ml and the hbn/wse2-ml/hbn heterostructure. whereas for the latter the a and b exciton energy increases linearly with increasing pressure at a rate of 3.5 to 3.8 mev/gpa, for the bare wse2 ml the a and b exciton energy decreases with a coefficient of -3.1 and -1.3 mev/gpa, respectively. we interpret that this behavior is due to a different stress situation. for a single ml the stress tensor is essentially uniaxial with the compressive stress component in the direction perpendicular to the plane of the ml. in contrast, for the substantially thicker hbn/wse2-ml/hbn heterostructure, the compression is hydrostatic. the results from an analysis of the pressure dependence of the frequency of raman active modes comply with the interpretation of having a different stress situation in each case. i. introduction monolayers (mls) of group-vi transition metal dichalcogenides (tmds), mx2 with m=mo, w and x= s, se, te, have emerged as fascinating twodimensional (2d) semiconductors due to remarkable properties that differ from the bulk. for ∗e-mail: goni@icmab.es 1 institut de ciència de materials de barcelona (icmabcsic), campus de la uab, 08193 bellaterra, spain. 2 université de toulouse, insa-cnrs-ups, lpcno, 135 avenue de rangueil, 31077 toulouse, france. 3 icrea, passeig llúıs companys 23, 08010 barcelona, spain. example, mls possess a direct band gap located at the k points of the hexagonal brillouin zone [1, 2]. electron-hole pairs are also strongly bound by coulomb interactions giving rise to robust excitons with binding energies of several hundreds of mev [3]. due to the lack of inversion symmetry in the ml, both spin and k valley degrees of freedom can be controlled using chiral optical selection rules [4]. finally, the strong spin-orbit coupling is responsible for large splittings in both valence and conduction band with several hundreds and tens of mev at the k points, respectively. this gives rise to a large variety of optically bright and dark excitons, governing the optical properties from cryogenic to room temperature. bright excitons involving the 110005-1 papers in physics, vol. 11, art. 110005 (2019) / a. francisco-lópez et al. first valence band are called a excitons, whereas those involving the second valence band are called b excitons. dark excitons can be optically inactive due to several reasons. the most studied ones are the spin-forbidden dark excitons which have been identified experimentally in wse2 and ws2 [5–9]. momentum mismatch (indirect) excitons composed of electron and hole residing in different k valleys have also been predicted but a clear experimental identification is still lacking. band structure calculations within density functional theory (dft) have shown that conduction band minima at the q points can be a few tens of mev above the k valleys in tungsten-based mls [10]. because the effective mass at the q points is expected to be larger than that at the k points, an indirect exciton composed of an electron in a q valley and a hole in a k valley may have larger binding energy than a direct bright exciton with electron and hole both residing in the k valley. these indirect excitons may give rise to photoluminescence (pl) emission through phonon assisted process as suggested in ref. [11]. 
high pressure methods combined with optical spectroscopy have demonstrated to be a powerful tool to tune the band structure and optical properties of semiconductors in general [13]. because the bands at different points of the brillouin zone shift in energy with different pressure coefficients, high pressure experiments can be used to distinguish between k-space direct and indirect transitions. as far as truly two-dimensional (2d) tmd materials are concerned, there are already several reports on the dependence on hydrostatic pressure of the emission and vibrational properties of monolayers of mos2 [14–18], wse2 [19] and ws2 [20]. the results, however, are surprisingly contradictory. all studies were performed on mls transferred or deposited on sio2/si wafers, except for the work of ref. [20], in which authors also looked at the pressure behavior of the ml directly on the diamond anvil surface. we conducted high-pressure opticalspectroscopy experiments on wse2 mls on the diamond anvil and encapsulated by hbn layers for comparison. the encapsulation of tmd mls using hexagonal boron nitride (hbn) layers contributes much to the preservation of the physical properties of the ml. in fact, narrower pl linewidths have been reported for hbn encapsulated mls. in this work, we show that different stress situations hold in the two samples, being fully hydrostatic for the figure 1: (left panel) optical image of the dac loaded with a single-layer of wse2 (plus some bulk and few layers) at a pressure of ca. 5.4 gpa. the monolayer is only apparent in the pl map (right panel), corresponding to a scan of the a-exciton peak intensity of the wse2 ml across the sample. hbn/wse2-ml/hbn heterostructure but uniaxial for the bare ml. this has direct impact on sign and magnitude of the pressure coefficient of the a and b excitons, for example. ii. experimental details for the high pressure experiments, the samples are mechanically exfoliated onto the surface of one of the diamond anvils of the high pressure cell (dac). we thus transferred onto the diamond surface using a polydimethylsiloxane (pdms) stamp either a wse2 flake containing some bulk and few-layers parts as well as the desired single monolayer or a hbn/ml/hbn van der waals (vdw) heterostructure. the latter were fabricated by mechanical exfoliation of bulk wse2 (commercially available) and hbn crystals [21]. a first layer of hbn is mechanically exfoliated onto a freshly cleaned diamond. the deposition of the subsequent wse2 ml and the second hbn capping layer is obtained by repeating this procedure. the left panel of fig. 1 illustrates the result of loading the dac with bare ml sample. the high-pressure photoluminescence and microraman scattering measurements were performed at room temperature employing a gasketed diamond anvil cell. a 4:1 methanol/ethanol mixture was used as pressure transmitting medium and a ruby sphere for pressure calibration [22]. the photoluminescence (pl) spectra were excited with the 110005-2 papers in physics, vol. 11, art. 110005 (2019) / a. francisco-lópez et al. 488 nm line of an ar+-ion laser, whereas for the raman measurements an infrared diode laser emitting at 785 nm was employed in addition to the 488 nm line. spectra were collected using a 20× long working distance objective with na=0.35 and dispersed with a high-resolution labram hr800 grating spectrometer equipped with a charge-coupled device detector. 
pl spectra were corrected for the spectral response of the spectrometer by normalizing each spectrum using the detector and the 600-grooves/mm grating characteristics. the experiments were always performed illuminating the same spot on the sample by using reference marks within the field of view of our confocal microscope. pl maps of the samples loaded into the dac were measured using a 633 nm hene laser in a witec alpha 300 ra+ confocal setup and the same 20× objective. the acquisition time was set to 100 ms per point and the pl images typically consisted of 100 × 100 µm2 regions analyzed in lateral steps of 2 µm (see right panel of fig. 1). for pl excitation measurements (ple) we used a fianium sc-400-4 supercontinuum laser, tunable from 410 nm to 1000 nm. pl spectra in the spectral region of the ml maximum emission were measured with the labram hr800 spectrometer while the white laser was tuned in steps of 2 nm in the range from 450 to ca. 700 nm. iii. results and discussion i. pl and ple under pressure figures 2a and 2b show several pl spectra of the bare wse2 ml and the hbn/wse2-ml/hbn vdw heterostructure, respectively, measured at room temperature as a function of pressure for 488-nm excitation. to ease their comparison, each spectrum has been normalized to its maximum intensity and vertically translated in the plot. in both cases, at ambient conditions the pl spectrum is dominated by a single emission band peaking at about 1.66 ev and having 50 mev full width at half maximum (fwhm), which corresponds to the radiative recombination of a-excitons. only for a single monolayer such emission is direct in nature and occurs between states with the same spin of the conduction band and the top of the valence band at the k-points of the brillouin zone. strikingly, with increasing pressure, the a-exciton exhibits a different behavior for the bare wse2 ml and the hbn/wse2-ml/hbn vdw heterostructure. whereas for the former the a exciton first increases slightly and then decreases in energy with pressure, for the latter case a monotonous blue-shift is observed. in both cases, the pl intensity is continuously reduced by applying pressure, as shown in figs. s1a,b of the supplementary material. the ple technique is complementary to luminescence, because it is particularly suitable to study optical transitions between excited states. here, we used ple to determine the pressure dependence of the b-exciton, involving holes of the second valence band. a representative ple spectrum (closed symbols) is displayed in fig. 3 for the case of the bare wse2 ml. each data point of the ple spectrum corresponds to the energy of the tunable laser used for excitation and the integrated intensity of the aexciton pl peak, normalized by the incident laser power at the tuned wavelength. the ple spectra exhibit essentially two features: a kind of excitonic absorption edge associated with the b-exciton and a peak-like feature corresponding to a high-energy critical point (cp) in the joint density of states of a wse2 ml [23]. changes in the ple lineshapes under pressure were analyzed using a fitting function [24] (solid gray curve in fig. 3), which consists of two components (dash-dotted red curves). the cp feature could be well described by a gaussian, whereas for the b-exciton we considered a series of gaussian peaks accounting for the discrete energy spectrum [25] and the analytical expression derived for the exciton continuum in ref. [26]. 
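as a minimal illustration (ours, with synthetic numbers, not the authors' code) of how each ple data point is constructed as described above, the a-exciton pl band is integrated and normalized by the incident laser power at the tuned excitation wavelength; the integration window and the power value are arbitrary example choices.

```python
# sketch of the ple normalization described in the text: integrate the
# a-exciton pl band and divide by the incident laser power at that wavelength.
import numpy as np

def ple_point(energy_ev, pl_counts, laser_power_mw, window=(1.60, 1.72)):
    """integrated a-exciton pl intensity normalized by excitation power."""
    m = (energy_ev >= window[0]) & (energy_ev <= window[1])
    de = energy_ev[1] - energy_ev[0]            # assumes a uniform energy grid
    return float(np.sum(pl_counts[m]) * de / laser_power_mw)

# one synthetic pl spectrum standing in for a measurement at one laser energy
e = np.linspace(1.5, 1.8, 301)
pl = np.exp(-0.5 * ((e - 1.66) / 0.02) ** 2)    # fake a-exciton band at 1.66 ev
print(ple_point(e, pl, laser_power_mw=0.5))
```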
figure s2 of the supplementary material shows a representative ple spectrum for the encapsulated wse2 ml. to analyze the pl spectra of the wse2 mls, we used a gaussian-lorentzian cross-product function to describe the main peak, which is ascribed to the a-exciton recombination. this function is a useful simplification of a voigt function, which corresponds to the mathematical convolution of a lorentzian, accounting for the natural lineshape due to spontaneous emission, and an inhomogeneously broadened gaussian, accounting for a normal distribution of exciton energies. from these lineshape fits to the pl spectra we were able to extract the energy e1 of the ground state of the excitonic discrete spectrum corresponding to the a-exciton, whereas from the lineshape fits of the ple spectra we obtained the e1 energy of the b-exciton counterpart [24].

figure 2: normalized pl spectra measured at room temperature and as a function of pressure with 488-nm excitation (a) for the bare wse2 ml and (b) for the hbn/wse2-ml/hbn vdw heterostructure. the sharp line at 1.58 ev corresponds to the spurious signal from the 785 nm laser.

both ground-state energies are plotted as a function of pressure for the bare wse2 ml and the hbn/wse2-ml/hbn heterostructure in figs. 4a,b. both excitons behave essentially the same under pressure. this means that their energy separation of ca. 425 mev, determined by the spin-orbit interaction and an eventual difference in exciton binding energies, is fairly insensitive to pressure [27]. strikingly, the behavior of the a,b-excitons of the bare and encapsulated wse2 ml is opposite. whereas for the hbn/wse2-ml/hbn heterostructure both exciton energies increase linearly with increasing pressure with a coefficient of 3.5 to 3.8 mev/gpa, for the bare wse2 ml the excitons decrease in energy with a slope of -3.1 and -1.3 mev/gpa for the a- and b-exciton, respectively.

figure 3: ple spectrum (closed blue symbols) of the bare wse2 ml at a very low pressure right after closing the dac. the solid grey curve and the dash-dotted red curves represent the lineshape fitting function and its components, respectively, associated with the b-exciton and a critical point (cp).

comparing with the available literature data, the pressure coefficient of the a-exciton determined here for the encapsulated wse2 ml is roughly one order of magnitude smaller than those reported for a single ml of mos2 (20 mev/gpa [14], 30 mev/gpa [15], 40 mev/gpa [16], 50 mev/gpa [17]), wse2 (32 mev/gpa [19]) and ws2 (20 mev/gpa [20]).
all these data were obtained for monolayers on si/sio2 substrates, where the oxide layer was fairly thick, ranging between 200 and 300 nm. in the work of han et al. [20], the same experiment is reported for a ws2 ml exfoliated directly onto the diamond and a much smaller pressure coefficient is found (10 mev/gpa). in the next section, we suggest a possible way to explain such a large disparity in the high pressure results concerning the excitonic properties of single mls on the basis of a close inspection of the stress-strain relations. ii. stress-strain relations for different stress situations first of all, we have to define the physical system for consideration of its stress/strain situation inside the dac. it is a common practice to include the substrate supporting the 2d system, provided that adhesion forces between 2d sample and substrate are important. in this respect, we note that the conventional adhesion, corresponding to the tendency of two materials to cling to one another by van der waals (vdw) forces is not relevant for high pressure experiments. the net vdw adhesion (attractive) force is perpendicular to the surfaces in contact and is proportional to the contact surface. inside the diamond anvil cell (dac), however, the substrate (or the diamond) will transmit the same pressure to the sample as it does the pressure transmitting medium, irrespective of the magnitude of the vdw adhesion force. this is just a requirement of the static equilibrium condition. in contrast, a “lateral” adhesion force acts solely in-plane, as is associated with the gliding of one object on a substrate. these lateral forces are typically much weaker than normal adhesion forces. an important exception for which lateral adhesion become significant concerns the case of a rough substrate surface (rippled) and a sample (a thin membrane) showing certain degree of conformation to the substrate surface (see ref. [28]). this can drive the transmission of in-plane strain (tensile or compressive) from the substrate to the sample, as pointed out by d. machon et al. [29], being in principle relevant for high pressure experiments, mainly when substrate and sample exhibit different bulk modulus, i.e., different compressibility. it has been shown [28] that a thin membrane characterized by a bending rigidity c would be conformal to a rippled surface with a curvature κg and an adhesion energy (contact surface potential) γ, depending on the values adopted by a dimensionless parameter α = √ κeq κg . here κeq = √ 2γ c is the equilibrium curvature of the membrane, which is determined by the ratio of the adhesion energy and the bending rigidity of the membrane. when α ≥ 1 the membrane is expected to adhere well to the surface, whereas for α � 1 the membrane cannot adapt to the rippled surface and detaches from it being non-conformal. for the case of a 2d system with n layers, the bending rigidity depends on n, such that cn increases sharply for an increasing number of layers. thus, α is expected to decrease with increasing n. in fact, from the analysis of the raman data regarding the in-plane strain transferred from a si/sio2 substrate to single, bilayer, and multi-layer graphene, it has been inferred that for n > 2 the 2d system looses its adhesion due to the enhanced rigidity of the multilayers [29–31]. moreover, in the work of alencar et al. 
[18] on the pressure behavior of mos2 monolayers on si/sio2 substrates, it was argued that the unbinding of the transition metal dichalcogenide layer from the substrate can already occur for a single layer, because it has a much higher bending modulus than graphene [32]. as a matter of fact, they observe a splitting of the raman modes of the mos2 ml, which is ascribed to the presence of regions with low and high conformation of the ml to the substrate, i.e. regions with different inplane strain, affecting the frequency of the optical phonons. concerning our case of having the 2d systems directly transferred onto one on the diamond anvils, we can assume that the thick and much rigid hbn/wse2-ml/hbn sandwich does not adhere to the diamond. in the case of the wse2 monolayer, since we do not observe any splitting neither in the raman modes (see discussion below) nor in the bright excitonic luminescence (pl), we can also assume that a situation of low conformation applies. taking all the mentioned facts together, we are led to the conclusion that adhesion effects are not relevant for the proper interpretation of our experimental results and, hence, that we do not have to consider the diamond anvil as part of the physical system. 110005-5 papers in physics, vol. 11, art. 110005 (2019) / a. francisco-lópez et al. 0 2 4 6 8 1 0 1 . 6 5 1 . 7 0 2 . 0 5 2 . 1 0 1 . 3 ( 2 ) m e v / g p a 3 . 1 ( 1 ) m e v / g p a ( a ) ex cit on e ne rgy e 1 ( ev ) p r e s s u r e ( g p a ) w s e 2 m l b e x c i t o n a e x c i t o n 0 2 4 6 8 1 0 1 . 6 5 1 . 7 0 2 . 0 5 2 . 1 0 3 . 5 ( 1 ) m e v / g p a a e x c i t o nex cit on e ne rgy e 1 ( ev ) p r e s s u r e ( g p a ) h b n w s e 2 m l h b n b e x c i t o n 3 . 8 ( 2 ) m e v / g p a ( b ) figure 4: ground state energy e1 of the a and b-exciton as a function of pressure (a) for the bare wse2 ml and (b) for the hbn/wse2-ml/hbn vdw heterostructure. lines represent least-squares fits to the data points. numbers in parentheses represent error bars. let us consider that our 2d system corresponding either to the bare wse2 ml or the hbn/wse2ml/hbn vdw heterostructure is well described just as a thin slab of thickness d inside the dac, as sketched in fig. 5. for a sufficiently thick slab, the stress situation would be hydrostatic (∆p = 0). our ansatz is that for vanishing thickness a departure from the strictly hydrostatic situation would gradually develop. below a certain critical value, the in-plane stress components will decrease with decreasing thickness by a certain amount ∆p. the reason is that for a vanishing slab cross section the molecules of the pressure medium are unable to transmit any momentum to the slab in the inplane directions. in the limit of vanishing thickness d (a graphene ml, for example), the in-plane components of the stress tensor would also vanish (∆p = p). although tmd monolayers belong to the d3h space group, from the point of view of their elastic properties it is a good approximation to use an elastic-stiffness tensor with cubic symmetry, i.e., c33 ≈ c11 and c13 ≈ c12 [33]. please, note that it is not correct to take the bulk values for c33,c13, because they are much smaller than for a ml due to inclusion of the weak vdw interlayer interactions. under these assumptions (see also discussion in the supplementary material), we have derived the strain-tensor components �ij and relative volume change ∆v v for the general stress situation of the slab in fig. 
5:

$$\epsilon_{xy} = -\frac{P}{3B_0} + \frac{c_{11}}{c_{11}-c_{12}}\cdot\frac{\Delta P}{3B_0}, \qquad (1)$$

$$\epsilon_{z} = -\frac{P}{3B_0} - \frac{2c_{12}}{c_{11}-c_{12}}\cdot\frac{\Delta P}{3B_0}, \qquad (2)$$

$$\frac{\Delta V}{V} = -\frac{P}{B_0} + \frac{2\,\Delta P}{3B_0}, \qquad (3)$$

where b0 = (1/3)(c11 + 2c12) is the “bulk modulus” of the monolayer and the minus sign of the compressive pressure is explicitly written in the equations, such that 0 ≤ ∆p ≤ p. according to eqs. (1) to (3), the usual hydrostatic case corresponds to having ∆p = 0, for which all three strain-tensor components are the same and compressive (ε_xy = ε_z = −p/(3b0)). this occurs for the several-monolayers-thick hbn/wse2-ml/hbn heterostructure. on the contrary, an effectively uniaxial stress situation sets in for ∆p = p. in this case, a compressive stress −p is applied from both sides and perpendicular to the plane of the slab, whereas the in-plane strain is tensile (ε_xy^uniax. = −[c12/(c11 − c12)]·ε_xy^hydro.). we believe that this situation of highly, if not purely, uniaxial stress does apply to the bare wse2 ml case, as we argue further below when discussing the raman results. as a consequence, the sign of the linear pressure coefficient of the a- and b-exciton energy changes from positive to negative for the bare ml as compared to the encapsulated one, because for the former the in-plane strain is tensile rather than compressive. furthermore, for all literature data, the stress situation corresponding to a ml on top of a sio2/si substrate is much more complicated. the bulk modulus of the silicon oxide is about one third of that of si. as discussed in detail in the supplementary material, a thick sio2 layer epitaxially attached to si will be strongly deformed when hydrostatic pressure is applied, due to the large biaxial tensile stress that the much thicker and less compressible si exerts on the oxide layer, which under pressure tends to compress three times more than si. as recognized in ref. [20], the large deformation of the si oxide layer introduces an extra strain in the tmd monolayer, which might be the reason for the observation of much higher pressure coefficients for the a,b-excitons.

figure 5: schematic representation of the stress situation inside the dac for a slab of thickness d. the stress tensor can depart from hydrostatic (∆p ≠ 0) if d → 0.

iii. raman scattering under pressure

the dominant first-order raman-active modes of bulk 2h-wse2 are the one with a1g symmetry, which involves atomic displacements in the direction perpendicular to the layers, and the ones with e2g symmetry, for which the atomic displacements are in-plane [34]. for a single monolayer, the corresponding raman-allowed modes, with eigenvectors fully analogous to those of the bulk, are the a'1 and e' modes. in wse2 both mode types possess very similar frequencies and, in raman spectra, both overlap with a very strong signal associated with second-order scattering processes involving two la phonons (see representative raman spectra in figs. s4a-s4d of the supplementary material). nevertheless, the raman signal of the a1g and a'1 modes can be resonantly enhanced by tuning the laser wavelength to match the a-exciton energy e1 [34, 35]. near-resonance conditions are attained for bulk wse2 and the ml with the red 633 nm laser line and the infrared (ir) 785 nm line, respectively. the reason for the resonant behavior is that in tmds the a-exciton wavefunction contains a large weight from dz² orbitals of the metal atoms.
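a quick numerical illustration of eqs. (1) to (3) is given below; the elastic constants are placeholders (the wse2 values of ref. [33] are not reproduced here), the point being only that the in-plane strain switches from compressive in the hydrostatic limit (∆p = 0) to tensile in the uniaxial limit (∆p = p).

```python
# evaluation of eqs. (1)-(3) in the two limiting cases discussed in the text.
# the elastic constants below are placeholders for illustration only; values
# appropriate to a wse2 ml should be taken from ref. [33].
def strains(p, dp, c11, c12):
    b0 = (c11 + 2 * c12) / 3.0
    e_xy = -p / (3 * b0) + c11 / (c11 - c12) * dp / (3 * b0)
    e_z = -p / (3 * b0) - 2 * c12 / (c11 - c12) * dp / (3 * b0)
    dv_v = -p / b0 + 2 * dp / (3 * b0)
    return e_xy, e_z, dv_v

c11, c12 = 200.0, 50.0      # gpa, hypothetical numbers
p = 5.0                     # gpa, applied pressure

print("hydrostatic (dp = 0):", strains(p, 0.0, c11, c12))   # all compressive
print("uniaxial    (dp = p):", strains(p, p, c11, c12))     # in-plane tensile
```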
this large dz² weight makes the a-exciton energy sensitive to movements of the metal atoms in the direction perpendicular to the layers, as is the case for the eigenvectors of the phonon modes with a symmetry. the non-resonant modes with e symmetry are better seen with blue excitation (488 nm) due to the 1/ω⁴ prefactor in the raman cross section. in the following we show that, by comparing the pressure coefficients of the raman-active modes obtained for the bulk and for the encapsulated and bare monolayer, we are able to infer a different stress situation for the latter. figure 6a shows the results of the raman measurements as a function of pressure for the modes with a symmetry, measured with ir excitation, for the hbn/wse2-ml/hbn heterostructure and for a piece of bulk wse2 which was transferred onto the diamond together with the monolayer. both slopes are identical within experimental uncertainty, which speaks for a fully hydrostatic compression of the encapsulated monolayer. in contrast, the pressure dependence of the raman modes with e symmetry of the bare and encapsulated mls is compared in fig. 6b. strikingly, the slope for the bare wse2 ml is a factor of 1.28 smaller than that of the encapsulated ml. we note that for purely uniaxial compression of a cubic crystal like si [36], a pressure coefficient reduced by roughly a factor of 1.5 is expected for the raman modes with displacements in the plane perpendicular to the stress. this is just a consequence of the tensile character of the in-plane strain when the material is uniaxially compressed in the perpendicular direction. we are thus led to the conclusion that for the bare ml the stress situation most likely established inside the dac is almost purely uniaxial.

figure 6: comparison of the pressure dependence of first-order raman-active modes: (a) the a1g mode of bulk wse2 and the a'1 mode of the hbn/wse2-ml/hbn heterostructure, measured with 785 nm excitation, and (b) the e' mode of the bare wse2 ml and of the hbn/wse2-ml/hbn heterostructure. the fitted slopes are 2.1(1) cm⁻¹/gpa for both a-type modes in (a), and 2.1(1) cm⁻¹/gpa (encapsulated ml) and 1.7(1) cm⁻¹/gpa (bare ml) for the e' mode in (b). lines represent least-squares fits to the data points and numbers in parentheses represent error bars.

iv. conclusions

in summary, we have performed a comparative study of the pressure dependence of the optical transition energies corresponding to the radiative recombination of the a- and b-excitons for a bare wse2 ml and a hbn/wse2-ml/hbn heterostructure, both transferred onto one of the diamonds of the dac. we have found that for the encapsulated wse2 ml the energy of both excitons increases with pressure with a relatively small slope of 3.5 to 3.8 mev/gpa, whereas for the bare ml both excitons decrease in energy. these results are at odds with the available literature data, which show much larger and positive pressure coefficients for a single wse2 monolayer and other truly 2d tmd materials. it is important to note that the literature data were almost exclusively obtained for monolayers deposited or transferred onto a thick sio2 layer on top of a si wafer.
to explain the observed discrepancies between the different high pressure experiments on 2d tmds, we have proposed a revision of the stress situation in each particular case, which depends principally on the effective thickness of the slab being pressurized in the dac. here, we propose that for the bare single monolayer (case of vanishing thickness) the stress situation is better described by using an uniaxial stress tensor, whereas for the much thicker hbn/wse2ml/hbn heterostructure conventional hydrostatic conditions can be assumed. our raman results, obtained simultaneously with the pl and ple data in each case, speak in favor of such an interpretation. we believe that our findings will have significant impact on future high pressure work on truly 2d material systems like graphene, tmds, bn, and other layered materials, for the assessment of the stress situation appears to be crucial for the correct interpretation of the experimental results. acknowledgements the spanish ministerio de economı́a, industria y competitividad is gratefully acknowledged for its support through grant no. sev-2015-0496 in the framework of the spanish severo ochoa centre of excellence program and grant mat2015-70850-p (hibri2). afl acknowledges a fpi fellowship (bes-2016-076913) from the spanish ministerio co-financed by the european social fund and the phd program in materials science from universitat autònoma de barcelona in which he is enrolled. [1] a splendiani, l sun, y zhang, t li, j kim, c-y chim, g galli, f wang, emerging photoluminescence in monolayer mos2, nano lett. 10, 1271 (2010). [2] k f mak, c lee, j hone, j shan, t f heinz, atomically thin mos2: a new direct110005-8 papers in physics, vol. 11, art. 110005 (2019) / a. francisco-lópez et al. gap semiconductor, phys. rev. lett. 105, 136805 (2010). [3] g wang, a chernikov, m m glazov, t f heinz, x marie, t amand, b urbaszek, excitons in atomically thin transition metal dichalcogenides, rev. mod. phys. 90, 021001 (2018). [4] d xiao, g-b liu, w feng, x xu, w yao, coupled spin and valley physics in monolayers of mos2 and other group-vi dichalcogenides, phys. rev. lett. 108, 196802 (2012). [5] g wang, c robert, m m glazov, f cadiz, e courtade, t amand, d lagarde, t taniguchi, k watanabe, b urbaszek, x marie, inplane propagation of light in transition metal dichalcogenide monolayers: optical selection rules, phys. rev. lett. 119, 047401 (2017). [6] c robert, t amand, f cadiz, d lagarde, e courtade, m manca, t taniguchi, k watanabe, b urbaszek, x marie, fine structure and lifetime of dark excitons in transition metal dichalcogenide monolayers, phys. rev. b 96, 155423 (2017). [7] m r molas, c faugeras, a o slobodeniuk, k nogajewski, m bartos, d m basko, m potemski, brightening of dark excitons in monolayers of semiconducting transition metal dichalcogenides, 2d mater. 4, 021003 (2017). [8] x-x zhang, t cao, z lu, y-c lin, f zhang, y wang, z li, j c hone, j a robinson, d smirnov, s g louie, t f heinz, magnetic brightening and control of dark excitons in monolayer wse2, nature nanotechnol. 12, 883 (2017). [9] y zhou, g scuri, d s wild, a a high, a dibos, l a jauregui, c shu, k de greve, k pistunova, a y joe, t taniguchi, k watanabe, p kim, m d lukin, h park, probing dark excitons in atomically thin semiconductors via near-field coupling to surface plasmon polaritons, nature nanotechnol. 12, 856 (2017). [10] z jin, x li, j t mullen, k w kim, intrinsic transport properties of electrons and holes in monolayer transition-metal dichalcogenides, phys. rev. 
b 90, 045422 (2014). [11] j lindlau, c robert, v funk, j förste, m förg, l colombier, a neumann, e courtade, s shree, m manca, t taniguchi, k watanabe, m m glazov, x marie, b urbaszek, a högele, identifying optical signatures of momentumdark excitons in transition metal dichalcogenide monolayers, arxiv:1710.00988 (2017). [12] f cadiz et al., excitonic linewidth approaching the homogeneous limit in mos2-based van der waals heterostructures, phys. rev. x 7, 021026 (2017). [13] a r goñi, k syassen, optical properties of semiconductors under pressure, semicond. semimetals 54, 247 (1998). [14] x dou, k ding, d jiang, b sun, tuning and identification of interband transitions in monolayer and bilayer molybdenum disulfide using hydrostatic pressure, acs nano 8, 7458 (2014). [15] a p nayak, t pandey, d voiry, j liu, s t moran, a sharma, c tan, c-h chen, lj li, m chhowalla, j-f lin, a k singh, d akinwande, pressure-dependent optical and vibrational properties of monolayer molybdenum disulfide, nano lett. 15, 346 (2014). [16] f li, y yan, b han, l li, x huang, m yao, y gong, x jin, b liu, c zhu, q zhou, t cui, pressure confinement effect in mos2 monolayers, nanoscale 7, 9075 (2015). [17] l fu, y wan, n tang, y ding, j, gao, j yu, h guan, k zhang, w wang, c zhang, j-j shi, x wu, s-f shi, w ge, l dai, b shen, k-λ crossover transition in the conduction band of monolayer mos2 under hydrostatic pressure, sci. adv. 3, e1700162 (2017). [18] r s alencar, k d a saboia, d machon, g montagnac, v meunier, o p ferreira, a sanmiguel, a g souza fihlo, atomic-layer mos2 on sio2 under high pressure: bimodal adhesion and biaxial strain effects, phys. rev. mater. 1, 024002 (2017). [19] y ye, x dou, k ding, d jiang, f yang, b sun, pressure-induced k-λ crossing in monolayer wse2, nanoscale 8, 10843 (2016). 110005-9 papers in physics, vol. 11, art. 110005 (2019) / a. francisco-lópez et al. [20] b han, f li, l li, x huang, y gong, x fu, h gao, q zhou, t cui, correlatively dependent lattice and electronic structural evolutions in compressed monolayer tungsten disulfide, j. phys. chem. lett. 8, 941 (2017). [21] t taniguchi, k watanabe, synthesis of highpurity boron nitride single crystals under high pressure by using babn solvent, j. cryst. growth 303, 525 (2007). [22] h-k mao, j xu, p m bell, calibration of the ruby pressure gauge to 800 kbar under quasihydrostatic conditions, j. geophys. res. 91, 4673 (1986). [23] a carvalho, r m ribeiro, a h castro neto, band nesting and the optical response of twodimensional semiconducting transition metal dichalcogenides, phys. rev. b 88, 115205 (2013). [24] m. wojdyr, fityk: a general-purpose peak fitting program, j. appl. cryst. 43, 1126 (2010). [25] m brotons-gisbert, a segura, r robles, e canadell, p ordejón, j f sánchez-royo, optical and electronic properties of 2h-mos2 under pressure: revealing the spin-polarized nature of bulk electronic bands, phys. rev. mater. 2, 054602 (2018). [26] a r goñi, a cantarero, k syassen, m cardona, effect of pressure on the low-temperature excitonic absorption in gaas, phys. rev. b 41, 10111 (1990). [27] a r goñi, k syassen, m cardona, direct band gap absorption in germanium under pressure, phys. rev. b 39, 12921 (1989). [28] o pierre-louis, adhesion of membranes and filaments on rippled surfaces, phys. rev. e 78, 021603 (2008). [29] d machon, c bousige, r alencar, a torresdias, f balima, j nicolle, g s pinheiro, a g souza filho, a san-miguel, raman scattering studies of graphene under high pressure, j. raman spectrosc. 49, 121 (2018). 
[30] c bousige, f balima, d machon, g s pinheiro, a torres-dias, j nicolle, d kalita, n bendiab, l marty, v bouchiat, g montagnac, a g souza fihlo, p poncharal, a san-miguel, biaxial strain transfer in supported graphene, nano lett. 17, 21 (2017). [31] j nicolle, d machon, p poncharal, o pierrelouis, a san-miguel, pressure-mediated doping in graphene, nano lett. 11, 3564 (2011). [32] j-w jiang, z qi, h s park, t rabczuk, elastic bending modulus of single-layer molybdenum disulfide (mos2): finite thickness effect, nanotechnol. 24, 435705 (2013). [33] l-p feng, n li, m-h yang, z-t liu, effect of pressure on elastic, mechanical and electronic properties of wse2: a first-principles study, mater. res. bull. 50, 503 (2014). [34] s v bhatt, m p deshpande, v sathe, r raoc, s h chakia, raman spectroscopic investigations on transition-metal dichalcogenides mx2 (m =mo, w; x = s, se) at high pressures and low temperature, j. raman spectrosc. 45, 971 (2014). [35] t livneh, j s reparaz, a r goñi, lowtemperature resonant raman asymmetry in 2h-mos2 under high pressure, j. phys.: condens. matter 29, 435702 (2017). [36] e anastassakis, m cardona, phonons, strains, and pressure in semiconductors, semicond. semimetals 55, 117 (1998). 110005-10 papers in physics, vol. 12, art. 120006 (2020) received: 27 october 2019, accepted: 8 august 2020 edited by: f. melo reviewed by: m. yazdani-pedram, universidad de chile, chile b. rivas quiroz, universidad de concepción, chile licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.120006 www.papersinphysics.org issn 1852-4249 the electrical and mechanical properties of cadmium chloride reinforced pva:pvp blend films r. s. mahmood1,2*, s. a. salman1, n. a. bakr1� in this study, pure polymer blend (pva:pvp) film and salt (cdcl2·h2o) reinforced polymer blend films were prepared at different weight ratios (10 wt%, 20 wt%, 40 wt%) using the casting method. the effect of the salt weight ratio on the dielectric properties of the polymer blend films reinforced by cdcl2·h2o salt were investigated, and the experimental results showed that the dielectric constant and the dielectric loss factor decreased as the frequency increased for all polymer blend films. moreover, the above-mentioned properties increased with increasing salt weight ratios at the same frequency. the experimental results also showed an increase in ac electrical conductivity with increasing frequency, for all polymer blend films, and the ac electrical conductivity also increased with an increase in the weight ratio of the salt at the same frequency. the effect of the salt weight ratio on the mechanical properties of the salt-reinforced pva:pvp polymer blend films was also studied. the experimental results obtained from the tensile test of the salt-reinforced polymer blend films show significant change in the values of tensile strength, elongation at break, and young’s modulus with increasing salt weight ratios; the hardness value first increases then decreases with increasing salt weight ratios, and the fracture energy value increases with increasing salt weight ratios, thus they could be good candidates for hard adhesives with low flexibility. i. introduction polymers such as plastics and rubbers pervade our lives, and we come across them in many different forms. their physical properties are therefore of great importance, and an understanding of them is vital for their use in technology and engineering [1]. 
the blending of different polymers or inorganic ma*reeaheenaljana@gmail.com �nabeelalibakr@yahoo.com 1 department of physics, college of science, university of diyala, old baghdad way, banisa’ad, diyala, 32016 iraq. 2 diyala general directorate of education, governorate street, ba’aqubah, diyala, 32001 iraq. terials with polymers represents a strategic route to improving the performance of a material, and allows the realization of novel composite systems that enhance the performance of the parent blend [2]. polyvinyl alcohol (pva) is a versatile, polyhydroxy polymeric material which has gained the interest of researchers due to its many potential applications, and the scope for easy modification and formation of useful miscible blends with many other polymers. pva reinforced with different materials like iodine, ferric chloride, barium chloride and other salts have been studied extensively, and these polymeric materials show a significant modification in their microstructural, electrical and mechanical properties when compared to pure pva films [3]. the subject of polymer blends has been 120006-1 papers in physics, vol. 12, art. 120006 (2020) / r. s. mahmood et al. electrical test tensile test hardness test impact test figure 1: the sample micrographs required for each test. one of the primary areas of focus in polymer science and technology over several decades now. as a new area of interest in polymer science, polymer blend technology often represents an important subject [4]. polymer blends offer versatile industrial applications through property enhancement and economic benefits. the blending of two or more polymers of similar or dissimilar natures has been practiced for many years [5]. solution blending of different polymers is one of the methods used to obtain new material with a variety of properties, which mainly depend on the characteristics of the parent homo polymers and the blend composition [6]. polyvinyl alcohol (pva), a semi-crystalline polymer, has been studied widely because of its many interesting physical properties, which arise from the presence of oh groups and the hydrogen bond formation with other polymers or metals. polyvinyl pyrrolidone (pvp) is a vinyl polymer possessing planar and highly polar side groups due to the peptide bond [7]. the aim of this study is to prepare pure polymer blend (pva:pvp) film and salt (cdcl2·h2o) reinforced polymer blend films at different weight ratios (10 wt%, 20 wt%, 40 wt%) using the casting method, and to investigate the effect of salt reinforcement on the dielectric and mechanical properties of the prepared films. ii. experimental work in the preparation of polymer blend films, polyvinyl alcohol powder (produced by central house (p) ltd of india with a molecular weight of 13000 g/mol 23000 g/mol), and polyvinyl pyrrolidone powder (produced by the indian himedia company, with a molecular weight of 40000 g/mol) were used. for reinforcement of the blend, cadmium chloride (cdcl2·h2o) salt (produced by the indian himedia company) was used. 2 3 4 5 6 1.0 1.5 2.0 2.5 3.0 3.5 4.0 l o g ( ) log (frequency (hz)) pure (pva:pvp) 10 wt% 20 wt% 40 wt% figure 2: log-log plot of the dielectric constant as a function of the frequency of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt. pure pva:pvp polymer blend film and cdcl2·h2o salt-reinforced polymer blend films were prepared at different weight ratios (10 wt%, 20 wt%, 40 wt%) using the casting method. 
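the dielectric quantities reported in section iii are obtained from impedance measurements on the films; as a rough sketch of the standard parallel-plate reduction (an added illustration with hypothetical electrode area and instrument readings, not the authors' stated procedure), the dielectric constant follows from the measured capacitance, the loss factor from the dissipation factor, and the ac conductivity from σ_ac = ωε0ε′′.

```python
# sketch of how eps', eps'' and sigma_ac can be obtained from parallel-plate
# impedance readings; electrode area and the example readings are hypothetical.
import math

eps0 = 8.854e-12          # F/m
d = 1.45e-3               # m, film thickness quoted in the experimental section
area = 1.0e-4             # m^2, hypothetical electrode area

f = 1.0e3                 # Hz, example frequency
c_p = 25e-12              # F, example parallel-capacitance reading
tan_delta = 0.8           # example dissipation-factor reading

eps_real = c_p * d / (eps0 * area)            # dielectric constant
eps_imag = eps_real * tan_delta               # dielectric loss factor
sigma_ac = 2 * math.pi * f * eps0 * eps_imag  # ac conductivity, S/m

print("eps' =", round(eps_real, 1), " eps'' =", round(eps_imag, 1),
      " sigma_ac =", f"{sigma_ac:.2e}", "S/m")
```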
the differential scanning colorimeter (dsc) was performed for all samples and reported elsewhere, and the evidence for the blend nature of the film was confirmed [8]. the pva and pvp powders with 1 : 1 wt%, and cdcl2·h2o powder with the above mentioned weight ratios were dissolved in distilled water by stirring for 1 hr at 60 �. the solution was then poured into special glass molds placed on a flat surface and left until the solvent evaporated to obtain the pure polymer blend film and salt-reinforced polymer blend films. the thickness was measured using a digital micrometer, and was found to be in the range 1450 µm 1455 µm. for the purpose of dielectric measurements, an lcr meter (4294a agilent precision impedance analyzer) was used in the frequency range of 100 hz 1 mhz at room temperature, and for the investigation of mechanical properties, the following instruments were used: tinius olsen-h10k for the tensile test, shore d checkline-dd-100 for the hardness test and filling darter impact tester of the type fdi-01 for the shock resistance test. figure 1 shows the sample images required for each test. 120006-2 papers in physics, vol. 12, art. 120006 (2020) / r. s. mahmood et al. 4.2 4.4 4.6 4.8 5.0 5.2 5.4 5.6 5.8 6.0 6.2 0 50 100 150 200 250 pure (pva:pvp) 10 wt% 20 wt% 40 wt% l o ss f ac to r ( '' ) log (frequency (hz)) figure 3: semi-log plot of the loss factor as a function of the frequency of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt. iii. results and discussion i. electrical properties a. dielectric constant the dielectric constant (�′) was recorded for pure polymer blend pva:pvp film and cdcl2·h2o salt reinforced polymer blend films at different weight ratios (10 wt%, 20 wt%, 40 wt%) at room temperature and within the frequency range of 100 hz 1 mhz, as shown in fig. 2. the dielectric constant decreased with increased frequency for all polymer blend films, which can be explained as follows: in the low frequency region there will be sufficient time for molecular dipoles to rearrange and align themselves in the direction of the external electric field, but at high frequencies the time is shorter, and less than the time period needed by the molecules for rearrangement in the direction of the external electric field [9]. the dielectric constant at the same frequency increased with an increase in the weight ratio of added salt. in general, this increase in the value of the dielectric constant is due to increased polarization [10]. b. loss factor the loss factor (�′′) is the ratio of loss of power in electrically insulating materials to the total capacity transported through the insulator; i.e., the loss 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 -8.0 -7.5 -7.0 -6.5 -6.0 -5.5 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 pure (pva:pvp) 10 wt% 20 wt% 40 wt% l o g ( a. c. (s /m )) log (frequency (hz)) figure 4: log-log plot of a.c. electrical conductivity as a function of the frequency of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt. of energy in the insulating material is directly proportional to the loss factor. the dielectric loss factor of the pure pva:pvp polymer blend films and polymer blend films reinforced by cdcl2·h2o salt was calculated at different weight ratios (10 wt%, 20 wt%, 40 wt%) at room temperature and within the frequency range of 100 hz – 1 mhz, as shown in fig. 3. it can be observed that the loss factor decreases as the frequency increases for all the polymer blend films. 
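the quantities discussed in this section can be obtained from the measured capacitance and dissipation factor in the standard parallel-plate way; the sketch below is a minimal illustration (not the authors' analysis script) of how ε′, the loss factor ε′′ and the ac conductivity used later follow from impedance-analyzer output. the film thickness of 1.45 mm matches the range reported above, while the electrode area and the capacitance curve are made-up values for the demonstration.

```python
# minimal sketch, assuming parallel-plate geometry: eps' = cp*d/(eps0*a),
# eps'' = eps'*tan(delta), sigma_ac = 2*pi*f*eps0*eps''.
import numpy as np

EPS0 = 8.854e-12                                   # vacuum permittivity, f/m

def dielectric_spectrum(freq_hz, cp_farad, tan_delta, thickness_m, area_m2):
    eps_real = cp_farad * thickness_m / (EPS0 * area_m2)   # dielectric constant
    eps_imag = eps_real * tan_delta                        # loss factor
    sigma_ac = 2 * np.pi * freq_hz * EPS0 * eps_imag       # ac conductivity, s/m
    return eps_real, eps_imag, sigma_ac

f = np.logspace(2, 6, 5)                           # 100 hz - 1 mhz
cp = 2e-11 / (1 + (f / 1e4) ** 0.3)                # fabricated cp(f) for the demo
eps_r, eps_i, s_ac = dielectric_spectrum(f, cp, 0.05, 1.45e-3, 1.0e-4)
print(eps_r.round(1), s_ac)
```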
this may be attributed to the enhancement of the charge carriers that takes place across the electric charge area, decreasing the value of the loss factor at high frequencies until the electron’s energy is equal to the fermi level [11]. another reason for the change in the loss factor with frequency is the polarization mechanism and the multiple interactions between ions and dipoles. this is due to the value of relaxation time [12]. it is also observed that the value of the loss factor at the same frequency increases with an increase in the weight ratio of the added salt. in general, this increase in the value of the dielectric loss factor is attributed to the increase in polarization and the increase in ion charge carriers [13, 14]. c. ac electrical conductivity the alternating electrical conductivity of the pure pva:pvp polymer blend films and polymer blend films reinforced by cdcl2·h2o salt was measured 120006-3 papers in physics, vol. 12, art. 120006 (2020) / r. s. mahmood et al. 0 3 6 9 12 15 18 21 24 27 0 2 4 6 8 10 12 14 st re ss (m pa ) strain (%) pure (pva:pvp) blend 0 1 2 3 4 5 6 0 10 20 30 40 50 60 st re ss (m pa ) strain (%) 10 wt% 0.0 0.5 1.0 1.5 2.0 2.5 3.0 0 3 6 9 12 15 st re ss (m pa ) strain (%) 40 wt% 0 1 2 3 4 5 0 5 10 15 20 25 30 35 st re ss ( m pa ) strain (%) 20 wt% figure 5: stress-strain curves of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt. at different weight ratios (10 wt%, 20 wt%, 40 wt%) at room temperature and within the frequency range of 100 hz 1 mhz, as shown in fig. 4. it is clear that the alternating electrical conductivity increases significantly as the frequency increases for all the polymer blend films, and this increase is due to increased polarization in the samples [15]. it should be noted that the alternating electrical conductivity in a dielectric material is the amount of power lost when an alternating electric field is exerted, which appears as heat when the dipoles rotate in their positions. the vibration of the charges changes with the alternating electric field, and therefore depends on the frequency [16]. moreover, the alternating electrical conductivity at the same frequency increases with an increase in the weight ratio of added salt. this increase is strongly affected by many factors, including the purity of the material and dispersion. in general, this increase in alternating electrical conductivity is attributed to a decrease in dielectric resistance due to the increase of conductive molecules in the polymeric blend films [17], and also because of the number of charge carriers that have a significant relaxation time due to the high energy barrier [18]. ii. mechanical properties a. tensile test the tensile test was conducted and stress-strain curves were obtained for pure pva:pvp polymer blend films and polymer blend films reinforced by cdcl2·h2o salt at different weight ratios (10 wt%, 20 wt%, 40 wt%) . these curves are shown in fig. 5. the stress-strain of the pure polymer blend films consists of the elastic deformation region showing a linear relationship between stress and strain. from this region young’s modulus can be estimated from the slope of the straight line. the polymeric material within the boundaries of this region suffers from an elastic deformation due to the stretching and elongation of the polymeric chains without breaking the bonds. this curve deviates from linear behavior due to cracks generated within the polymeric material. 
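since young's modulus is read off as the slope of this initial linear region, a compact way to make the estimate reproducible is sketched below; the 1 % strain cut-off used to delimit the elastic region and the example curve are illustrative assumptions, not values taken from the measurements reported here.

```python
# hedged sketch: young's modulus as the slope of the initial (elastic) part of
# a stress-strain curve; the linear-region cut-off is an arbitrary choice.
import numpy as np

def youngs_modulus(strain_percent, stress_mpa, linear_limit_percent=1.0):
    eps_pct = np.asarray(strain_percent, dtype=float)
    sig = np.asarray(stress_mpa, dtype=float)
    mask = eps_pct <= linear_limit_percent            # keep the elastic region only
    slope, _ = np.polyfit(eps_pct[mask] / 100.0, sig[mask], 1)
    return slope                                      # mpa per unit strain

eps = np.linspace(0.0, 5.0, 200)                      # fabricated curve: linear,
sig = 3.3 * eps - 0.25 * eps ** 2                     # then softening
print(f"e ~ {youngs_modulus(eps, sig):.0f} mpa")
```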
these cracks grow and combine with increased stress, creating larger incisions and continuing to grow with stress until a fracture occurs in the sample [19]. in other cases, the fracture begins at the outer surfaces, at sites of deformities or defects such as scratches, holes or internal cracks, which act as areas of stress concentration. this leads to a rise in the stress value to limits where the stress exceeds the internal force of cohesion, and thus breakage occurs. when the cdcl2·h2o salt is added to the pure polymer blend, the stress-strain curve changes and we obtain curves with different properties. figures 6, 7 and 8 show the variations in tensile strength, elongation at break and young's modulus as a function of the weight ratio of the added salt, for all samples.

figure 6: tensile strength of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt.

figure 7: elongation at break of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt.

figure 8: young's modulus of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt.

table 1 shows the values of tensile strength, elongation at break and young's modulus for all polymer films, determined from the stress-strain curves. the tensile strength of the pure polymer blend film is 15.5 mpa, the elongation at break is 28.7 % and young's modulus is 332 mpa; however, when the blend is reinforced with cdcl2·h2o salt, these values change. the tensile strength increases at the 10 wt% weight ratio, reaching 56.1 mpa, and then decreases with an increase in the weight ratio of the salt added. the elongation at break decreases with an increase in the weight ratio of salt added, while young's modulus increases unsystematically to reach its highest value of 7200 mpa at the weight ratio of 40 wt%. the decrease observed in the values of tensile strength, elongation at break and young's modulus at some weight ratios of salt reinforcement, compared with the pure polymer film, is due to weak interaction between the molecules and low interstitial adhesion between the composite components, which leads to an increase in the composite fragility [20]. the increase found in these values at other weight ratios, compared with the pure polymer film, indicates that reinforcement has been achieved. it can be concluded that, at these weight ratios, the added salt is compatible with common addition polymerization and is effectively dispersed in the polymer blend, affecting its mechanical properties [21].

b. hardness test

hardness (shore d) for pure polymer blend films and those reinforced by cdcl2·h2o salt is shown in fig. 9.
it is clear from the figure that the hardness of the polymer blend films increases with an increase in the weight ratio of salt added, reaching its highest value (32.5) at the 20 wt% weight ratio, and then decreases as more salt is added. this decrease is due to the high viscosity gained by the prepared material when high weight ratios of reinforcing salt are added to the matrix (polymer blend), which is in the liquid state. the high viscosity makes penetration of the cdcl2·h2o salt into the interfaces of the polymer blend inefficient and difficult, which leads to the production of many gaps within the prepared composite material when hardened, causing a decrease in the hardness [22]. table 2 shows the hardness values of all polymer blend films.

table 1: tensile property values of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt.
weight ratio of salt (wt%)   tensile strength (mpa)   elongation at break (%)   young's modulus (mpa)
pure (pva:pvp)               15.5                     28.7                      332
10                           56.1                     4.72                      1490
20                           33.2                     2.69                      873
40                           12.6                     0.451                     7200

figure 9: hardness of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt.

table 2: hardness values of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt.
weight ratio of salt (wt%)   hardness (shore d)
pure (pva:pvp)               30.1
10                           30.1
20                           32.5
40                           25.9

c. impact test

the impact test is an important mechanical test used to demonstrate the resistance of a material to collapse under the force of impact in operating conditions, as it measures the actual energy required to break a piece under test [23]. the fracture energy for pure polymer blend films and those reinforced by cdcl2·h2o salt was recorded at different weight ratios, as shown in fig. 10. it can be seen that the fracture energy value for the pure polymer blend film is 0.392 kg.m2/s, and that this value increases as the weight ratio of added salt increases. in other words, the figure shows that the absorbed energy necessary for fracture increases with an increase in the weight ratio of added salt [24]; the cdcl2·h2o salt works to hinder the growth of the crack, changing the shape of the crack and its direction. this change in the shape of the crack increases the surface area of the fracture and the energy spent, all of which lead to an increase in the mechanical resistance of the material [25]. the addition of cdcl2·h2o salt to the pure pva:pvp polymer blend film thus improved the mechanical properties. the reason for the increase in fracture energy with the increase in the weight ratio of added salt is that a large part of the impact energy projected onto the sample is absorbed by the salt, which increases the resistance of the substance [25]. table 3 shows the fracture energy values of all polymer films.

figure 10: fracture energy of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt.

table 3: fracture energy values of pure and pva:pvp polymer blend films reinforced with different weight ratios of cdcl2·h2o salt.
weight ratio of salt fracture energy (wt %) (kg.m2/s) (pva:pvp) 0.392 10 0.392 20 0.392 40 0.539 120006-6 papers in physics, vol. 12, art. 120006 (2020) / r. s. mahmood et al. iv. conclusions the results for polymer blend films reinforced with cdcl2·h2o salt show that the values of the dielectric constant, loss factor, and alternating electrical conductivity increase with an increase in the weight ratio of cdcl2·h2o salt at the same frequency. in general, this increase is attributed to an increase in polarization. this is why all these polymer blends can be used in the manufacture of electric batteries. the results of the tension test for pva:pvp polymer blend films reinforced with cdcl2·h2o salt show a change in the values of tensile strength, elongation at break and young’s modulus as the salt weight ratio increases. it can be concluded that, at these weight ratios, the added salt is compatible with common addition polymerization and is effectively dispersed in the polymer blend, affecting its mechanical properties. the hardness value first increases then decreases as the salt weight ratio increases. this decrease is due to the high viscosity gained by the prepared material when adding high weight ratios of reinforcing salt to the matrix (polymer blend), which is in the liquid state. the fracture energy value increases with increasing salt weight ratio, where the salt works to hinder the groh of the crack and this changes the shape of the crack and its direction. this change in the shape of the crack increases the surface area of the fracture and the spent energy, all of which lead to an increase in the mechanical resistance of the material, thus they could be good candidates for hard adhesives with low flexibility. [1] s a salman, n a bakr, s s abduallah, study of thermal decomposition and ftir for pvaalcl3 composite films, j. eng. appl. sci. 14, 717 (2019). [2] s a salman, n a bakr, m r jwameer, effect of annealing on the optical properties of (pvacucl) composites, int. lett. chem. phys. astron. 63, 98 (2016). [3] b lobo, m r ranganath, t s g ravi chandran, g venugopal rao, v ravindrachary, s gopal, iodine-doped polyvinylalcohol using positron annihilation spectroscopy, phys. rev. b 59, 13693 (1999). [4] l m robeson, polymer blends handbook, springer-kluwer academic publishers, netherlands (2003). [5] j r fried, polymer science and technology, prentice hall inc., upper saddle rivers, new jersey (2014). [6] r singh, s g kulkarni, morphological and mechanical properties of poly(vinyl alcohol) doped with inorganic fillers, int. j. polym. mater. po. 62, 351 (2013). [7] p j liu, w h chen, y liu, s b bai, q wang, thermal melt processing to prepare halogen-free flame retardant poly(vinyl alcohol), polym. degrad. stabil. 109, 261 (2014). [8] r s mahmood, s a salman, n a bakr, optical, thermal properties of cadmium chloride reinforced pva:pvp blend films, journal of polymer & composites 8, 46 (2020). [9] b h rabee, a hashim, synthesis and characterization of carbon nanotubes-polystyrene composites, eur. j. sci. res. 60, 247 (2011). [10] g c psarras, k g gatos, p k karahaliou, s n georga, c a krontiras, j karger-kocsis, relaxation phenomena in rubber/layered silicate nanocomposites, express polymer letters 1, 837 (2007). [11] sk s basha, m gnanakiran, b r kumar, k v b reddy, m v b rao, m c rao, synthesis and spectral characterization on pva/pvp: go based blend polymer electrolytes, rasayan j. chem. 10, 1159 (2016). 
[12] s b aziz, o gh abdullah, s a hussein, h m ahmed, effect of pva blending on structural and ion transport properties of cs:agnt-based polymer electrolyte membrane, polymers 9, 622 (2017). [13] f t m noori, e m mehdi, plasma effect on ac electrical properties of (pva-pvp-mno2) nano composites for piezoelectric application, int. j. sci. res. 7, 483 (2018). 120006-7 http://dx.doi.org/10.36478/jeasci.2019.717.724 http://dx.doi.org/10.36478/jeasci.2019.717.724 http://dx.doi.org/10.18052/www.scipress.com/ilcpa.63.98 http://dx.doi.org/10.18052/www.scipress.com/ilcpa.63.98 https://doi.org/10.1103/physrevb.59.13693 https://doi.org/10.1103/physrevb.59.13693 https://doi.org/10.1080/00914037.2012.700288 https://doi.org/10.1080/00914037.2012.700288 https://doi.org/10.1016/j.polymdegradstab.2014.07.021 https://doi.org/10.37591/jopc.v8i1.3478 https://doi.org/10.37591/jopc.v8i1.3478 https://www.researchgate.net/profile/ahmed_hashim19/publication/310141127_synthesis_and_characterization_of_carbon_nanotubes-polystyrene_composites/links/5829a5d808aef00c20560d04.pdf https://doi.org/10.3144/expresspolymlett.2007.116 https://doi.org/10.3144/expresspolymlett.2007.116 http://dx.doi.org/10.7324/rjc.2017.1041756 http://dx.doi.org/10.7324/rjc.2017.1041756 https://doi.org/10.3390/polym9110622 https://doi.org/10.3390/polym9110622 https://www.ijsr.net/archive/v7i1/art20177355.pdf papers in physics, vol. 12, art. 120006 (2020) / r. s. mahmood et al. [14] k k verma, m s alam, r k sinha, r k shukla, dielectric, electrical and microstructural properties of unfilled and mwcnts filled polystyrene nanocomposite prepared by in-situ polymerization technique using ultrasonic irradiation, indian j. pure ap. phy. 52, 614 (2015). [15] m t ramesan, v k athira, p jayakrishnan, c gopinathan, preparation, characterization, electrical and antibacterial properties of sericin/poly(vinyl alcohol)/poly(vinyl pyrrolidone) composites, j. appl. polym. sci. 133, 43535 (2016). [16] r popielarz, c k chiang, r nozaki, j obrzut, dielectric properties of polymer/ferroelectric ceramic composites from 100 hz to 10 ghz, macromolecules 34, 5910 (2001). [17] s m m stowe, ph.d. thesis, department of physics, university of baghdad faculty of education ibn al-haitham, baghdad (2005). [18] m b nanda prakash, a manjunath, r somashekar, studies on ac electrical conductivity of cdcl2 doped pva polymer electrolyte, adv. cond. matter phys. 2013, 1 (2013). [19] x d yu, m malinconico, e martuscelli, highly filled particulate composites enhancement of performances by using compound coupling agents, j. mater. sci. 25, 3255 (1990). [20] c ravindra, m sarswati, g sukanya, p shivalila, y soumya, k deepak, tensile and thermal properties of poly (vinyl pyrrolidone)/vanillin incorporated poly (vinyl alcohol) films, res. j. physical sci. 3, 1 (2015). [21] y luo, x jiang, w zhang, x li, effect of aluminium nitrate hydrate on the crystalline, thermal and mechanical properties of poly(vinyl alcohol) film, polym. polym. compos. 23, 555 (2015). [22] s i salih, k m shabeeb, q a hamad, studying mechanical properties for polymer matrix composite material reinforced by fibers and particles, j. tech. univ. 4, 81 (2010). [23] m c gupta, a p gupta, polymer composite, new age international publishers, new delhi (2005). [24] a m abdullah, a m hashim, a j bader, effect of alumina particles on the mechanical properties of discontinuous glass fiber reinforced unsaturated polyester composites, al-qadisiyah journal for engineering sciences 4, 170 (2011). 
[25] a a al-jubouri, a i al-mousawi, kh a ismail, a j salman, effect of reinforcing by magnesium oxide particles on thermal conductivity and mechanical properties of vinyl ester resin polymer composite, j. karbala univ. special issue 6th. scientific conference, 26 (2010).

papers in physics, vol. 12, art. 120003 (2020) received: 1 june 2019, accepted: 8 june 2020 edited by: c. muravchik reviewed by: r. cofre licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.120003 www.papersinphysics.org issn 1852-4249

further results on why a point process is effective for estimating correlation between brain regions

i cifre1, 2∗, m zarepour3, 4, s g horovitz5, s a cannas3, 4†, d r chialvo2, 4, 6

signals from brain functional magnetic resonance imaging (fmri) can be efficiently represented by a sparse spatiotemporal point process, according to a recently introduced heuristic signal processing scheme. this approach has already been validated for relevant conditions, demonstrating that it preserves and compresses a surprisingly large fraction of the signal information. here we investigated the conditions necessary for such an approach to succeed, as well as the underlying reasons, using real fmri data and a simulated dataset. the results show that the key lies in the temporal correlation properties of the time series under consideration. it was found that signals with slowly decaying autocorrelations are particularly suitable for this type of compression, where inflection points contain most of the information.

∗ icifre@gmail.com † sergiocannas@gmail.com

1 facultat de psicologia, ciències de l’educació i de l’esport, blanquerna, universitat ramon llull, c. císter 34, barcelona, (08022), spain. 2 center for complex systems & brain sciences (cemsc3), universidad nacional de san martín, 25 de mayo 1169, san martín, (1650), buenos aires, argentina. 3 instituto de física enrique gaviola (ifeg), facultad de matemática, astronomía, física y computación, universidad nacional de córdoba, ciudad universitaria, (5000), córdoba, argentina. 4 consejo nacional de investigaciones científicas y técnicas (conicet), godoy cruz 2290, buenos aires, argentina.
5 national institute of neurological disorders and stroke, national institutes of health, bethesda, md, usa. 6 instituto de ciencias f́ısicas (icifi). escuela de ciencia y tecnoloǵıa, universidad nacional de san mart́ın, 25 de mayo 1169, san mart́ın, (1650), buenos aires, argentina. i. introduction the large-scale dynamics of the brain exhibit a plethora of spatiotemporal patterns. an important methodological challenge is to define adequate coarse-graining of the brain imaging data which comprises thousands of the so-called bold (“blood oxygen level dependent”) time series. the usual analysis aims at identification of bursts of correlated activity across certain regions, which requires extensive computations, complicated in part by the large size of the data sets. a decade ago it was discovered that this type of problem can be simplified efficiently by using only the timings of the peak amplitude signal events; i.e., converting the raw continuous signal into a point process (pp) [1–7]. subsequent work using similar approaches [8–14] further confirmed that the method entails large compression of the original signals without significant loss of information. overall, these findings not only suggest a way to speed up computations, but most importantly highlight the need to clarify which aspects or features of the brain imaging signals contain the most relevant information. 120003-1 papers in physics, vol. 12, art. 120003 (2020) / i. cifre et al. fmri data t=1 t=2 voxel number vo xe l n um be r t=t a b c b o ld ( s. d. ) 0 50 100 -2 0 2 0 50 100 -2 0 2 0 50 100 -2 0 2 1 s.d. j=2 j=1 j=j time (a.u.) voxel (i) vo xe l ( j) figure 1: the basic aspects to consider in defining the point process of the bold signal. the traces on panel b are examples of raw time series (j) of fmri bold signals at three brain locations (called “voxels”). time points are selected at the upward threshold crossings or the peaks of the signal (filled circles). the temporal co-occurrence of these points defines co-activation matrices for different lengths of time (graphs in c ), which can be further averaged to estimate the correlation matrix for the entire time t of the system under study. the present work is dedicated to identifying the reasons underlying the effectiveness of this approach. the results will show that the key lies in the temporal correlation properties of the time series under consideration: signals with temporal correlations are particularly suitable for this type of compression because inflection points contain most of the information. the paper is organized as follows: in the next section the point process is defined and a simple example is presented. section 3 discusses the main reason the point process works, emphasizing the relevance of the bold autocorrelations. this result is further tested in section 4 using groundtruth simulated bold data in which the autocorrelation is altered. the paper closes with a brief discussion to summarize the relevance of the main conclusion. ii. definitions and examples the basic steps that have been used [1–4] to define the point process in brain signals are summarized in fig. 1. the raw data consist of time series recorded from the brain using functional magnetic resonance imaging (fmri) corresponding to the activity of one of many thousands of small brain regions. 
it is accepted that this imaging technique measures a “blood oxygen level-dependent” signal (i.e., “bold”) in each small region, giving an estimation of the blood oxygen saturation, which itself is proportional to local neuronal activity. the point process can be defined in different ways, but for the reasons discussed later, the end results are equivalent. as shown in fig. 1, time points can be selected at the upward threshold crossings (here at unity) of the signal (filled circles). a second approach is to construct the point process by selecting the local peaks and/or valleys of the bold time series. for ease of discussion, we will deal with the second option in this paper. the temporal co-occurrence of the points defines the co-activation matrix (bottom graphs), which can be further averaged to estimate the correlation matrix of the system under study. figure 2 illustrates, for those unfamiliar to the subject, three examples of typical bold time series that are usually recorded from the brain. from visual inspection it is already apparent that they are smooth traces, exhibiting temporal correlations, as will be further discussed later. there are also spatial correlations, for instance between the top two traces, which is evidenced in the images’ heat map between counter-lateral regions. a qualitative comparison of how well it works: it has already been established, in different circumstances [2–4], that the co-activation matrix obtained with the point process method is very similar to the correlation matrix computed from the full (i.e., continuous) bold signal. figure 3 shows an example of a correlation matrix constructed from the point process computed from a subject while resting (data fully described in [4]). the results demonstrate that as few as 4 points are already sufficient to define clusters of co-activation, as demonstrated previously in [2–4]. in addition, the results here show how de-activations (i.e., blueish colors) can also be evidenced by the pp approach. 120003-2 papers in physics, vol. 12, art. 120003 (2020) / i. cifre et al. figure 2: right: examples of typical bold time series from three selected sites in the brain. the dashed box in the middle time series indicates a portion of the signal used later in the analysis presented in fig. 4. left: images correspond to a snapshot of activity amplitudes (top), the correlations between the three selected seeds (red dotted circles) and the rest of the brain (middle three images), and the corresponding brain structural slice at mni coordinate z=10 (bottom). the mni x, y, z coordinates are -31 -95 10, respectively, for the top trace, 43 -73 10 for the middle trace and -10 -31 10 for the bottom trace. iii. why does it work? a simple theory as discussed previously, the example in fig. 3 implies a large compression; the question then is why a few points are enough to compute results similar to those obtained with the full signal. a simple visual inspection of the bold traces reveals that the type of signals we are dealing with are temporally correlated. this is very well known; the neuronal activity is temporally and spatially correlated, and furthermore, the activity is convoluted by the hemodynamic transfer function which in itself introduces additional temporal correlations. therefore, for any time series with these properties, it seems natural to think that the most informative points are those in which its derivative changes sign. 
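for readers who prefer a concrete recipe, the following minimal sketch (a toy implementation under stated assumptions, not the code used in refs. [1–4]) extracts exactly those points, i.e., the suprathreshold local peaks of each normalized time series, and builds the co-activation matrix from their temporal co-occurrence.

```python
# toy point-process reduction: z-score each series, keep local maxima above a
# threshold, and count joint events between every pair of series.
import numpy as np

def point_process(bold, threshold=1.0):
    """bold: (n_series, n_time) array -> boolean event raster, same shape."""
    z = (bold - bold.mean(axis=1, keepdims=True)) / bold.std(axis=1, keepdims=True)
    events = np.zeros_like(z, dtype=bool)
    is_peak = (z[:, 1:-1] > z[:, :-2]) & (z[:, 1:-1] > z[:, 2:])
    events[:, 1:-1] = is_peak & (z[:, 1:-1] > threshold)
    return events

def coactivation(events):
    """fraction of frames in which each pair of series has an event together."""
    e = events.astype(float)
    return e @ e.T / events.shape[1]

rng = np.random.default_rng(0)
toy_bold = np.cumsum(rng.standard_normal((5, 200)), axis=1)   # autocorrelated toys
print(coactivation(point_process(toy_bold)).round(3))
```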
the other points are redundant since they can be predicted, to a certain degree, by a linear estimator. this is illustrated in fig. 4, using as an example two minutes of bold recording (normalized by its standard deviation σ). after setting a threshold ν, the inflection points larger than a given ν value are identified. these points constitute the marked point process in question. figure 3: example of correlation maps obtained from the raw bold time series of length n=235 (right panel) and from the derived point process (left panels) for different numbers of points (n=4,7,14,26). the left images represent, as “heat maps”, the co-activation of the seed (located at mni coordinates x=4, y=-60, z=18) with respect to each voxel. note that a few points already suffice to identify well-defined clusters that are 1-4 standard deviations away from chance co-activations. the right panel corresponds to the pearson correlation computed from the entire length of the raw bold time series. red/blue colors label positive/negative point co-activations in the case of the left maps and positive/negative correlations in the case of the right map. now we ask how much of the raw signal is left out if these inflection points are used to extrapolate a piece-wise linear time-series. to answer this we analyze bold time series from the brain of a subject during an experiment in which fmri data are collected at rest [3]. we proceed to compute the linear correlation between the two time series, the raw and the piece-wise linear one. in panels d and e the results are shown for different values of threshold ν (in units of σ) as well as for the correlation of the time series, estimated by the value of the first autocorrelation coefficient γ. panel d shows that as the bold signal autocorrelation increases, the similarity between the piece-wise linear and the raw signals increases, evaluated in two ways: by the root mean squared error (rmse) and by the linear correlation 〈r〉 between the two time series. as expected, raising the threshold ν above zero produces an increasing loss of information about the signal, which is reflected in a monotonic increase in the rmse and a decrease in the 〈r〉 values (see panel e). 120003-3 papers in physics, vol. 12, art. 120003 (2020) / i. cifre et al. 0 0.5 1 threshold (υ) 0 0.5 1 < r > ; < r m s e > 360 390 420 450 480 time (sec.) -2 -1 0 1 2 b o l d s ig n a l 0.4 0.8 autocorrelation ( γ ) 0.6 0.8 1 < r > ; < r m s e > 0.4 0.8 autocorrelation ( γ ) 0.6 0.8 1 < r > ; < r m s e > < r > < rmse > 0 0.5 1 threshold (υ) 0 0.5 1 < r > ; < r m s e > a c b d e synthetic experimental figure 4: why it works: the trace in panel a is an example of a two-minute recording of a bold brain signal during rest (denoted by the dotted box in fig. 2). the point process is defined by the timing of the peaks and valleys larger than a given threshold, (two are indicated here by arrows). the points, in this case, are only six (dots depicted in the bottom trace) out of the 120 samples of the original time series. the ability of these six points to preserve information about the original bold signal can be estimated by its similarity with a piece-wise linear time series (dashed red lines) constructed by joining the peaks and valleys. horizontal dotted lines in panel a denote the threshold for the example. 
panels d and e correspond to the computed similarity between the bold signals and the piece-wise linear signals (evaluated by the correlation 〈r〉 and rmse values) for different autocorrelation γ and threshold ν values. panels b and c correspond to similar calculations using synthetic time series. for panels b and d ν was fixed at 1. for panels c and e γ was 0.85. according to the present hypothesis, the functional dependence shown by the bold signals in panels d and e will be replicated using synthetic signals with similar autocorrelation properties. to this end, we generate artificial time series with autocorrelation values identical to those of the bold signals, using the matlab routine f_alpha_gaussian.mfrom [16] (see also source codes at [17]). panels b and c show that the behavior with respect to the threshold ν and γ for the synthetic and empirical data are very similar. the results show that the key to understanding why the approach works lies in the correlation properties of the time series under consideration. in synthesis, it is found that signals with long-range correlations are particularly suitable for this type of compression, where inflection points contain most of the information. the results also apply to other signals from any origin, as long as their autocorrelation features are similar. iv. further testing using synthetic ground-truth data. the results in the previous section emphasize the relevance of the individual signal’s autocorrelation as the main property related to the ability of a point process to preserve information about the original signal, and consequently, about functional connectivity between signals. we can test this by manipulating the autocorrelation in any given system for which the “ground-truth” crosscorrelations are known. the data reported by smith et al. [19] can be used for our purpose. the authors in [19] reviewed and compared the available fmri analysis methods ranging from simple measures of pair-wise linear correlations to sophisticated multivariate approaches. in the process, they generated diverse and realistic simulated fmri data sets describing different underlying networks. these simulations were based on the dy120003-4 papers in physics, vol. 12, art. 120003 (2020) / i. cifre et al. external inputsnetwork nodes a b c 0 100 200 samples 1 25 50 n o d e s n od e j node i 1 25 50 50 25 1 a (i, j) 0.4 0.2 0 -1 figure 5: simulated fmri network from ref. [19]. panel a depicts the topology and panel b the adjacency matrix of the interactions of the nodes, where negative values (labeled red) correspond to self-interactions. connections are directed: a node in the upper diagonal of the matrix denotes a directed connection from a lower-numbered node to a higher-numbered one. panel c shows an example of the bold time series simulated on this network. namic causal modelling (dcm) [20] fmri forward model (see [19] for full details). we used these simulated bold time series (downloaded from the authors’ site [22]) to test the point process approach in comparison with standard correlation methods. first, for completeness, we briefly describe the essence of the model used in these simulations. the model: smith et al. simulations used a neural network model coupled with buxton’s nonlinear balloon model [21] for the vascular dynamics. each neural network node has a binary external input (square symbols in the top diagram of fig. 5). 
the state of the inputs (i.e., active or inactive) is given by a poisson process which controls the probability of switching states, and can be seen as external signals or as noise at the neural level. subsequently, the neural signals propagate across the network according to the dcm neural network model, where node interactions are defined by the a network matrix: ż = σaz + cu (1) where z is the neural time series, ż is its rate of change, u are the external inputs and c the weights controlling how the external inputs feed into the network (here just through the identity matrix). the off-diagonal terms in a determine the network connections between nodes (arrows in fig. 5a), and the diagonal elements are all set to -1 to model within-node temporal decay; in this way, σ controls both the within-node (neural) temporal inertia and the temporal lag between nodes. the time series of the activity for each node is then fed to a nonlinear balloon model for vascular dynamics [21] output, which is a function of the changing neural activity. the parameters were adjusted by the authors in order to match bold time series seen for typical data of brain resting activity recorded with 3 tesla fmri technology. finally, various sources of variability were added to account for realistic expectations. the bold time series and the underlying ground-truth network matrices are accessible on the authors’ site [22]. in the following paragraphs, we will explore the ability of the point process to extract the underlying network and to compare it with the commonly-used correlation method. figure 5 shows the ground-truth network con120003-5 papers in physics, vol. 12, art. 120003 (2020) / i. cifre et al. 0.3 0.6 0.9 1.2 υ 0.08 0.09 0.1 < r m s > 0.3 0.6 0.9 1.2 υ 0.4 0.5 0.6 < r > 0 0.25 0.5 0.75 1 false positive rate 0 0.25 0.5 0.75 1 t ru e p o s it iv e r a te (t.max. =400) pp (t. max.=2000) pp (t. max.=400) raw (t. max.=2000) raw 400 800 1200 1600 2000 t. max (samples) 0.6 0.8 1 a u c raw pp a b c d figure 6: panels a and b: comparison of the partial correlation matrices –one calculated with the original bold data (raw) and the other from its derived point process (pp)– as a function of the threshold ν used to define the point process. panel a corresponds to the root mean squared values of the differences, and panel b to the correlation coefficients. dots correspond to individual results while averages (filled circles) and error bars correspond to the mean and standard deviation of the 50 simulated bold records (time series length t=200 samples) panel c shows the receiver operating characteristic curve obtained for both methods in detecting the presence of links (ν = 0.7 and σ = 2.2) computed for two time series of maximum length (t. max. indicated in the legend). panel d corresponds to the area under the roc curve, computed for both methods as a function of the increasing maximum length, t. max., of the time series considered. sidered here (file sim4.mat downloaded from the site [22]). panels a and b illustrate the topology and the adjacency matrix of the network (a in eq. 1). it comprises 10 regular modules interconnected by a few links, a typical small-word graph; a total of 50 nodes interconnected by 61 positive offdiagonal interactions (40 of which correspond to nearest neighbors), as well as 50 negative (diagonal) self-interactions. panel cshows typical traces of the computed bold time series recorded at the 50 nodes, simulating data from a fmri typical session. 
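to make the neural part of eq. (1) concrete, the sketch below integrates ż = σaz + cu with a simple euler scheme and poisson-like switching of the binary inputs; the toy connectivity, switching rate, σ and time step are arbitrary illustrative choices, and the balloon vascular stage used in ref. [19] is deliberately left out.

```python
# illustrative euler integration of z' = sigma*A*z + C*u (neural stage only);
# all parameter values here are arbitrary and not those of ref. [19].
import numpy as np

rng = np.random.default_rng(1)
n, dt, steps, sigma = 50, 0.01, 5000, 0.5

A = -np.eye(n)                         # diagonal set to -1: within-node decay
for i in range(n - 1):
    A[i, i + 1] = 0.4                  # a toy chain of directed connections
C = np.eye(n)                          # inputs feed each node directly

u = np.zeros(n)                        # binary external inputs
z = np.zeros(n)
traj = np.empty((steps, n))
for t in range(steps):
    flip = rng.random(n) < 0.01        # poisson-like switching of the inputs
    u[flip] = 1.0 - u[flip]
    z = z + dt * (sigma * A @ z + C @ u)   # euler step of eq. (1)
    traj[t] = z
print(traj[-1, :5].round(3))
```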
the data set contains 50 stochastic realizations representing simulated fmri records of different human subjects. extracting the correlation graph: to benchmark the relative merits of the point process approach, we first extracted the point process from the bold time series, and then computed the covariance matrices for both the point process and the raw bold dataset. following this, the partial correlation from both matrices was calculated (as in [19]), and their differences compared for various levels of the threshold ν used to define the point process. as seen in figs. 6a and 6b, there was an optimum threshold (for this dataset ∼ ν = 0.7 ) at which the correlation matrices became more similar. we then proceeded to check how well the method performed in predicting the underlying connectivity graph from the time series. specifically, we checked how well the correlation matrices from both the point process and the raw bold dataset described the off-diagonal elements depicted in fig. 5b, i.e., the synthetic ground-truth network. we used the receiver operating characteristic curve (roc) [23], which benchmarks specificity and sensitivity as a function of a given parameter. to determine whether a connection is predicted or not between two nodes, we chose a threshold of 2 σ (corresponding to p = 95%) at the (i,j) partial correlation matrices entry. in general, to obtain the roc curve a given range of relevant parameters must be explored while the true/false positive/negative predictions are counted. here, for convenience, we chose to explore a range of time series lengths from relatively very short (set to 20 samples here) up to a variable maximum length, t.max. for the examples presented, the values of t.max. ranged from 400 to 2000 samples. figure 6c illustrates the results, where the family of curves (triangles for raw data and circles for the point process) corresponds to time series of various maximum lengths (t.max.). as expected, the shortest t.max (400 samples) gave the lowest confident results, while the longest t.max. (2000 samples) resulted in a very good estimation of the true network connections. the area under the curve, plotted in fig. 6d, is a good estimation of the relative goodness of the prediction, where a value of 1 corresponds to a perfect prediction and a value of 0.5 is equivalent to chance. note that the main motivation for these numerical simulations is to demonstrate that there is close similarity between the pp and raw roc curves in fig. 6d. this similarity is indicative of the good performance of the point process approach compared with the use of the raw time series. 120003-6 papers in physics, vol. 12, art. 120003 (2020) / i. cifre et al. v. discussion and conclusion in this note, we revisited the heuristic point process approach originally introduced by tagliazucchi et al. [1–3] to represent brain spatiotemporal dynamics in terms of the relatively high-amplitude inflection points of the bold signal. at the same time caballero and colleagues [5–7] independently reported similar results, but these were based on de-convolution techniques. later some extensions of the approach were presented by several authors [8–11]. why it works: the present results show that the pp approach works due to a rather trivial fact: in any case of rather strongly autocorrelated signals, the most informative points are the inflection points; the remaining samples are more or less interpolated by straight lines (see fig.4a) which can in principle, and for certain applications, be ignored. 
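the piece-wise linear argument of fig. 4 is easy to reproduce; the toy example below (a sketch, not the analysis pipeline of the paper) builds an autocorrelated synthetic trace, keeps only its suprathreshold peaks and valleys, rebuilds the signal by joining them with straight lines, and reports the correlation and rmse with respect to the raw trace.

```python
# why it works, in a few lines: the extrema of a smooth signal already carry
# most of the information needed to reconstruct it piece-wise linearly.
import numpy as np

rng = np.random.default_rng(2)
x = np.convolve(rng.standard_normal(2000), np.ones(25) / 25, mode="same")
x = (x - x.mean()) / x.std()                      # toy autocorrelated "bold" trace

d = np.diff(x)
turning = np.where(np.sign(d[1:]) != np.sign(d[:-1]))[0] + 1   # peaks and valleys
keep = turning[np.abs(x[turning]) > 0.5]                       # threshold nu = 0.5
knots = np.concatenate(([0], keep, [len(x) - 1]))

x_pw = np.interp(np.arange(len(x)), knots, x[knots])           # piece-wise linear
r = np.corrcoef(x, x_pw)[0, 1]
rmse = np.sqrt(np.mean((x - x_pw) ** 2))
print(f"kept {len(keep)} of {len(x)} samples, r = {r:.2f}, rmse = {rmse:.2f}")
```

raising the threshold, or using a signal with weaker autocorrelation, should degrade r and raise the rmse, qualitatively reproducing the trends of panels b-e of fig. 4.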
thus, in general, it is expected that time series that exhibit slowly decaying temporal correlations will be particularly suitable for this type of approach. relevance of the current results for functional connectivity of the brain: since its introduction, it has been suggested that the pp (or its variants) contains dynamical information, in the sense that it is potentially able to identify the timing and the location of fluctuating epochs of high correlations among brain regions. this identification has recently acquired relevance in the context of what is now dubbed “dynamical functional connectivity”, a very active area of research in the neuro-imaging community (see for instance the reviews by keilholz et al. [25] and iraji et al. [27]. in line with this, the recent report of esfahlani et al. [26] emphasizes the fact that few events of co-activation can estimate the functional connectivity architecture of a system, a finding which is in full agreement with our arguments. thus, it is very important to understand that behind all these reports there is a basic reason why these few points contain most of the information. there is a lot of room for further investigation based on estimation of the correlation between these relatively large-amplitude inflection points. in particular, it seems a promising approach for inspection of non-stationarities in fmri bold data, in certain pathologies that are known to exhibit bursts of non-stationarity, such as in parkinson disease syndrome and tourette disease syndrome. typical of both cases is the existence of few epochs of coherent brain activity, which can be blurred if only the average functional connectivity is computed. finally, it seems reasonable that the approach can be applied beyond brain research, to inspect similar problems in other fields. acknowledgements the authors thank steve m. smith and mark woolrich (imperial college, london) for sharing information on the netsim package. this work was partially supported by nih (u.s.) grant no. 1u19ns107464-01. ic was supported by ministerio de economa, industria y competitividad (spain) grant psi2017-82397-r. mz, drc, & sac were supported in part by conicet (argentina). drc is grateful for the support of the escuela de ciencia y tecnoloǵıa, unsam. [1] e tagliazucchi, p balenzuela, d fraiman, d r chialvo, brain resting state is disrupted in chronic back pain patients, neurosci. lett. 485, 26 (2010). [2] e tagliazucchi, p balenzuela, d fraiman, p montoya,d r chialvo, spontaneous bold event triggered averages for estimating functional connectivity at resting state, neurosci. lett. 488, 158 (2011). [3] e tagliazucchi, p balenzuela, d fraiman, d r chialvo, criticality in large-scale brain fmri dynamics unveiled by a novel point process analysis, front. physiol. 3, 15 (2012). [4] e tagliazucchi, m siniatchkin, h laufs, d r chialvo, the voxel-wise functional connectome can be efficiently derived from co-activations in a sparse spatio-temporal point-process, front. neurosci. 10, 381 (2016). [5] n petridou, c c gaudes, l l dryden, s t francis, p a gowland, periods of rest in fmri contain individual spontaneous events which are related to slowly fluctuating spontaneous activity, hum. brain mapp. 34, 1319 (2013). 
120003-7 https://doi.org/10.1016/j.neulet.2010.08.053 https://doi.org/10.1016/j.neulet.2010.08.053 https://doi.org/10.1016/j.neulet.2010.11.020 https://doi.org/10.1016/j.neulet.2010.11.020 https://doi.org/10.3389/fphys.2012.00015 https://doi.org/10.3389/fnins.2016.00381 https://doi.org/10.3389/fnins.2016.00381 https://doi.org/10.1002/hbm.21513 papers in physics, vol. 12, art. 120003 (2020) / i. cifre et al. [6] t w allan, s t francis, c caballero-gaudes, p g morris, e b liddle, p f liddle, m j brookes, p a gowland, functional connectivity in mri is driven by spontaneous bold events, plos one 10, 4 (2015). [7] f i karahanoglu, d van de ville, transient brain activity disentangles fmri resting-state dynamics in terms of spatially and temporally overlapping networks, nat. commun. 6, 7751, (2015). [8] w li, y li, c hu, x chen, h dai, point process analysis in brain networks of patients with diabetes, neurocomputing 145, 182 (2014). [9] x liu, j h duyn, time-varying functional network information extracted from brief instances of spontaneous brain activity, p. natl. acad. sci. usa. 110, 4392 (2013). [10] x liu, c chang, j h duyn, decomposition of spontaneous brain activity into distinct fmri co-activation patterns, front. syst. neurosci. 7, 10 (2013). [11] j e chen, c chang, m d greicius, g h glover, introducing co-activation pattern metrics to quantify spontaneous brain network dynamics, neuroimage 111, 476 (2015). [12] x jiang, j lv, d zhu, t zhang, x hu, l guo, t liu, integrating group-wise functional brain activities via point processes, ieee 11th international symposium on biomedical imaging (isbi), 669 (2014). [13] e amico et al, posterior cingulate cortexrelated co-activation patterns: a resting state fmri study in propofol-induced loss of consciousness, plos one 9, 6 (2014). [14] g r wu, w liao, s stramaglia, d r ding, h chen, d marinazzo, a blind deconvolution approach to recover effective connectivity brain networks from resting state fmri data, med. image. anal. 17, 365 (2013). [15] d cordes, v haughton, j d carew, k arfanakis, k maravilla, hierarchical clustering to measure connectivity in fmri resting-state data, magn. reson. imaging 20, 305 (2002). [16] m stoyanov, m gunzburger, j burkardt, pink noise, 1/fα noise, and their effect on solutions of differential equations, int. j. uncertain. quan. 1, 257 (2011). [17] https://people.sc.fsu.edu/∼jburkardt [18] v m eguiluz, d r chialvo, g a cecchi, m baliki, a v apkarian, scale-free brain functional networks, phys. rev. lett. 94, 018102 (2005). [19] s m smith, k l miller, g salimi-khorshidi, m webster, c f beckmann, t e nichols, j d ramsey, m w woolrich, network modelling methods for fmri, neuroimage, 54, 875 (2011). [20] k j friston, l harrison, w penny, dynamic causal modelling, neuroimage 19, 1273 (2003). [21] r buxton, e wong, l frank, dynamics of blood flow and oxygenation changes during brain activation: the balloon model, magn. reson. med. 39, 855 (1998). [22] http://www.fmrib.ox.ac.uk/datasets/netsim/ [23] t fawcett, an introduction to roc analysis, pattern recogn. lett. 27, 861 (2006). [24] d gutierrez-barragan, m a basson, s panzeri, a gozzi, infraslow state fluctuations govern spontaneous fmri network dynamics, curr. biol. 29, 2295 (2019). [25] s keilholz, c caballero-gaudes, p bandettini, g deco, v calhoun, time-resolved restingstate functional magnetic resonance imaging analysis: current status, challenges, and new directions, brain connectivity, 7, 465 (2017). 
[26] f z esfahlani, y jo, j faskowitz, l byrge, d p kennedy, o sporns, r f betzel, high-amplitude co-fluctuations in cortical activity drive functional connectivity, biorxiv 800045, (2020). [27] a iraji, a faghiri, n lewis, z fu, s rachakonda, v d calhoun, tools of the trade: estimating time-varying connectivity patterns from fmri data, psyarxiv 1 (2020).

papers in physics, vol. 12, art. 120002 (2020) received: 20 february 2020, accepted: 25 may 2020 edited by: s. rafai reviewed by: j. a. isler licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.120002 www.papersinphysics.org issn 1852-4249

vortex dynamics under pulsatile flow in axisymmetric constricted tubes

nicasio barrere1*, javier brum1, alexandre l’her1, gustavo l. sarasúa1, cecilia cabeza1

improved understanding of how vortices develop and propagate under pulsatile flow can shed important light on the mixing and transport processes occurring in such systems, including the transition to turbulent regime. for example, the characterization of pulsatile flows in obstructed artery models serves to encourage research into flow-induced phenomena associated with changes in morphology, blood viscosity, wall elasticity and flow rate. in this work, an axisymmetric rigid model was used to study the behaviour of the flow pattern with varying degrees of constriction (d0) and mean reynolds (r̄e) and womersley numbers (α). velocity fields were obtained experimentally using digital particle image velocimetry, and generated numerically. for the acquisition of data, r̄e was varied from 385 to 2044, d0 was 1.0 cm and 1.6 cm, and α was varied from 17 to 33 in the experiments and from 24 to 50 in the numerical simulations.
results for the reynolds numbers considered showed that the flow pattern consisted of two main structures: a central jet around the tube axis and a recirculation zone adjacent to the inner wall of the tube, where vortices shed. using the vorticity fields, the trajectory of vortices was tracked and their displacement over their lifetime calculated. the analysis led to a scaling law equation for maximum vortex displacement as a function of a dimensionless variable dependent on the system parameters re and α. i. introduction research on the dynamics of pulsatile flows through constricted regions has multiple applications in biomedical engineering and medicine. cardiovascular diseases are the primary cause of death worldwide (accounting for 31% of total deaths in 2012) [1]. more than half of these deaths could have been avoided by prevention and early diagnosis. atherosclerosis is characterized by the accumulation of fat, cholesterol and other substances in the intima layer, creating plaques which obstruct the arterial lumen. atherosclerotic plaque fissuring and/or breaking are the major causes of cardiovas* nbarrere@fisica.edu.uy [1] instituto de f́ısica, facultad de ciencias, udelar, montevideo 11400, uruguay cular stroke and myocardial infarct. several factors leading to plaque complications have been reported: sudden increase in luminal pressure [2], turbulent fluctuations [3], hemodynamic shear stress [4], vasa vasorum rupture [5], material fatigue [6] and stress concentration within a plaque [7]. as most of these factors are flow dependent, understanding how a pulsatile flow behaves through a narrowed region can provide important insights for the development of reliable diagnostic tools. flow alterations due to arterial obstruction (i.e. stenosis) and aneurysms have been reported in the literature [8–11]. simplified models of stenosed arteries have used constricted rigid tubes [12–15]. at moderate reynolds numbers, the altered flow splits at the downstream edge of the constriction into a high-velocity jet along the centreline and a vortex shedding zone separating from the inner surface of the tube wall (recirculation flow). increasing 120002-1 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. reynolds numbers lead to a transition pattern and ultimately, turbulence. several experimental studies have been reported. stettler and hussain [16] studied the transition to turbulence of a pulsatile flow in a rigid tube using one-point anemometry. however, one-point anemometry fails to account for different flow structures which may play an important role at the onset of turbulence. peacock [17] studied the transition to turbulence in a rigid tube using two-dimensional particle image velocimetry (piv). chua et al [18] implemented a three-dimensional piv technique to obtain the volumetric velocity field of a steady flow. the work of ahmed and giddens [19,20] studied the transition to turbulence in a constricted rigid tube with varying degrees of constriction using twocomponent laser doppler anemometry. for constriction degrees up to 50% of the lumen, the authors found no disturbances at reynolds numbers below 1000. this study was not focused on describing vortex dynamics; however, the authors reported that turbulence, when observed, was preceded by vortex shedding. several authors carried out numerical studies. long et al. 
[15] compared the flow patterns produced by an axisymmetric and a non-axisymmetric geometry, finding that flow instabilities through an axisymmetric geometry was more sensitive to changes in the degree of constriction. the work of isler et al. [21] in a constricted channel found the instabilities that break the symmetry of the flow. mittal et al. [22] and sherwin and blackburn [23] studied the transition to turbulence over a wide range of reynolds numbers. mittal et al. used a planar channel with a one-sided semicircular constriction. they found that downstream of the constriction the flow was composed of two shear layers, one originating at the downstream edge of the constriction and the other separating from the opposite wall. for reynolds numbers above 1000, the authors reported transition to turbulence due to vortex shedding. moreover, they found through spectral analysis that the characteristic shear layer frequency is associated with the frequency of vortex formation. sherwin and blackburn [23] studied the transition to turbulence using a three-dimensional axisymmetric geometry with a sinusoidal constriction. based on the results of linear stability analysis, the authors reported the occurrence of kelvinhelmholtz instability. this reaffirms that instabilities involving vortex shedding take place in the transition to turbulence. finally, several studies compared experimental and numerical results. ling et al. [24] compared numerical results with those obtained by hot-wire measurements. as mentioned above, onedimensional hot-wire measurements cannot be used to identify flow structures. griffith et al. [25], using a rigid tube with a slightly narrowed section resembling stenosis, compared numerical results with experimental data. using stability analysis by means of floquet exponents, they demonstrated that the experimental flow was less stable than that of the simulated model. for low reynolds numbers (50 − 700), the authors found that a ring of vortices formed immediately downstream of the stenosis and that its propagation velocity changed with the degree of stenosis. the work of usmani and muralidhar [26] compareed flow patterns in rigid and compliant asymmetric constricted tubes for a reynolds range between 300 and 800 and womersley between 6 and 8. the authors reported that the downstream flow was characterized by a high velocity jet and a vortex whose evolution was described qualitatively. the aforementioned studies highlight the significance of studying vortex dynamics, since vortex development precedes turbulence, and ultimately, contributes to the risk of a cardiovascular stroke. the aim of this work is to characterize vortex dynamics under pulsatile flow in an axisymmetric constricted rigid tube. the study was carried out both experimentally and numerically for different constriction degrees and with mean reynolds numbers varying from 385 to 2044 and womersley numbers from 17 to 50. the results show that the flow pattern in these systems consists primarily of a central jet and a vortex shedding layer adjacent to the wall (recirculation flow), consistent with the literature. by tracking the vortex trajectory it was possible to determine the displacement of vortices over their lifetime. vortex kinematics was described as a function of the system parameters in the form of a dimensionless scaling law for maximum vortex displacement. 120002-2 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. 
figure 1: : (a) schematic drawing of the experimental setup; (b) lengthwise view of a tube of diameter d = 2.6 ± 0.1 cm and an annular constriction of diameter d0; (c) experimental axial velocity at z = 0 and r = 0 for r̄e = 1106, with d0 = 1.6 cm . ii. materials and methods i. experimental setup the experimental setup (fig. 1 (a)) consisted of a circulating loop and a digital particle image velocimetry (dpiv) module capable of obtaining videos and processing velocity fields. the circulating loop consisted of a programmable pulsatile pump (pp), a reservoir (r), a flow development section (fds) and a tube containing an annular constriction (ct). the tube with the annular constriction (ct) consisted of a transparent acrylic rigid tube of length l = 51.0±0.1 cm and inner diameter d = 2.6±0.1 cm. the annular constriction consisted of a hollow cylinder 5 ± 1 mm in axial length, which was fitted inside the rigid tube. keeping the outer diameter of the annular constriction fixed at 2.6 cm, i.e., equal to the inner diameter of the rigid tube, its inner diameter was changed to achieve different degrees of constriction. tests were carried out using annular constrictions with inner diameters d0 = 1.6 ± 0.1 cm and d0 = 1.0 ± 0.1 cm, equivalent to degrees of constriction relative to d of 39% and 61%, respectively. a cross-sectional scheme of the constriction geometry is shown in fig. 1 (b), along with the coordinate system used throughout the study. the radial direction is represented by r and the axial direction by z, with r = 0 coinciding with the tube axis and z=0 with the downstream edge of the constriction. within this axis representation, the downstream region is defined by z >0 values and the upstream region is defined by z < 0 values. finally, this constricted tube (ct) was placed inside a chamber filled with water, so that the refraction index of the fluid inside the tube matched that of the outside fluid. the tube inlet was connected to the pulsatile pump via a flow development section (fds), and its outlet was connected to the reservoir. the flow development section was designed to ensure a fully developed flow at the tube inlet and consisted of two sections: first, a conical tube of 35 cm in length whose inner diameter increased from 1 cm to 2.6 cm to provide a smooth transition from the outlet of the pump. secondly, an acrylic rigid tube of inner diameter d and length of 48d connected to the constricted tube (ct). the reservoir (r) was used to set the minimum pressure inside the tube, which was set as equal to atmospheric pressure in the experiments. the system was filled with degassed water and seeded with neutrally buoyant polyamide particles (0.13 g/l concentration, 50 µm diameter, dantec). the dpiv technique was used to obtain the velocity field [27,28]. a 1 w nd:yag laser was used to illuminate a 2 mm-thick section of the tube. images were obtained at a frame rate of 180 hz over a period equivalent to 16 cycles, using a cmos camera (pixelink, pl-b776f). the velocity field was finally computed using openpiv open-source software with 32 × 32 pixel2 windows and an overlap of 8 pixels in both directions. 120002-3 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. the region of interest was defined as -0.5< r/d <0.5, 0< z/d <1.5. within this region no turbulence was observed, as confirmed by spectral analysis of the velocity fields. this observation is also consistent with previous observations [17,24,29]. 
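as an illustration of the window cross-correlation that underlies the dpiv processing described above, the sketch below estimates a displacement field from two consecutive frames using the 32 × 32 pixel interrogation windows, 8-pixel overlap and 180 hz frame rate quoted in the text. it is a minimal numpy/scipy example, not the authors' openpiv pipeline; frame_a, frame_b and the pixel-to-metre calibration are hypothetical inputs supplied by the reader.

```python
import numpy as np
from scipy.signal import fftconvolve

def piv_displacements(frame_a, frame_b, window=32, overlap=8, dt=1.0 / 180.0):
    # cross-correlate interrogation windows of two consecutive frames and
    # return the per-window displacement rates in pixels per second
    step = window - overlap
    rows = range(0, frame_a.shape[0] - window + 1, step)
    cols = range(0, frame_a.shape[1] - window + 1, step)
    u = np.zeros((len(rows), len(cols)))  # horizontal (axial) component
    v = np.zeros_like(u)                  # vertical (radial) component
    for i, r0 in enumerate(rows):
        for j, c0 in enumerate(cols):
            a = frame_a[r0:r0 + window, c0:c0 + window].astype(float)
            b = frame_b[r0:r0 + window, c0:c0 + window].astype(float)
            a -= a.mean()
            b -= b.mean()
            corr = fftconvolve(b, a[::-1, ::-1], mode="full")
            pr, pc = np.unravel_index(np.argmax(corr), corr.shape)
            # displacement of the correlation peak from the zero-shift position
            v[i, j] = (pr - (window - 1)) / dt
            u[i, j] = (pc - (window - 1)) / dt
    return u, v  # multiply by a pixel-to-metre factor to obtain velocities
```

in practice a subpixel peak fit, outlier validation and the ensemble average over the 16 cycles would be added on top of this bare-bones step.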
due to the pulsatility and the absence of turbulence, it was possible to consider each cycle as an independent experiment, using the ensemble average over all 16 cycles to obtain the final velocity fields. for each degree of constriction different experiments were carried out, varying the flow velocity at the tube inlet. the pulsation period and the shape of the velocity as a function of t, at z = 0 and r = 0, shown in fig. 1 (c), remained unchanged throughout the experiments. pulsation period and peak velocity can be related to the womersley and reynolds numbers, respectively. the womersley number, α, is defined as α = (d/2)√(2πf/ν), where f is the pulsatile frequency and ν the kinematic viscosity of water (ν = 1.0 × 10−6 m2/s). the experiments were carried out for three different pulsatile periods, t: 0.96 s, 2.39 s and 3.58 s, corresponding to α values of 33.26, 21.10 and 17.22, respectively. the peak reynolds number upstream of the constriction is defined as reu = dvu/ν, where vu is the peak velocity measured at the centreline. four different values of reu were used as inlet condition upstream of the constriction (see table 1). similarly, the mean reynolds number at the downstream edge of the constriction is defined as r̄e = d0v̄/ν, where v̄ is the mean velocity over a whole period, measured at (z,r) = (0, 0). finally, experiments are labeled through their r̄e values, since each represents a unique combination of reu, d0 and α, as summarized in table 1.

α      r̄e    reu   constriction  d0 (cm)
33.26  654   820   39%           1.6
33.26  1002  820   61%           1.0
33.26  1106  1187  39%           1.6
33.26  1767  1187  61%           1.0
33.26  2044  1625  61%           1.0
21.10  385   505   39%           1.6
21.10  963   820   39%           1.6
17.22  585   820   39%           1.6

table 1: experimental degrees of constriction and r̄e numbers.

the pulsatile pump (pp in fig. 1 (a)) generated different flow conditions at the tube inlet [30]. the pumping module consisted of a step-by-step motor driving a piston inside a cylindrical chamber. the control module contained the power supply for the entire system and an electronic board, programmable from a remote pc by means of custom software. the piston motion was programmed using the pump settings for pulsatile period, ejected volume, pressure and cycle shape. the piston motion of the programmable pump is shown in fig. 2 in normalized units of displacement and time. figure 2: normalized displacement of piston yp for reu=505 (green diamonds), reu=820 (blue circles), reu=1187 (red squares), reu=1625 (black plus signs). figure 2 shows that the piston motion, and hence the pulsatile waveform, are comparable among all reu values, meaning that the inlet conditions are equivalent for all the experiments. finally, the condition of fully-developed flow in the upstream region was verified. the entrance length is usually estimated as l ≈ 0.05 re d. according to the work of ku [31], the entrance length of a pulsatile flow with α > 10 corresponds to the upstream mean reynolds number over a whole cycle, r̄eu, giving l ≈ 0.05 r̄eu d. in our setup, the flow development section measures 48d, r̄eu < 1000 and α > 10 for all experiments, which makes it reasonable to assume a fully-developed flow. nevertheless, velocity profiles were studied for reu values of 820, 1187 and 1625 in order to ensure the developed flow condition. figure 3: upstream velocity profiles for (a) reu=820, (b) reu=1187 and (c) reu=1625. profiles are located at z1=-1.25d (blue circles), z2=-0.75d (red triangles) and z3=-0.25d (black plus signs). profiles are taken from a
1.5d length window upstream of the constriction, at fixed locations z1 = −1.25d, z2 = −0.75d and z3 = −0.25d. velocity values are normalized by the maximum velocity attained in a pulsatile cycle. figure 3 shows such profiles at t = 0.25t , t = 0.5t , t = 0.75t and t = t . profiles show that for each reu, independently of the time, profiles are equal for z1=-1.25d, z2=-0.75d and z3=-0.25d, which demonstrates that the flow is developed. ii. numerical simulation three-dimensional numerical simulation using comsol� software enabled comparison between the experimental data and the study of parameter configurations not achieved experimentally. this entailed solving time-dependent navier-stokes equations under the incompressibility condition. the inlet condition was defined as normal inflow and taken from reu=820 and reu=1187 experimental data in the upstream region, where the flow is developed. in order to do this, a fourth-order fourier decomposition of the experimental flow rate was carried out. the fourier expansion yields vz(t) = a0 2 + ∑4 n=1 (ancos (2πnft) + bnsin (2πnft)) with coefficients (×10−5, in cm/s) a0 =196.32, a1 =61.94, a2 =8.22, a3 =13.79, a4 =3.87, b1 =-27.66, b2 =-3.81, b3 =-5.13, b4 =-2.84 for reu=1187. higher-order terms were not considered, as they do not contribute significantly to the shape of velocity profiles. in order to improve numerical stability, the velocity profile calculated by decomposition was multiplied by a ramp function of 120002-5 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. time increasing steadily from 0 to 1 in 0.6 seconds. figure 4(a) shows the axial velocity at the centreline at the downstream edge of the constriction, and presents good agreement with its experimental analogue shown in fig. 1 (c). outlet conditions were set to p = 0, p being the difference between the outlet pressure and atmospheric pressure, disabling the normal flow and backflow suppression settings. no-slip conditions were imposed on the inner surface of the tube wall or the constriction. a scheme of the geometry of the constricted tube used in the numerical analysis is shown in fig. 4(b). the simulations were initialized with the fluid at rest and were run for 20 periods. the first 4 periods were discarded in order to exclude transient effects. the step size for the data output was selected to be the same as for the experiments. the mesh elements were tetrahedrons except those close to the no-slip boundaries, which were prismatic elements. this was done automatically by comsol, based on the boundary requirements. a comsol predefined physics-controlled mesh was built, enabling settings that allowed the element size to adapt to the physics at specific regions. a cross-sectional view of the tube and the mesh elements is shown in fig. 4 (c). figure 6 compares numerical and experimental velocity profiles in order to validate numerical results. profiles are located upstream (z = −0.3d) and downstream (z = 0.75d) for the two inlet conditions reu=820 and reu=1187. the maximum difference between numerical and experimental profiles was less than 9%. a mesh convergence analysis was carried out for three different meshes consisting of 180958, 351569 and 951899 elements at the highest reynolds (reu=1187) attained in simulation. in fig.5 the axial velocity at r = 0, z = 0 vs. time (fig.5(a)) and the velocity profiles at z = 0.5d (fig.5(b)) are compared for the three different meshes, showing the results to be independent of the mesh chosen. 
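as a concrete illustration of the inlet condition described above, the sketch below evaluates the fourth-order fourier reconstruction of the reu = 1187 inlet velocity with the coefficients listed in the text, multiplied by the start-up ramp used to improve numerical stability. the pulsatile period (here taken as the experimental t = 0.96 s), the linear form of the ramp and the function names are assumptions made for the example.

```python
import numpy as np

# fourier coefficients for the reu = 1187 inlet waveform quoted in the text
# (the paper lists them as x 1e-5, in cm/s)
A = [196.32, 61.94, 8.22, 13.79, 3.87]    # a0, a1 ... a4
B = [0.0, -27.66, -3.81, -5.13, -2.84]    # b1 ... b4 (no b0 term)

def inlet_velocity(t, f, ramp_time=0.6):
    # vz(t) = a0/2 + sum_n [ an cos(2 pi n f t) + bn sin(2 pi n f t) ],
    # multiplied by a ramp growing from 0 to 1 in 0.6 s (assumed linear here)
    t = np.asarray(t, dtype=float)
    v = A[0] / 2.0
    for n in range(1, 5):
        v = v + A[n] * np.cos(2 * np.pi * n * f * t) \
              + B[n] * np.sin(2 * np.pi * n * f * t)
    ramp = np.clip(t / ramp_time, 0.0, 1.0)
    return 1.0e-5 * v * ramp

# sample a few periods, assuming the t = 0.96 s experimental period
T = 0.96
t = np.linspace(0.0, 5 * T, 1000)
vz = inlet_velocity(t, f=1.0 / T)
```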
the relative deviation at the centreline, r = 0, is less than 4.5%, although a slight deviation at t = t near the wall is observed. however, this does not affect vortex formation and propagation, which are equally predicted by the three meshes. finally, a mesh of 351569 elements was chosen for this work in a trade-off between spatial resolution and computational time. iii. results and discussion i. flow pattern the velocity at the centreline, r = 0, at the downstream edge of the constriction, z = 0, as a function of time (see fig. 1(c)) was used to establish the time reference for all the experiments. the decelerating phase (diastolic phase) was defined as 0 < t < 0.5t and the accelerating phase (systolic phase) as 0.5t < t < t. figures 7 and 8 show the temporal flow patterns obtained from the experimental data corresponding to r̄e=654, 1002, 1106, 1767 and 2044, while fig. 9 shows the patterns obtained via simulation. figures 7(a), 8(a) and 9(a) show the flow evolution at times 0.25 t , 0.5 t , 0.6 t and 0.9 t. some features were common to all the experiments and numerical results. during the accelerating phase two main flow structures can be distinguished: a central, high-velocity central jet and a recirculation zone between the former and the inner surface of the tube wall, with vortices developing in the latter. in figs. 7, 8 and 9, the central jet is shown in blue and the recirculation zone in red. the central jet is distinguished from the recirculation zone by having vorticity values below the threshold of 30% of the maximum of the absolute value of vorticity in each timestep. the aim of this criterion is to separate regions approximately rather than give their precise location. within the recirculation zone, vortices separated from the wall and travelled along the tube. other features are dependent on the reynolds number. for instance, for r̄e = 654 (fig. 7 (b)) the vortex had already developed at the first half of the decelerating phase and propagated until the beginning of the accelerating phase, as suggested by the dashed lines. at this stage, the thickness of the recirculation zone is approximately equal to h. during the accelerating phase, the thickness of the recirculation zone began to decrease as that of the central jet increased. toward the end of this phase, the central jet was thicker and the previous vortex had almost entirely dissipated, while a new vortex had begun to develop in the vicinity of the constriction (z/d < 0.2). for r̄e = 1106(fig. 7 (c)), the flow presented the same characteristics as for r̄e = 654. here, the vortex travelled faster during the entire cycle 120002-6 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. figure 4: (a) axial inlet velocity at r = 0; corresponds to reu = 1187; (b) axial velocity at r = 0 and z = 0; corresponds to r̄e = 1106; (c) schematic drawing of the flow development section and the axisymmetric constricted tube; (d) cross sectional view of the mesh. and the recirculation zone was 27% thicker. this is ascribed to the increment in the peak velocity of the central jet and the consequent increase in shear stress, leading to enlargement of the recirculation zone and an increase in radial velocity. at r̄e = 654 and r̄e = 1106 two vortices were observed in the region of interest over one pulsatile period (figs. 7(b) and (c)). in the case of r̄e = 1106 (fig. 8 (b)), only one vortex was observed over one period. 
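the 30% vorticity criterion used above to separate the central jet from the recirculation zone can be written compactly. the sketch below is a schematic reading of that criterion, assuming the velocity components are available on an (r, z) grid and using the standard azimuthal vorticity on that plane; the exact component and sign convention are not restated in the text, so they are assumptions here, as are the function and variable names.

```python
import numpy as np

def split_jet_and_recirculation(u_z, u_r, r, z, frac=0.30):
    # u_z, u_r: 2-d velocity components indexed as [i_r, i_z];
    # r, z: 1-d coordinate arrays along the two axes
    dur_dz = np.gradient(u_r, z, axis=1)
    duz_dr = np.gradient(u_z, r, axis=0)
    w = dur_dz - duz_dr                      # azimuthal vorticity (assumed convention)
    threshold = frac * np.max(np.abs(w))     # 30% of the instantaneous maximum
    recirculation = np.abs(w) >= threshold   # vortex-shedding layer near the wall
    jet = ~recirculation                     # low-vorticity central jet
    return w, jet, recirculation
```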
this difference arose because the vortex propagates at lower velocities for lower reynolds values, then it is expected that two vortices will be observed in the region of interest, which were shed in consecutive pulsatile periods. for r̄e = 1106 at the beginning of the accelerating phase, the central jet decreased in thickness and the recirculation zone became enlarged to occupy almost the entire tube section at z/d ≥ 1. this can be attributed to a marked deceleration of the central jet at the start of the accelerating phase. due to mass conservation, the radial velocity is expected to increase, leading to enlargement of the recirculation zone. this is also consistent with that reported previously by sherwin [23]. at r̄e = 1767 (fig. 8 (c)), the vortex developed and propagated faster than in the previous cases, and a secondary vortex was observed. in the decelerating phase, the vortex reached its maximum size and almost escaped from the region of interest. when the velocity of the central jet at the constriction was near its minimum, the vortex left the region of interest and the rest of the flow became disordered. in the accelerating phase, the vortex developed earlier than at lower reynolds numbers. comparison of figs. 7 and 8 illustrates how the flow pattern changed with the degree of constriction. while in fig. 7 the vortex travelled without a substantial change in size, fig. 8 (b) shows a sharp change in vortex size, measured radially, and fig. 8 (c) shows that the vortex had almost entirely dissipated at t = 0.5t. moreover, due to the reduction in d0 (fig. 8) the velocity of the central jet was higher, and the recirculation zone became enlarged with increasing z, which explains the increase in vortex size. this is consistent with that reported by sherwin et al. [23] and usmani [26]. enlargement of the recirculation zone was present throughout the experiments. this could be explained in terms of circulation, as pointed out in the work of gharib et al. [32] which studies the flow led by a moving piston into an unbounded domain. this work determined that a vortex forms up to a limited amount of circulation before shedding from the layer where it was created. the main difference from our work consists in the confinement that the walls impose on the vortex. the result is that a vortex which sheds with a certain amount of circulation enlarges in the axial direction. this mechanism can also explain the generation of a secondary vortex. specifically, the vortex sheds after 120002-7 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. figure 5: behavior comparison for three meshes: 180958 elements (mesh 1), 351569 elements (mesh 2) and 951899 elements (mesh 3). (a) axial velocity at r = 0, z = 0 vs time and (b) velocity profiles in downstream location z = 0.5d at 0.25t, 0.5t, 0.75t and t. a precise amount of circulation is reached. then any excess of circulation generated goes to a trail of vorticity behind the vortex, eventually identified as a secondary vortex. finally, experimental and numerical results were compared in order to validate the simulation. figure 9 shows the evolution of the simulated flow at r̄e = 654 and r̄e = 1106. comparison of fig. 9 with fig. 7 confirms similar behaviour for the experimental and simulated flows. the main structures, i.e. the central high-velocity jet and recirculation zone, were satisfactorily reproduced by numerical simulation. ii. 
vortex propagation vortex propagation was studied by measuring vortex displacement as a function of time along one pulsatile period. to this end, the study region was constrained to 0.3 < r/d < 0.5 in order to isolate the vortex fraction forming in the superior wall of the tube. in this region vorticity values of the vortex are positive. for each time frame, the vorticity field was used to extract the vortex position. then, in each frame, all vorticity values below a threshold of 30% of the maximum vorticity value were disregarded in order to obtain the filtered vorticity field ζ̄(r,z). the vortex position was then calculated as (r̄, z̄) = ∑ (r,z)ζ̄(r,z). the vortex propagation was along z and is specified in figs. 7, 8 and 9 by a black dashed line. this aforementioned method enables us to track the vortex and finally to measure the vortex maximum displacement, which will be discussed in the next subsections. iii. numerical results for varying womersley number maximum vortex displacement was defined as z∗/d. position z∗ is where vorticity becomes lower than 30% of the maximum vorticity and is measured from z0, the position where the vortex formed. position z0 was defined as the location of the vortex centre before it sheds. the dependence of the vortex displacement over its lifetime on α, or f, was studied numerically for fixed values of r̄e = 654 and r̄e = 1106, and pulsatile frequency values of 0.5 f, 0.75 f, f, 1.5 f and 2 f, where f is the pulsatile frequency tested experimentally. figure 10 clearly illustrates the dependence of the maximum vortex displacement on α. for a fixed value of r̄e , as α increased, i.e., f increased, the flow behaviour tended to replicate within a smaller region of the tube over a shorter time span. with increasing α, vortices tended to have a shorter lifespan and their maximum displacement was therefore smaller. in other words, z∗/d decreased with increasing pulsatile frequency. iv. scaling law a full description of vortex displacement has been given in previous sections. specifically, the maximum displacement of the vortex was studied; that is, the distance it travels before it vanishes. from 120002-8 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. figure 6: experimental (blue circles) and numerical (red solid line) velocity profiles at upstream location z = −0.3d for (a) reu=820, (b) reu=1187 and at downstream location z = 0.75d for (c) reu=820, (d) reu=1187 our results we propose a scaling law which summarizes the behavior of z∗/d as a function of the parameters involved, r̄e and α. the physical dimensions for the relevant variables are ν, d, f and v̄. based on the vaschybuckingham theorem, it is possible to describe z∗/d as a function of two independent dimensionless numbers, r̄e and α, which involve the relevant variables mentioned. maximum displacement z∗/d was found to be proportional to flow velocity v̄ (i.e. r̄e ), and inversely proportional to pulsation frequency f (i.e. α2), suggesting that z∗/d = k r̄e α2 (1) where k is the proportionality constant. figure 11 shows z∗/d as a function of r̄e/α2 , showing that both the experimental data and the numerical results are a good fit to eq. 1. the fitted slope was k = 1.15 ± 0.19 at a confidence level of 95%. this result highlights the dependence of z∗/d on r̄e and 1/α2, as could be inferred from the data of fig. 10. the work of gharib et al. [32] shows that the developing vortex reaches a threshold of circulation before it sheds. 
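to make the tracking procedure and the scaling law of eq. (1) concrete, the sketch below computes a vorticity-weighted vortex centre, discarding values below 30% of the frame maximum as described above, and evaluates z*/d = k r̄e/α² with the fitted slope k = 1.15. the normalisation of the centroid by the total filtered vorticity is an assumption about the implied definition, and the function names are illustrative.

```python
import numpy as np

def vortex_centre(w, r, z, frac=0.30):
    # keep only vorticity above 30% of the frame maximum (the band
    # 0.3 < r/d < 0.5 is assumed to have been selected beforehand),
    # then take the vorticity-weighted centre of the remaining field
    zeta = np.where(w >= frac * w.max(), w, 0.0)
    R, Z = np.meshgrid(r, z, indexing="ij")
    total = zeta.sum()
    return (R * zeta).sum() / total, (Z * zeta).sum() / total

def max_displacement(re_mean, alpha, k=1.15):
    # scaling law of eq. (1): z*/d = k * re / alpha^2, fitted slope k = 1.15
    return k * re_mean / alpha ** 2

# example: the re = 654, alpha = 33.26 experiment of table 1
print(max_displacement(654, 33.26))   # about 0.68 tube diameters
```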
if an excess of circulation is generated, this could lead to the formation of other structures such as secondary vortices. this means that pulsatile frequency f coincides with the shedding frequency of the primary vortex, the one that was tracked. hence, the parameter r̄e/α2 can be related to the strouhal number through r̄e/α2 = 2 π d0 d 1 sr . a discussion of the results in terms of the strouhal number clarifies the relation between the oscillatory component of the flow and the stationary component of the flow. for lower values of r̄e/α2, that is, sr close to 1, stationary and oscillatory components are comparable, which explains a vortex with low to null displacement. the extreme case occurs when the flow oscillates but no vortices are shed. for instance, this could be attained for reynolds numbers below 120002-9 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. figure 7: (a) axial velocity profile at z = 0 and r = 0 and time at which the velocity field was obtained (red circles); (b) streamlines derived from experimental data for r̄e = 654, with d0 = 1.6 cm; (c) streamlines derived from experimental data for r̄e = 1106, with d0 = 1.6 cm. in both cases, red streamlines represent the recirculation zone and blue streamlines the central jet (colours available online only). those used in this work. as r̄e/α2 increases, that is for sr . 10−1, vortices are shed and travel further before vanishing as a consequence of the stationary component of the flow being greater than the oscillatory component. parameter r̄e/α2 also states the kinematic nature of vortex displacement, and by rewriting r̄e α2 = 2d0v̄ πd2f we conclude that z∗/d depends on kinematic variables and not on viscosity. finally, this shows that eq. 1 can be considered as a scaling law that describes the vortex displacement over its lifetime for any combination of the relevant parameters re and α within the range studied and the constriction shape used. specifically for the dependence on constriction shape, preliminary results with a guassian-shaped constriction were obtained numerically, giving a difference below 20% for the maximun vortex displacement for reu=1187. vortex dynamics depend on inlet condition shape [23,33]. however, studies are not conclusive regarding the dependence of the vortex maximum displacement on the shape of the inlet condition. further research is being carried out on this point. 120002-10 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. figure 8: (a) axial velocity profile at z = 0 and r = 0 and time at which the velocity field was obtained (red circles); (b) streamlines derived from experimental data for r̄e = 1002, with d0 = 1.0 cm; (c) streamlines derived from experimental data for r̄e = 1767, with d0 = 1.0 cm. in both cases, red streamlines represent the recirculation zone and blue streamlines the central jet (colours available online only). iv. conclusions in this work, a pulsatile flow in an axisymmetric constricted geometry was studied experimentally and results were compared with those obtained numerically for the same values of r̄e and α. after validation of the numerical results, simulations were run over a range of α values that could not be tested experimentally, in order to identify trends in flow behaviour with varying α. the flow structure was found to consist of a central jet around the centreline and a recirculation zone adjacent to the wall, with vortices shedding in the latter. 
the analysis addressed how the vortex trajectory and size changed with the system parameters. specifically, for a fixed value of α, vortex size grew with decreasing d0, and vortex displacement was larger for increasing r̄e values. the dependence on α was such that as α increased, the behavior of the flow was reproduced in a shorter extension of the tube. the vortex trajectory was tracked and its displacement over its lifetime deter120002-11 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. figure 9: (a) axial velocity profile at z = 0 and r = 0 and time at which the velocity field was obtained (red circles); (b) streamlines derived from simulation results for r̄e = 654; (c) streamlines derived from simulation results for r̄e = 1106. in both cases, red streamlines represent the central jet layer and blue streamlines represent the central jet (colours available online only). mined. results showed that vortex displacement over its lifetime decreased with increasing α. the analysis led to a scaling law establishing linear dependence of the vortex displacement on a dimensionless parameter combining r̄e and α, namely r̄e/α2 . moreover, this parameter was found to be proportional to the inverse of the strouhal number. this directly relates the vortex behavior to the ratio between the pulsatile component and the stationary component of the flow. as seen from the medical perspective on the issue of stenosed arteries, these results provide insight into vortex shedding and displacement in a simplified model of stenosed arteries. this becomes crucial since vortex shedding precedes turbulence, and turbulence is related to plaque complications which may finally lead to cardiovascular stroke. the authors encourage further research into the behaviour of vortices over a wider range of parameters, including different constriction shapes and sizes. 120002-12 papers in physics, vol. 12, art. 120002 (2020) / nicasio barrere et al. figure 10: dependence of the dimensionless maximum vortex displacement on α. experimental results in blue squares and numerical results in red circles (r̄e = 654) and red triangles (r̄e = 1106). figure 11: dimensionless maximum vortex displacement, z∗/d, as a function of the dimensionless parameter r̄e/α2 for experimental data (blue squares) and numerical data (red circles). acknowledgements this research was supported by csic i+d uruguay, through the i+d project 2016 “estudio dinámico de un flujo pulsátil y sus implicaciones hemodinámicas vasculares”, anii (doctoral scholarship posnac-2015-1-109843), ecossud (project reference number u14s04) and pedeciba, uruguay. the manuscript was edited by eduardo speranza. references [1] m naghavi et al., from vulnerable plaque to vulnerable patient: a call for new definitions and risk assessment strategies: part i, circulation 108, 1664 (2003). [2] j e muller, g h tofler, p h stone, circadian variation and triggers of onset of acute cardiovascular disease, circulation 79, 733 (1989). [3] h m loree, r d kamm, c m atkinson, r t lee, turbulent pressure fluctuations on surface of model vascular stenosis, am. j. of physiol. 261, 644 (1991). [4] s d gertz, w c roberts, hemodynamic shear force in rupture of coronary arterial athersoclerosis plaques, am. j. of cardiol. 66, 1368 (1990). [5] a c barger, r beeuwkes 3rd, l l lainey, k j silverman, hypothesis: vasa vasorum and neovascularization of human coronary arteries. a possible role in the pathophysiology of atheriosclerosis, n. engl. j. med 19. 175 (1984). 
[6] e falk, p k shah, v fuster, coronary plaque disruption, circulation 92, 657 (1995). [7] k imoto et al., longitudinal structural determinants of atherosclerotic plaque vulnerability, j. am. coll. cardiol. 46, 1507 (2005). [8] c g caro, atheroma and arterial wall shear observations, correlation and proposal of a shear dependent mass transfer mechanism for atherogenesis, proc. r. soc. lond. b 177, 109 (1971). [9] d n ku, d p giddens, c k zarins, s glagov, pulsatile flow and atherosclerosis in the human carotid bifurcation, arteriosclerosis 5, 293 (1985). [10] r m nerem, m j levesque, fluid mechanics in atherosclerosis, in: handbook of bioengineering, eds. r skalak, s chien, chap. 21, pag. 21.1, mcgraw-hill, new york (1987). [11] s s gopalakrishnan, b pier, a biesheuvel, dynamics of pulsatile flow through model abdominal aortic aneurysms, j. fluid mech. 758, 150 (2014). [12] a arzani, p dyverfeldt, t ebbers, s c shadden, in vivo validation of numerical prediction for turbulence intensity in aortic coarctation, ann. biomed. eng. 40, 860 (2012). [13] m d ford et al., virtual angiography for visualization and validation of computational models of aneurysm hemodynamics, ieee t. med. imaging 24, 1586 (2005). [14] l boussel et al., phase contrast magnetic resonance imaging measurements in intracranial aneurysms in vivo of flow patterns, velocity fields, and wall shear stress: comparison with computational fluid dynamics, magn. reson. med. 61, 409 (2009). [15] q long, x y xu, k v rammarine, p hoskins, numerical investigation of physiologically realistic pulsatile flow through arterial stenosis, j. biomech. 34, 1229 (2001). [16] j c stettler, a k m fazle hussain, on transition of pulsatile pipe flow, j. fluid mech. 170, 169 (1986). [17] j peacock, t jones, r lutz, the onset of turbulence in physiological pulsatile flow in a straight tube, exp. fluids 24, 1 (1998). [18] c s chua et al., particle image of non-axisymmetric stenosis model, 8th international symposium on particle image velocimetry (2009). [19] s a ahmed, d p giddens, flow disturbance measurements through a constricted tube at moderate reynolds numbers, j. biomech. 16, 955 (1983). [20] s a ahmed, d p giddens, pulsatile poststenotic flow studies with laser doppler anemometry, j. biomech. 17, 695 (1984). [21] j a isler, r s gioria, b s carmo, bifurcations and convective instabilities of steady flows in a constricted channel, j. fluid mech. 849, 777 (2018). [22] r mittal, s p simmons, f najjar, numerical study of pulsatile flow in a constricted channel, j. fluid mech. 485, 337 (2003). [23] s j sherwin, h m blackburn, three dimensional instabilities and transition of steady and pulsatile axisymmetric stenotic flows, j. fluid mech.
533, 297 (2005). [24] s c ling, h b atabek, a nonlinear analysis of pulsatile flow in arteries, j. fluid mech. 55, 493 (1972). [25] m d griffith, t leweke, m c thompson, k hourigan, pulsatile flow in stenotic geometries: flow behaviour and stability, j. fluid mech. 622, 291 (2009). [26] a y usmani, k muralidhar, pulsatile flow in a compliant stenosed asymmetric model, exp. fluids 57, 186 (2016). [27] j westerweel, fundamentals of digital particle image velocimetry, meas. sci. technol. 8, 1379 (1997). [28] m raffel, c willert, s wereley, j kompenhans, particle image velocimetry: a practical guide, second edition, springer (2007). [29] r a cassanova, d p giddens, disorder distal to modeled stenoses in steady and pulsatile flow, j. biomech. 11, 441 (1977). [30] g balay, elasticidad en tejidos arteriales, diseño de un corazón artificial in vitro y nuevo método ultrasónico de determinación de elasticidad arterial, ma thesis, universidad de la república, uruguay (2012). [31] d n ku, blood flow in arteries, ann. rev. fluid mech. 29, 399 (1997). [32] m gharib, e rambod, k shariff, a universal time scale for vortex ring formation, j. fluid mech. 360, 121 (1998). [33] h liu, t yamaguchi, waveform dependence of pulsatile flow in a stenosed channel, j. biomech. eng. 123, 88 (2000).

papers in physics, vol. 14, art. 140006 (2022) received: 11 december 2021, accepted: 7 march 2022 edited by: l. rezzolla licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.140006 www.papersinphysics.org issn 1852-4249 a novel singularity-free black hole with nonlinear magnetic monopole: hawking radiation and quantum correction yu-ching chou1–4∗, weihan huang5 this paper introduces a nonlinear, magnetically charged, singularity-free black hole model.
the ricci scalar, kretschmann scalar, horizon, energy conditions, and hawking radiation corresponding to the singularity-free metric are presented, and the asymptotic behavior and quantum correction of the model are examined. the model was constructed by coupling a mass function with the regular black hole solution under nonlinear electrodynamics in general relativity. aside from resolving the problem of singularities in einstein’s theory of general relativity, the model asymptotically meets the quantum correction under an effective field theory. this obviates the need for additional correction terms; in this regard, the model outperforms the black hole models developed by bardeen and hayward. regarding the nonlinear magnetic monopole source of the gravitational field of the black hole, the energy–momentum tensors fulfill weak energy conditions. the model constitutes a novel, spherically symmetric solution to regular black holes. i introduction newton’s law of gravitation states that the collapse of a nonrotating, perfectly spherical dust shell leads to a singularity at the center because all of the matter simultaneously reaches r = 0. subsequently, singularities would not occur if the initial configuration of the matter were slightly distorted [1]. therefore, huang (2020) [2] proposed a modification to newton’s gravitation which obeys the inverse square law and does not have singularities at r = 0. according to general relativity, the presence of trapped ∗unclejoe0306@gmail.com 1 health 101 clinic, taipei city, 10078 taiwan. 2 taipei medical association, taipei city, 10641 taiwan. 3 taiwan medical association, taipei city, 10688 taiwan. 4 taiwan primary care association, taipei city, 10849 taiwan. 5 national tsing hua university, physics department, hsinchu city, 30013 taiwan. surfaces is the key to the formation of singularities from gravitational collapse. they are surfaces on which the radial coordinates of particles following a timelike or a null curve can gradually only go to reducing values. these trapped surfaces are subject to an extreme gravitational field, where light emitted from the surface is dragged backward, and describe the inner region of an event horizon. this theoretical singularity exists in static black holes [3, 4]. it follows the singularity theorem proposed by penrose and hawking [5–7]. however, it is possible to speculate the existence of singularity-free (regular) black holes. when curvature increases (i.e., when the planckian value is reached), general relativity should be modified to resolve singularity. accordingly, bardeen (1968) [8] proposed the first static spherically symmetric regular black hole solution. this was followed by dymnikova (1992) [9]; mazur and mottola (2001) [10]; nicolini (2005) [11]; hayward (2006) [12]; hossen felder, modesto, and pemont-schwarz (2010) [13]. these above-mentioned regular black hole solutions 140006-1 papers in physics, vol. 14, art. 140006 (2022) / y. c. chou et al. all satisfy the weak energy conditions. among them, the hayward metric is the simplest model. in addition, the regular black hole model established by e ayón-beato and a garćıa (1998) [14–16] can be interpreted as a nonlinear electric or magnetic gravitational field monopole. in the past two decades, some interesting solutions to einstein’s equations of general relativity have been constructed under the framework of nonlinear electrodynamics (ned) [17–22]. 
a recent study showed that in the bardeen model, parameter g is a magnetic monopole gravitational field described by ned [23]. however, the electromagnetic tensors used in bardeen’s solution are stronger than those used in maxwell electrodynamics when the limit of weak magnetic fields is calculated [1]. to address this issue, kruglov (2017) [24] derived a magnetic black hole solution from the exponential nonlinear framework of electrodynamics (ene). literature asserted that metrics that can successfully simulate quantum effects must meet the “one-loop quantum correction” of newtonian potential, obtained through effective field theory [25–28]. the bardeen model meets the quantum correction asymptotically [27]; however, the hayward model does not, requiring additional correction terms to be introduced [29]. this study aims to propose a novel singularityfree black hole model, unlike bardeen, hayward, and kruglov’s ene models. the proposed model, based on the recent modification of newtonian gravity by huang (2020) [2], is an extended version in the framework of general relativity. it resolves the singularity problem in einstein’s theory, satisfies weak energy conditions, and returns the electromagnetic field tensors of the lagrange function for the nonlinear magnetic monopole gravitational field to maxwell electrodynamics when calculating the weak magnetic field limit. simultaneously, the metric meets the quantum correction asymptotically, without additional correction terms. this paper is organized as follows: first, we discuss the energy–momentum tensors of nonlinear electrodynamics, going on to present a nonlinear, magnetically charged, and singularity-free black hole model developed by introducing mass functions to the metric. second, we present the ricci scalar, the kretschmann’s scalar, the horizon, the energy conditions, and the hawking radiation for the metric. third, we discuss the asymptotic behavior and quantum correction. we set the following parameter: c = g = 1. the first and second partial differential of f(r) on r are noted as f′ and f′′. ii energy–momentum tensor of nonlinear electrodynamics in general relativity in this section, we propose the energy–momentum tensor of nonlinear electrodynamics under the general relativity framework, a method used first by e ayón-beato and a garćıa [23]. consider the following action that represents nonlinear electrodynamics in curved spacetime: s = 1 16π ∫ d4x √ −g (r−l(f)) , (1) where f = fabfab is the square of the electromagnetic field strength tensor, l is a lagrangian density function associated with f, and lf = ∂l∂f . the electromagnetic tensor fab is defined based on the vector potential aa: fab = ∇aab −∇baa. (2) einstein’s equations are derived using eq. (1): gab = tab, (3) tab = 2 ( lff2ab − 1 4 gabl ) . (4) the nonlinear maxwell equations can be expressed as follows: ∇a ( lffab ) = 0. (5) when l(f) = f, eq. (5) regresses to the standard maxwell equations. we start with a following generalized spherical symmetry metric: ds2 = −f(r)dt2 + f(r)−1dr2 + r2dω2, dω2 = dθ2 + sin2 θdφ2, a = a(r)dt + qm cos θdφ, (6) where qm represents the overall magnetic charge and can be defined as follows: 140006-2 papers in physics, vol. 14, art. 140006 (2022) / y. c. chou et al. qm = 1 4π ∫ f. (7) we found it extremely difficult to construct a solution to analyze black holes with hadronic charges. however, we found it significantly easier to construct a solution for single charges, i.e., a(r) = 0 or qm = 0. 
therefore, we explicitly explain the use of the abovementioned magnetic charge to construct an exact black hole solution. the primary motivation of this study is to create a singularity-free black hole using the proposed gravitational model. therefore, we initially focused on identifying the metrics that were consistently instructive at the origin of spacetime. we parameterized the metric function as follows: f(r) ≡ 1 − 2m(r) r . (8) the constant mass of schwarzschild’s black hole was replaced with a mass distribution function m(r). using the magnetically charged exact black hole solution as an example, the metric function can be parameterized using eq. (6) and by setting a(r) = 0. the results indicate that the nonlinear maxwell equations are self-satisfying. we find only two following independent equations for einstein’s equations: 0 = f′ r + f − 1 r2 + 1 2 l, (9) 0 =f′′ + 2f′ r + l− 4q4m r4 lf. (10) the lagrangian density l can be solved to be served as a function for r. l = −2 ( f′ r + f − 1 r2 ) . (11) eq. (11) was incorporated into eq. (10). we found that for any given metric function, the latter function is always self-satisfying. therefore, the most common method is to use eq. (6) to derive magnetically charged spherically symmetric static solutions. during the parameterization of eq. (8), the lagrangian density is simplified as follows: l = 4m′(r) r2 . (12) in addition, f can be expressed as follows: f = fabfab = 2q2m r4 . (13) therefore, one can freely select a mass function m(r) of interest and then analytically resolve the lagrangian density to use it as a function for f. this completes the calculations for a magnetically charged static solution. iii construction of a singularityfree black hole model coupled with nonlinear electrodynamics in this section, we construct a singularity-free newtonian theory of gravity under the framework of general relativity in curved spacetime using the results of the previous section. for this purpose, we incorporated a mass function m(r) to couple with the regular black hole solution with nonlinear electrodynamics in general relativity. the proposed model is as follows: m(r) ≡ m ( r r + h )µ , fch(r) ≡ 1 − 2m(r) r , (14) ds2 = −fch(r)dt2 + fch(r)−1dr2 + r2dω2, where a small constant h (unit: length) was inserted to prevent divergence of the equation when r → 0. fch(r) is referred to as the chou-huang function, and µ is a dimensionless parameter that should be solved to satisfy the einstein–maxwell equations, µ > 0, while m is a constant denoting gravitational mass. when µ = 1 and h → 0, the metric regresses to the schwarzschild’s metric with a mass of m. this prevents the scalar curvature from diverging when r → 0. the ricci scalar is expressed as follows: r = 2µ(µ + 1)mh2 r(3−µ)(r + h)µ+2 . (15) this equation highlights that r satisfies the condition of nondivergence for the scalar curvature when µ ≥ 3. to simplify calculations, we only discuss µ = 3 in this study. by inserting eq. (15), we obtain the following: 140006-3 papers in physics, vol. 14, art. 140006 (2022) / y. c. chou et al. r = 24mh2 (r + h)5 . (16) furthermore, we prove the metric couples to maxwell electrodynamics in the weak-field limit. einstein’s equations (9)-(10) and eq. (12) with lagrangian density are solved regarding mass function m(r), obtaining the mass function in the following form: m(r) = m −β h3 α [ 1 − ( r r + h )3] , (17) where α is a constant (unit: length squared), and satisfies α = h 4 2q2m . 
the term βh 3 α denotes the electromagnetic-induced mass mem, and β denotes a dimensionless constant. we assumed the gravitational mass m to be equal to mem, i.e., m = mem ≡ βh 3 α . subsequently, eq. (17) would return to the form of eq. (14). using the definition of f in eq. (13), we obtained the following: αf = ( h4 2q2m )( 2q2m r4 ) = h4 r4 . (18) we simplified the lagrangian density to the following: l = 4m′(r) r2 = 12β α ( h r + h )4 = 12βf( 1 + (αf)1/4 )4 . (19) thereafter, we used m and qm to express α. l(f) = 12βf( 1 + ( 8q6m m4 f)1/4 )4 , (20) where the lagrangian contains fractional powers of f and f = fabfab = 2q2m r4 ≥ 0. under the weak magnetic field limit, m � qm and l(f) → 12βf. therefore, when β = 1/12, l(f) regresses to maxwell electrodynamics. finally, we solved the chou-huang function and obtained the following: fch(r) = 1 − 2m r ( r r + h )3 . (21) the calculations were inserted into eq. (14) to obtain a singularity-free metric. the metric line elements are the following: ds2 = − ( 1 − 2mr2 (r + h)3 ) dt2 + ( 1 − 2mr2 (r + h)3 )−1 dr2 + r2dω2, (22) the ricci scalar and kretschmann’s scalar are expressed as follows: r = 24mh2 (r + h)5 , (23) k = 48m2(2h4 + 7h2r2 − 2hr3 + r4) (r + h)10 . (24) when h → 0, the metric is restored to schwarzschild’s metric with a constant mass of m. the asymptotic of the ricci and kretschmann scalars can be obtained as follows: r = 24m h3 − 120mr h4 + 360mr2 h5 + o(r3) r → 0, (25) r = 24mh2 r5 − 120mh3 r6 + o(r−7) r →∞, (26) k = 96m2 h6 − 960m2r h7 + 5616m2r2 h8 + o(r3) r → 0, (27) k = 48m2 r6 − 576m2h r7 + o(r−8) r →∞. (28) the ricci and kretschmann scalars vanish, and the spacetime becomes flat when r approaches infinity. eqs (25)–(28) indicate that the metric in eq. (22) is regular. we thus complete the extension of revising newtonian gravity under the general relativity framework. iv energy condition we note that eq. (20) satisfies weak energy conditions. let x be a timelike field without loss of generality. x can be selected as a normal field (i.e., xax a = −1). the local energy density along x can be expressed using the right side of eq. (4), as follows: tabx axb = 2 ( eγe γlf + 1 4 l ) . (29) 140006-4 papers in physics, vol. 14, art. 140006 (2022) / y. c. chou et al. figure 1: plot of the chou-huang function fch(r). variation of the fch(r) with different values of mc while maintaining h = 1, when r > 0; fch(r) = 0 is the future trapping region: m > m∗ contains two horizons, m = m∗ contains one horizon, and m < m∗ does not contain any horizons. by definition, eγ = fγδx δ is orthogonal to x. therefore, it is a spacelike vector (eγe γ > 0). using eq. (29), we could determine that if l ≥ 0 and lf ≥ 0, there would not be any negative local energy densities anywhere in the field. this is a weak energy condition. the quantities of nonnegativity were derived using eq. (20). therefore, the proposed model satisfies weak energy conditions. v horizon gtt = 0 was used to infer the horizon of the black hole. fch(r) = 1 − 2mr2 (r + h)3 = 0. (30) eq. (30) is a cubic equation. r3 + (3h− 2m)r2 + 3h2r + h3 = 0. (31) the coefficient of the term r3 is greater than zero; the cubic equation has three roots. this article only discusses the solutions when r > 0. according to its discriminant, −24m3h3 + 81m2h4, we derived the following: m > m∗ ≡ 27h8 . fch(r) = 0 allows two real roots; however, m = m∗ only contains one real root. these are future external and internal trap horizons surrounding the gravitational trapping region, as illustrated in fig. 1. 
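the horizon structure just described can be checked numerically from the cubic of eq. (31). the short sketch below finds its positive real roots with numpy for masses below, at and above the critical value m* = 27h/8, reproducing the no-horizon, degenerate-horizon (r = 2h) and two-horizon cases of fig. 1; the helper name and the tolerance on the imaginary part are choices made for the example.

```python
import numpy as np

def horizons(m, h):
    # positive real roots of r^3 + (3h - 2m) r^2 + 3 h^2 r + h^3 = 0, eq. (31),
    # i.e. the radii where f_ch(r) = 1 - 2 m r^2 / (r + h)^3 vanishes
    roots = np.roots([1.0, 3.0 * h - 2.0 * m, 3.0 * h ** 2, h ** 3])
    real = roots[np.abs(roots.imag) < 1e-6].real
    return np.sort(real[real > 0.0])

h = 1.0
m_star = 27.0 * h / 8.0            # critical mass m* = 27h/8
print(horizons(0.9 * m_star, h))   # m < m*: no positive root, no horizon
print(horizons(m_star, h))         # m = m*: degenerate horizon at r = 2h
print(horizons(10.0 * m_star, h))  # m >> h: r+ ~ 2m and a small inner horizon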
to derive exact solutions for the two horizons (r+ and r−), we defined the following: cos θ = ( 1 − 9h 2m + 27h 2 8m2 ) ( 1 − 9h m + 27h 2 m2 − 27h3 m3 )1 2 . (32) the analytical solution of r > 0 in eq. (31) can be expressed as follows: r+ = −h + 2 3 m ( 1 + 2 √ 1 − 3h m cos( θ 3 ) ) , (33) r− = −h + 2 3 m ( 1 − 2 √ 1 − 3h m cos( θ + π 3 ) ) . (34) at one limit, m = m∗, θ = π, and cos θ = −1. under this condition, the two horizons merged into one at the following: r+ = r− = r∗ = 16 27 m∗ = 2h. (35) another interesting limit was found at m � h. under this condition, θ = 0 and cos θ = 1. r+ = −h + 2 3 m ( 1 + 2 √ 1 − 3h m ) =2m − 3h− 3h2 2m − 9h3 4m2 + o(h4) ∼= 2m, (36) r− = −h + 2 3 m ( 1 − 2 √ 1 − 3h m ( 1 2 )) = 3h2 4m + 9h3 8m2 + 135h4 64m3 + o(h5) ∼= 3h2 4m , (37) where the r+ horizon is approximated to 2m, the horizon of schwarzschild’s metric with a mass of m. the r− horizon is approximated to 3h2 4m , which has a positive value close to zero. 140006-5 papers in physics, vol. 14, art. 140006 (2022) / y. c. chou et al. figure 2: plot of the hawking radiation temperature as a function of r+. we let h = 1 (blue), 0.1 (red), and 0.01 (purple). th has a maximum value of 5−2 √ 6 4πh when r+ = (2 + √ 6)h. it then quickly becomes 0 when r+ = 2h, turning negative when r+ < 2h. vi hawking radiation hawking radiation is a quantum effect of black holes, in which the quantum tunneling effect causes particles in black holes to pass through the event horizon. the tunneling probability of this process can be calculated. we do not discuss the derivation process in detail here in this paper. the results indicate that the hawking radiation is proportional to the gravity κ on the horizon surface. the hawking radiation temperature (th) for metric (22) can be expressed as follows: th ≡ κ 2π = f′ch(r+) 2 . (38) the results of eq. (21) were inserted into eq. (38), and the following equation for th was derived: th = 2m(r2+ − 2hr+) 4π(r+ + h)4 = (r+ − 2h) 4πr+(r+ + h) , (39) where th is a function of r+. we let h = 1, 0.1, and 0.01 to plot a function graph of th versus r+, as shown in fig. 2. it shows that when r+ is close to 0, unlike the th of the schwarzschild metric, the th has a maximum value of 5−2 √ 6 4πh when r+ = (2 + √ 6)h, and then quickly becomes 0 when r+ = 2h, and turns negative when r+ < 2h. in addition, we can elucidate hawking radiation temperature by observing two limits. at one of the limits, m = m∗ and r+ = 2h, where the th approximates zero. therefore, the proposed model predicts that radiation ceases but does not completely evaporate when the mass of the black hole reaches the critical value m∗. naturally, the other limit was at m � h. at this instance, r+ ∼= 2m, whereby the th approximated to schwarzschild’s metric, th ∼= 18πm . vii asymptotic behavior and quantum correction we find from the asymptotic behavior of this singularity-free metric that there are several noteworthy characteristics. it approaches a static, spherically symmetric charged black hole satisfying einstein–maxwell equations and meets the quantum correction under the effective field theory. first, the taylor expansion of the chou-huang function approximating the center can be expressed as follows: fch(r) =1 − 2mr2 h3 + 6mr3 h4 + o(r4) ∼=1 − 2gmr2 c2h3 , (40) where all the physical constants were regressed. subsequently, de sitter’s spacetime can be expressed as follows: fds(r) = 1 − λ 3 r2. (41) this equation is like that of hayward’s spacetime. 
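before moving on, the two asymptotic expansions used in this section — the de sitter-like behaviour of eq. (40) near the centre and the large-r expansion invoked below for the quantum correction (eq. (43)) — can be reproduced symbolically. the following is a minimal sympy check of the chou–huang function; the variable names and the expansion orders are choices made for the example.

```python
import sympy as sp

r, u, m, h = sp.symbols('r u m h', positive=True)
f_ch = 1 - 2 * m * r ** 2 / (r + h) ** 3   # chou-huang metric function, eq. (21)

# near the centre: de sitter-like core, eq. (40)
print(sp.series(f_ch, r, 0, 4))
# -> 1 - 2*m*r**2/h**3 + 6*m*r**3/h**4 + O(r**4)

# large-r behaviour, eq. (43): expand in u = 1/r
print(sp.series(f_ch.subs(r, 1 / u), u, 0, 4))
# -> 1 - 2*m*u + 6*h*m*u**2 - 12*h**2*m*u**3 + O(u**4),
#    i.e. -g_tt = 1 - 2m/r + 6mh/r**2 - 12mh**2/r**3 + ...
```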
the de sitter’s core protected spacetime from the presence of singularity. we compared eqs. (40) and (41) and found several interesting interactions between the physical constants. λ ∼= 6gm c2h3 . (42) therefore, the singularity-free physical characteristics of h are associated with the cosmological constant λ. moreover, when r → ∞, this metric asymptotically approximates to the following taylor expansion: 140006-6 papers in physics, vol. 14, art. 140006 (2022) / y. c. chou et al. −gtt = 1− 2m r + 6mh r2 − 12mh2 r3 +o ( 1 r4 ) , (43) where the r−1 term can be used to determine the association between m and the configured mass, the r−2 term can be used to determine the association between h and certain “coulomb” charges, such as those in the reissner–nordström solution. we insert α = h 4 2q2m , m = βh 3 α , and β = 1/12 into eq. (43) and obtain the following: −gtt = 1− 2m r + q2m r2 − 12mh2 r3 + o ( 1 r4 ) , (44) where qm is the total magnetic charge. metric (22) was asymptotically approximated to the reissner–nordström solution, a static spherically symmetrical charged black hole. furthermore, we found that the r−3 term can serve as a “quantum correction” factor in metric (22). literature suggests that metrics must meet the “one-loop quantum correction” of newtonian potential, derived from effective field theory, to effectively simulate quantum effect [25–28]. specifically, φ(r) = − gm r ( 1 + γ l2 r2 ) + gq2m 2r2 + o ( 1 r4 ) , (45) where g is the newtonian constant of gravity, γ = 41 10π [25], γ = 121 10π [27], and l is the planck length. the newtonian limit for the standard schwarzschild’s metric can be expressed as follows: φ(r) = − 1 2 (1 + gtt) . (46) equation (44) can be rewritten to restore the newtonian constant of gravity. thereafter, the following was obtained: φ(r) = − gm r ( 1 + 6h2 r2 ) + gq2m 2r2 + o ( 1 r4 ) . (47) a comparison of the coefficients revealed the relationship between h and l: h = √ γ 6 l, (48) where h ∼ 10−35m is in the same order of magnitude as the planck length. viii conclusion this study proposes a novel spherically symmetric regular black hole solution. it was extended from our singularity-free newtonian gravity, in which ricci scalar and ricci curvature invariant does not diverge as r → 0. we prove that the physical meaning of h can be interpreted as magnetic monopole charges described in ned. the energy–momentum tensors of this source satisfy weak energy conditions. under weak field limits, the lagrangian density regresses to normal maxwell’s equations. the asymptotic behavior of the metric shows that it has the de sitter’s core in the center. moreover, when r tends to infinity, it regresses to the reissner–nordström solution in the r−2 term and meets the quantum correction in the r−3 term. the abovementioned results can be derived directly from our model without additional corrections, outperforming those produced by the bardeen and hayward models. it requires further investigations. acknowledgements special thanks to ruby lin of health 101 clinic; dr. simon lin and professor hoi-lai yu of the academia sinica for their guidance in thesis writing. [1] f lamy, theoretical and phenomenological aspects of non-singular black holes, doctoral dissertation, université sorbonne paris citéuniversité paris diderot (paris 7), (2018). [2] w huang, a new gravitation law, int. j. adv. sc. eng. technol. 8, 24 (2020). [3] r m wald, gravitational collapse and cosmic censorship, in: black holes, gravitational radiation and the universe, eds. b r iyer, b bhawal, pag. 
[4] s jhingan, g magli, gravitational collapse of fluid bodies and cosmic censorship: analytic insights, in: recent developments in general relativity, eds. b casciaro, d fortunato, m francaviglia, a masiello, pag. 307, springer, milano (2000).

[5] r penrose, gravitational collapse and spacetime singularities, phys. rev. lett. 14, 57 (1965).

[6] s w hawking, g f r ellis, the large scale structure of space-time, cambridge university press, cambridge (1973).

[7] j m m senovilla, singularity theorems and their consequences, gen. relativ. gravit. 30, 701 (1998).

[8] j m bardeen, non-singular general-relativistic gravitational collapse, in: proc. int. conf. gr5, tbilisi, 174 (1968).

[9] i dymnikova, the cosmological term as a source of mass, class. quantum gravity 19, 725 (2002).

[10] p o mazur, e mottola, gravitational vacuum condensate stars, proc. natl. acad. sci. u.s.a. 101, 9545 (2004).

[11] p nicolini, noncommutative nonsingular black holes, arxiv:hep-th/0510203 (2005).

[12] s a hayward, formation and evaporation of nonsingular black holes, phys. rev. lett. 96, 031103 (2006).

[13] s hossenfelder, l modesto, i prémont-schwarz, model for nonsingular black hole collapse and evaporation, phys. rev. d 81, 044036 (2010).

[14] e ayón-beato, a garcía, regular black hole in general relativity coupled to non-linear electrodynamics, phys. rev. lett. 80, 5056 (1998).

[15] e ayón-beato, a garcía, nonsingular charged black hole solution for nonlinear source, gen. rel. grav. 31, 629 (1999).

[16] e ayón-beato, a garcía, new regular black hole solution from nonlinear electrodynamics, phys. lett. b 464, 25 (1999).

[17] m s ma, magnetically charged regular black hole in a model of nonlinear electrodynamics, ann. phys. 362, 529 (2015).

[18] s h hendi, asymptotic reissner–nordström black holes, ann. phys. 333, 282 (2013).

[19] l balart, e c vagenas, regular black holes with a nonlinear electrodynamics source, phys. rev. d 90, 124045 (2014).

[20] s i kruglov, nonlinear electrodynamics and black holes, int. j. geom. methods mod. phys. 12, 1550073 (2015).

[21] s i kruglov, nonlinear arcsin-electrodynamics and asymptotic reissner-nordström black holes, ann. phys. (berlin) 528, 588 (2016).

[22] s i kruglov, asymptotic reissner-nordström solution within nonlinear electrodynamics, phys. rev. d 94, 044026 (2016).

[23] e ayón-beato, a garcía, the bardeen model as a nonlinear magnetic monopole, phys. lett. b 493, 149 (2000).

[24] s i kruglov, black hole as a magnetic monopole within exponential nonlinear electrodynamics, ann. phys. 378, 59 (2017).

[25] r v maluf, j c s neves, bardeen regular black hole as a quantum-corrected schwarzschild black hole, int. j. mod. phys. d 28, 1950048 (2019).
[26] n e j bjerrum-bohr, j f donoghue, b r holstein, quantum gravitational corrections to the nonrelativistic scattering potential of two masses, phys. rev. d 67, 084033 (2003).

[27] j f donoghue, general relativity as an effective field theory: the leading quantum corrections, phys. rev. d 50, 3874 (1994).

[28] g g kirilin, i b khriplovich, quantum power correction of newton's law, j. exp. theor. phys. 95, 981 (2002).

[29] t de lorenzo, investigating static and dynamic non-singular black holes, master's thesis, university of pisa (2014).

papers in physics, vol. 13, art. 130001 (2021) received: 23 september 2020, accepted: 11 january 2021 edited by: d. h. zanette licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.130001 www.papersinphysics.org issn 1852-4249

a method for continuous-range sequence analysis with jensen-shannon divergence

m. a. ré1,2*, g. g. aguirre varela2,3

mutual information (mi) is a useful information theory tool for the recognition of mutual dependence between data sets. several methods have been developed for the estimation of mi when both data sets are of the discrete type or when both are of the continuous type.
however, mi estimation between a discrete range data set and a continuous range data set has not received so much attention. we therefore present here a method for the estimation of mi for this case, based on the kernel density approximation. this calculation may be of interest in diverse contexts. since mi is closely related to the jensen shannon divergence, the method developed here is of particular interest in the problems of sequence segmentation and set comparisons. i introduction mutual information (mi) is a quantity whose theoretical base originates in information theory[1]. since mi between two independent random variables (rv) is zero, a non-null value of mi between these variables gives a measure of mutual dependence. when analyzing two data sets x and y (assumed to be the realization of two mutually dependent rvs) mi can give us a measure of the *re@famaf.unc.edu.ar �guiava@gmail.com 1 centro de investigación en informática para la ingenieŕıa, universidad tecnológica nacional, facultad regional córdoba, maestro lópez esq. cruz roja argentina, (5016) córdoba, argentina. 2 gfa facultad de matemática, astronomı́a, f́ısica y computación, universidad nacional de córdoba, av. medina allende s/n, ciudad universitaria, (5000) córdoba, argentina. 3 instituto de f́ısica enrique gaviola (ifeg), facultad de matemática, astronomı́a, f́ısica y computación, universidad nacional de córdoba, ciudad universitaria, (5000) córdoba, argentina. mutual dependence of these sets. although mi may be straightforwardly calculated when the underlying probability distributions are known, this is not usually the case when only the data sets are available. therefore, mi must be estimated from the data sets themselves. when x and y are the discrete type, mi may be estimated by substituting the joint probability of these variables by the relative frequency of appearance of each pair (x,y) in the data sequence [2, 3]. for real value data sets (or the discrete type with a wide range) estimation of mi by frequency of appearance is not applicable. the binning method [4] in turn requires large bins or large sequences in order to produce reasonable results. alternative proposals have been made for cases when both data sets are the continuous type [5]. estimation of mi between a discrete rv and a continuous one has not been so extensively considered, in spite of being a problem of interest in diverse situations. for instance, we could compare the day of the week (weekday-weekend, discrete) with traffic flow (continuous), quantifying this effect. in a different context we might wish to quan130001-1 papers in physics, vol. 13, art. 130001 (2021) / m. a. ré et al. tify the effect of a drug (administered or not, discrete) in medical treatment evaluation (electroencephalograms in epilepsy, continuous data). ross[6] has proposed a scheme for estimating mi based on the nearest neighbour method [4]. assuming a sequence of (x,y) pairs, with x being discrete and y continuous, the nearest neighbour method requires the pairs to be ordered by the y values. this requirement makes the proposal impractical, in sequence analysis for instance. the nearest neighbour method was also considered by kraskov et al. [7]. in their paper they suggest two ways of evaluating mi with this method. an alternative definition for mi is presented by gao et al. [8], also based on the distance between the elements of the sequence. 
in this paper we propose a more direct method for estimating mi between a discrete and a continuous data set, based on the kernel density approximation (kda) [4] for estimating the probability density function (pdf) of the continuous variable. for the discrete variable we make use of the usual frequency approximation [2, 3]. finally, mi is computed by monte carlo integration.

as shown by grosse et al. [2], mi can be identified with the jensen-shannon divergence (jsd), a measure of dissimilarity between two probability distributions. jsd is a non-negative functional that equals zero when the distributions being compared are the same. this property makes jsd a useful tool for sequence segmentation [2, 3]. furthermore, in diverse contexts it is of interest to evaluate whether a given sequence matches a particular probability distribution. the most usual case is that of a normal distribution; nevertheless, this is a more general problem. for instance, in satellite synthetic aperture radar (sar) images the backscatter presents a multiplicative noise assumed to have an exponential distribution [9]. also, models for cloud droplet spectra assume a weibull distribution [10, 11].

several indirect methods have been proposed for the analysis of continuous range sequences. pereyra et al. [12] outlined a method based on the wavelet transform to analyze electroencephalograms. recently, mateos et al. [13] proposed mapping continuous value sequences into discrete state sequences prior to jsd calculation. several other mapping methods have been proposed in the literature to associate a discrete probability distribution with a real value series. here, by means of the kda, we avoid resorting to any indirect method, approximating the probability densities of the continuous range variables by this non-parametric method.

in section ii we present the calculation of mi and the arrangement for sequence segmentation with jsd. in section iii we test the performance of this method through numerical experiments; we also consider application of the method to edge detection in a satellite synthetic aperture radar (sar) image. in section iv we discuss the results obtained.

ii method

in this section we present our proposal for estimating mi between discrete and continuous rvs, based on the kda estimator of a pdf. let us consider a sequence of pairs (x, y) with x a variable of discrete range and y of continuous range. to calculate mi we resort only to the sequence itself, making use of no extra information. we start from the sequence of data pairs (x, y), and assume that these data are sampled from a joint probability density µ(x, y), although unknown at first. from the joint pdf the marginal probabilities

\[ p(x) = \int_{-\infty}^{\infty} dy\, \mu(x,y), \qquad (1a) \]

\[ \phi(y) = \sum_{x} \mu(x,y) \qquad (1b) \]

are defined. the mi between the rvs x and y is expressed in terms of these pdfs as [1]

\[ i(x,y) = \sum_{x} \int_{-\infty}^{\infty} dy\, \mu(x,y)\, \ln\!\left[\frac{\mu(x,y)}{p(x)\,\phi(y)}\right]. \qquad (2) \]

note that if the variables x and y are statistically independent then µ(x, y) = p(x)φ(y), and in this case i(x, y) = 0. in this way a value i(x, y) ≠ 0 gives a measure of the mutual dependence of these variables. we may rewrite i(x, y) in terms of the conditional pdfs

\[ \mu(y \mid x) = \frac{\mu(x,y)}{p(x)} \qquad (3) \]

as

\[ i(x,y) = \sum_{x} p(x) \int_{-\infty}^{\infty} dy\, \mu(y \mid x)\, \ln\!\left[\frac{\mu(y \mid x)}{\phi(y)}\right]. \qquad (4) \]
figure 1: kernel density approximation (kda) for the probability density in (10), calculated from 1000 pairs generated by the monte carlo method. for plot a, ym = 1, while for plot b, ym = 5; in both cases σg = 1. solid lines correspond to the analytic function and dashed lines to the kda.

i kernel density approximation

to carry out the calculation in (4), knowledge of the conditional pdfs is necessary. as mentioned, these densities are assumed to be unknown and have to be estimated from the data themselves. here we make use of the kda [4], as summarized in the following. the conditional pdfs in eq. (3) are estimated by considering separately each data pair with a given value of x. we define the set

\[ c_\kappa = \{(x,y) \mid x = \kappa\} \qquad (5) \]

and for each set we approximate the conditional density using a kda with a gaussian kernel,

\[ \hat{\mu}(y \mid x=\kappa) = \frac{1}{n_\kappa h_\kappa}\frac{1}{\sqrt{2\pi}} \sum_{y_j \in c_\kappa} \exp\!\left[-\frac{(y-y_j)^2}{2h_\kappa^2}\right]. \qquad (6) \]

note that the sum is over the yj values in the set cκ, and nκ is the number of pairs in this set. the bandwidth hκ is chosen as the optimal value reported in [4] and followed by steuer et al. [5],

\[ h_\kappa \simeq 1.06\, s_\kappa\, n_\kappa^{-0.2}, \qquad (7) \]

where s²κ is the variance of the sample. sheather [14] considered alternative values to detect bimodality; however, as mentioned there, the visual difference is small. the marginal probability of x is approximated by the frequency of occurrence of each value,

\[ \hat{p}(x=\kappa) = \frac{n_\kappa}{n}, \qquad (8) \]

and the marginal probability density of y by

\[ \hat{\phi}(y) = \sum_{x}\hat{p}(x)\,\hat{\mu}(y \mid x). \qquad (9) \]

we illustrate the results obtained with the kda by an example. let us consider the joint probability distribution µ(x, y)

\[ \mu(x=1, y) = \frac{1}{3}\frac{1}{\sqrt{2\pi}}\exp\!\left[-\frac{y^2}{2}\right], \qquad (10a) \]

\[ \mu(x=2, y) = \frac{2}{3}\frac{1}{\sqrt{2\pi}\,\sigma_g}\exp\!\left[-\frac{(y-y_m)^2}{2\sigma_g^2}\right], \qquad (10b) \]

and the corresponding marginal pdf

\[ \phi(y) = \frac{1}{3}\frac{1}{\sqrt{2\pi}}\,e^{-y^2/2} + \frac{2}{3}\frac{1}{\sqrt{2\pi}\,\sigma_g}\,e^{-(y-y_m)^2/(2\sigma_g^2)}. \qquad (11) \]

we sampled 1000 pairs from this distribution for two different values of ym, and from these pairs we estimated the conditional pdfs using the kda. in figs. 1a and 1b we plot the probability functions in (10) and (11) for the two values of ym, together with the corresponding approximations.

figure 2: the segmentation problem. a sequence v1 v2 ... v_{n1} (n1 values) is followed by v_{n1+1} ... v_{n1+n2} (n2 values): the sequence s is made up of two stationary subsequences s1 and s2, with n1 and n2 elements respectively. the problem consists in determining the value of n1, i.e., the point at which the statistical properties change.

ii monte carlo integration

after approximating the pdfs we have to compute the integrals in (4) to estimate mi. we recognize in these integrals the expectation value

\[ \left\langle \ln\frac{\mu(y \mid x)}{\phi(y)} \right\rangle = \int_{-\infty}^{\infty} dy\, \mu(y \mid x)\, \ln\!\left[\frac{\mu(y \mid x)}{\phi(y)}\right], \qquad (12) \]

which can be estimated by monte carlo integration [15],

\[ \left\langle \ln\frac{\mu(y \mid \kappa)}{\phi(y)} \right\rangle \simeq \frac{1}{n_\kappa}\sum_{y_j \in c_\kappa} \ln\!\left[\frac{\hat{\mu}(y_j \mid x=\kappa)}{\hat{\phi}(y_j)}\right]. \qquad (13) \]

here the sum is again restricted to the yj values associated with a particular x value. note that in this sum we make use of the kernel approximation of the conditional pdfs in (6). substituting both approximations we finally get

\[ \hat{i}(x,y) \simeq \frac{1}{n}\sum_{\kappa}\sum_{y_j \in c_\kappa} \ln\!\left[\frac{\hat{\mu}(y_j \mid x=\kappa)}{\hat{\phi}(y_j)}\right]. \qquad (14) \]

iii sequence segmentation

the jsd is a measure of dissimilarity between probability distributions. originally proposed by burbea and rao [16] and lin [17] as a symmetrized version of the kullback-leibler divergence [1, 18], a generalized weighted jsd between two pdfs f1, f2 is defined as

\[ d[f_1,f_2] = h(\pi_1 f_1 + \pi_2 f_2) - \pi_1 h(f_1) - \pi_2 h(f_2), \qquad (15) \]

with πi arbitrary weights satisfying π1 + π2 = 1. here h is the gibbs-shannon entropy, defined for continuous range variables as

\[ h(f_i) = -\int_{-\infty}^{\infty} dy\, f_i(y)\, \ln[f_i(y)]. \qquad (16) \]

as shown by grosse et al.
[2] jsd may be interpreted as mi between a discrete and a continuous variable by identifying the weights πi with the discrete variable probability in (1a): πi = p (x = i) (17) and the probability densities fi (y) with the conditional densities in (3) fi (y) = µ (y | x = i) . (18) with these identifications, the functionals in (15) and (4) are the same. the jsd and several generalizations have been succesfully applied to the sequence segmentation problem, the partition of a non-stationary sequence into stationary subsequences, for discrete range sequences. we propose here the extension of this method to continuous range sequences without resorting to discrete mapping, wavelet decomposition or any other indirect method of estimation of the probability distribution. the procedure for sequence segmentation may be stated in the following way: let us consider a sequence s with n elements made of two stationary subsequences s1 and s2, with n1 and n2 values respectively (n1 + n2 = n), schematically illustrated v1v2 . . .︸ ︷︷ ︸ ν1 . . .︸ ︷︷ ︸ ν1 vn1−1vn1vn1+1 . . . vn1+n2 figure 3: the sliding window method. a sliding window is defined for sequence segmentation. the window is divided into two subwindows of equal size. the center of the window is considered as the window position. the window is displaced along the sequence and the jsd between the subwindows is calculated. the segmentation point is identified as the window position at which jsd has its maximun value. 130001-4 papers in physics, vol. 13, art. 130001 (2021) / m. a. ré et al. figure 4: mutual information estimation for the joint distribution in (10). for the distribution in (10) the dots represent the average mi value for 100 data sets of 1000 (x,y) pairs each, with the bars indicating the standard deviation of each set. the black line is the analytical value of mi: a) as a function of the mean value ym in (10b)(the inset shows the distribution of mi for a particular value of ym for a dependent and an independent set), and b) changing σg, the standard deviation in (10b). the inset shows the same plot but in log-log scale to highlight the mi value for independent sets. in fig. 2. the aim is to determine the value of n1; i.e., the position of the last element in s1. in the algorithm proposed here we define a sliding window of fixed width over the sequence. the window is divided into two segments, each including ν1 elements (see fig. 3). we define the window position as that of the last element in the left section of the window. this window is displaced over the sequence and the window position where jsd reaches its maximun value is taken as the segmentation point. iii assessment results in this section we present the results of our assessment of the proposed method by considering two applications: the detection of mutual dependence between two rv sequences and the segmentation of a sequence. in the first case we generate sequences of two jointly distributed variables: one of discrete range and one of continuous range, and then we compute mi between these variables. in the second case we consider sequences made of two subsequences generated from diferent distributions. we detect the segmentation point following the procedure described in the previous section. we also apply the method to detect the edges between homogeneous regions in sar images. i mutual information between a discrete and a continuous variable we computed the mi between discrete and continuous variables. 
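before turning to the numerical experiments, the following short numpy sketch (not part of the paper; function names and the toy parameters are ours) illustrates how eqs. (6)-(9) and (13)-(14) combine into a working estimator.

```python
# a minimal sketch (not the authors' code) of the mi estimator of eqs. (6)-(9)
# and (14): gaussian kda per discrete class, frequency weights, and a monte
# carlo average of the log-density ratio. names are illustrative only.
import numpy as np

def kde_gauss(y_eval, y_data, bw):
    """eq. (6): gaussian kernel density estimate at the points y_eval."""
    z = (y_eval[:, None] - y_data[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(y_data) * bw * np.sqrt(2*np.pi))

def mi_discrete_continuous(x, y):
    """eq. (14): estimated mi between a discrete x and a continuous y."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(y)
    cond = {}
    for k in np.unique(x):
        yk = y[x == k]
        bw = 1.06 * yk.std(ddof=1) * len(yk)**(-0.2)   # bandwidth, eq. (7)
        cond[k] = (yk, bw)
    # marginal density of y (eq. 9) evaluated at every sample point
    phi = np.zeros(n)
    for k, (yk, bw) in cond.items():
        phi += (len(yk) / n) * kde_gauss(y, yk, bw)    # weights from eq. (8)
    # monte carlo average of ln[ mu(y|x)/phi(y) ] over the sample, eqs. (13)-(14)
    total = 0.0
    for k, (yk, bw) in cond.items():
        mu_k = kde_gauss(yk, yk, bw)       # conditional density at its own points
        total += np.sum(np.log(mu_k / phi[x == k]))
    return total / n

# toy check with the mixture of eq. (10): x = 1 or 2, y gaussian given x
rng = np.random.default_rng(0)
x = rng.choice([1, 2], size=1000, p=[1/3, 2/3])
y = np.where(x == 1, rng.normal(0.0, 1.0, 1000), rng.normal(5.0, 1.0, 1000))
print(mi_discrete_continuous(x, y))   # close to the analytic value for ym = 5
```

with x taken as a two-valued window label, the same routine returns the weighted jsd of eq. (15), which is what the sliding-window segmentation described above relies on.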
we generated 100 data sets, sampling 1000 (x,y) pairs from the distribution in (10) with different values of ym or σg, and from the joint distribution µ (x= 1,y) = 1 3 [θ (y−0.5) − θ (y+0.5)] (19a) µ (x= 2,y) = 2 3 1 a [ θ ( y + ym − a 2 ) − θ ( y + ym + a 2 )] (19b) with θ (y) the step function θ (y) = { 0 for y < 0 1 for y > 0 (20) with different values of ym or a. we estimated the mi, i (x,y ), from each set by the method described in the previous section. given that we are sampling the data pairs from known distributions, we are also able to calculate mi from the analytical expressions. in this way we may compare the results obtained from the approximation with the corresponding analytical results. 130001-5 papers in physics, vol. 13, art. 130001 (2021) / m. a. ré et al. figure 5: mutual information estimation for the joint distribution in (19). for the distribution in (19) the dots represent the average mi value for 100 data sets of 1000 (x,y) pairs each, with the bars indicating the standard deviation. the black line is the analytical value of mi while the dots represent the kernel density approximation (kda) values; a) as a function of mean value ym in (19b), and b) changing the width parameter a in (19b). in addition, we calculated the mi for samples of statistically independent variables to establish a significance value for the mi of the dependent variables. the analytical value in this case is zero, as already mentioned. the results of the calculation are shown in fig. 4 for the distribution in (10) and in fig. 5 for the distribution in (19), respectively. we include the average value of mi over the 100 data sets for the different values of the parameters, and the bars correspond to the standard deviation in each set. a small underestimation of the mi value can be seen in this last case. this may be attributed to a shortcoming of the kda at the borders of the interval of the uniform distribution. nevertheless, it is still possible to detect mutual dependence between the discrete and the continuous value sequences. to consider the effect of sample size, we repeated the experiment with the distribution in (10) for different values of n, the number of data pairs in each set. we again generated 100 data sets of n data pairs each. the results are shown in fig 6 for three sets of parameters. a slightly increasing overestimation of mi can be appreciated as n decreases. finally, we considered an usual situation when there is only one sample of data pairs available. we sampled 1000 pairs from the distribution in (10), the distribution in (19) and from the distribution µ (x = 1,y) = 1 3 exp (−y) θ (y) µ (x = 2,y) = 2 3 1 2 exp (−y/2) θ (y) . (21) for each sample we estimated mi by the approximate method in (14). to set up a significance value figure 6: mutual information estimation for the distribution in (10). for the distribution in (10) the dots represent the average value for 100 data sets of different numbers of (x,y) pairs. bars indicate the standard deviation, and dashed lines represent the analytical values of mi for the different sets of parameters. 130001-6 papers in physics, vol. 13, art. 130001 (2021) / m. a. ré et al. figure 7: segmentation point in artificial sequences. the jsd average computed for 500 sequences generated from rayleigh distributions. each sequence has a length of 500 elements divided into two subsequences with 250 elements each. 
the ratio of the mean values of the subsequences is given by rm = ml/mr = 5, where ml and mr are the mean values in the left and right subsequences, respectively. the sequences are analyzed with different window widths (ww). in all cases the window position (wp) of the maximum jsd average is at the segmentation point. for each sample we generated 100 data sets of 1000 pairs of independent variables. the discrete values were sampled from the distribution p (x) = nx 1000 (22) where nx gives the number of times that the value x appears in the original sequence, and the continuous values were sampled from the gaussian distribution µ (y) = 1 √ 2πs exp [ − (y −m)2 2s2 ] (23) independently of the value of x. here m is the mean value in the original sequence and s2 the sample variance. we calculated the mi for each data set and then the mi mean value and its variance. the results are included in table 1. a clear difference can be seen between the mi of the dependent values and those of the independent sequences. ii sequence segmentation to test the sequence segmentation method, we generated sets of 500 sequences of 500 values each, difigure 8: segmentation point in artificial sequences. the jsd average computed for 200 sequences generated from rayleigh distributions. each sequence has a length of 500 elements divided into two subsequences with 250 elements each. different values of the mean quotient rm = mr/ml are considered, where mr is the mean value of the right subsequence and ml the mean value of the left subsequence. in all cases a window width of 50 elements was used. even for the lowest quotient value the window position (wp) of the maximum jsd average is coincident with the segmentation point. vided into two subsequences with 250 values in each one. the sequences were generated from rayleigh distributions with a different mean value for each subsequence. the mean value of the first subsequence is denoted by ml , and the mean value of the second segment by mr ; we define the ratio of the mean values as rm = ml/mr. using the sliding window method, we analyzed a set with rm = 5 with several window widths. in fig. 7 we present the average value across the 500 sequences of jsd at each window position for the different widths considered. the average jsd has table 1: mutual information and significance value. mi of the sampled dependent sequences (see text) and the corresponding significance values computed from the independent sets. pdf mi significance value mean st. dev. gaussian 0.6359 4.5 × 10−3 1.8 × 10−3 uniform 0.1429 4.5 × 10−3 1.8 × 10−3 exponential 0.0718 4.5 × 10−3 1.9 × 10−3 130001-7 papers in physics, vol. 13, art. 130001 (2021) / m. a. ré et al. a maximum value at position 250, the segmentation point, even for a narrow window with 20 elements (10 elements in each subwindow), although in this case statistical fluctuations are more noticeable. to test the sensitivity of the method we generated sets with rm = 1.2, 1.5, 2, 5, 10. the results of the algorithm, with a window of 50 elements, are included in fig. 8. even for the smallest ratio considered, the segmentation point can be detected. finally, we present an example of application of the segmentation algorithm to detect the edge between homogeneous regions in a sar image. in sar images the backscatter is affected by speckle noise (a multiplicative noise). this noise in the backscatter amplitude is modelled by a rayleigh distribution in homogeneous regions. in fig. 
9 we include a section of the sar image of an antarctic region, and the boundary detected between water and ice. on the right a plot of the values of the backscatter amplitude of the highlighted lines in the image and the jsd is included. there is good coincidence of the detected boundary with the contour in the image. iv discussion and conclusions in this paper we have presented a method for computing mutual information (mi) between discrete and continuous data sets, or alternatively, the jsd between continuous range data sets. the algorithm developed gives a measure of dissimilarity without resorting to an indirect method like those proposed in [12, 13]. neither is it necessary to have the continuous values ordered as in the nearest neighbour method [4, 6]. in fact, the calculation in (14) is based only on the registered data as they were recorded. the measure may be applied to two similar problems. on the one hand we can quantify the mutual dependence between discrete and continuous data sets, and on the other hand we can quantify the dissimilarity between two continuous data sets, as discussed in section ii. in section iii we applied the method to artificially-generated pairs of variables, finding good agreement with the corresponding analytical values as shown in figs. 4 and 5, although systematic underestimation occurs mainly when the difference is given by the width in uniform distributions (fig 5-b). we attribute this discrepancy to the abrupt decay of the uniform distribution at the borders of the interval, while the kda with a gaussian kernel extends to infinity. the mi values in these cases of mutually dependent variables are clearly distinguishable from the mi values of independent variables. we also considered the dependence of the results of this method on the length of the sequence. iin fig 6 a slightly increasing overestimation of mi is seen with decreasing length. nevertheless, there is good agreement for sequences of more than 400 pairs. in real situations we frequently have only one sequence of (x,y ) pairs. we have proposed a method for establishing a significance value by generating 100 sequences of independent variables with probability distributions given by the estimated marginal distribution for the discrete variable, and by a gaussian distribution for the continuous variable with the same mean value and variance as the marginal distribution of the original sequence. we have considered sequences generated from three distributions. in all three cases mi establishes a clear difference between dependent and independent sets, as shown in table 1. it has been shown that the jensen shannon divergence (jsd) is equivalent to mi [2]. therefore, the calculation method developed here will also be suitable for computing jsd between two continuous range data sets, and in this format the jsd may be applied to the sequence segmentation problem as proposed in section ii-iii. in this section we suggested a method based on a fixed-length sliding window. we considered the segmentation of artificially generated sequences in section iii-ii. the jsd average at each position in the sequences exhibits a maximum at the segmentation point, as shown in fig. 7. as we continue this work we will address the problem of comparing and analyzing electrophysiological signals. the segmentation method may also be of interest in detecting borders in images. work along these lines will be published elsewhere. 
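as a companion to the segmentation procedure of sections ii-iii and iii-ii, the following minimal numpy sketch (not part of the original analysis; names, the window width and the rayleigh parameters are illustrative only) estimates the jsd of eq. (15) between the two halves of a sliding window, treating the half-window label as the discrete variable of eq. (14), and locates its maximum.

```python
# a minimal sketch (not the authors' code) of the sliding-window segmentation:
# the jsd between the two half-windows is the mi of eq. (14) with x the
# half-window label. parameters below are illustrative only.
import numpy as np

def _kde(y_eval, y_data):
    bw = 1.06 * y_data.std(ddof=1) * len(y_data)**(-0.2)     # eq. (7)
    z = (y_eval[:, None] - y_data[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(y_data) * bw * np.sqrt(2*np.pi))

def window_jsd(left, right):
    """weighted jsd (eq. 15) of the two half-windows, estimated as in eq. (14)."""
    y = np.concatenate([left, right])
    n = len(y)
    phi = (len(left)/n) * _kde(y, left) + (len(right)/n) * _kde(y, right)  # eq. (9)
    val = np.sum(np.log(_kde(left, left) / phi[:len(left)]))
    val += np.sum(np.log(_kde(right, right) / phi[len(left):]))
    return val / n

def segment(seq, width=50):
    """return the position of maximal jsd (first index of the right half)."""
    half = width // 2
    pos = np.arange(half, len(seq) - half)
    jsd = [window_jsd(seq[p - half:p], seq[p:p + half]) for p in pos]
    return pos[int(np.argmax(jsd))], np.array(jsd)

# toy sequence: two rayleigh segments with mean ratio ~5, as in fig. 7
rng = np.random.default_rng(1)
seq = np.concatenate([rng.rayleigh(5.0, 250), rng.rayleigh(1.0, 250)])
cut, _ = segment(seq, width=50)
print(cut)   # expected to lie near the true segmentation point, 250
```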
acknowledgements

we wish to acknowledge partial support from scyt utn through grant uti4811 and from secyt unc through grant 30720150100199cb.

figure 9: border detection in sar images. the segmentation method was applied to detection of the border between homogeneous regions in a sar image. the image was analyzed line by line and the segmentation point at each line detected. the segmentation points are coincident with the border.

[1] t cover, j thomas, elements of information theory, j. wiley, new york (2006).

[2] i grosse, p bernaola-galván, p carpena, r román-roldán, j oliver, h e stanley, analysis of symbolic sequences using the jensen-shannon divergence, phys. rev. e 65, 041905 (2002).

[3] m a ré, r k azad, generalization of entropy based divergence measures for symbolic sequence analysis, plos one 9, e93532 (2014).

[4] b w silverman, density estimation for statistics and data analysis, chapman and hall, london (1986).

[5] r steuer, j kurths, c o daub, j weise, j selbig, the mutual information: detecting and evaluating dependencies between variables, bioinformatics 18, s231 (2002).

[6] b c ross, mutual information between discrete and continuous data sets, plos one 9, e87357 (2014).

[7] a kraskov, h stögbauer, p grassberger, estimating mutual information, phys. rev. e 69, 066138 (2004).

[8] w gao, s kannan, s oh, p viswanath, estimating mutual information for discrete-continuous mixtures, 31st conference on neural information processing systems (nips), 5986 (2017).

[9] a moreira, p prats-iraola, m younis, g krieger, i hajnsek, k p papathanassiou, a tutorial on synthetic aperture radar, ieee geosci. remote s. magazine 1, 6 (2013).

[10] y liu, j hallett, on size distributions of cloud droplets growing by condensation: a new conceptual model, j. atmos. sci. 55, 527 (1998).

[11] y liu, p h daum, j hallett, a generalized systems theory for the effect of varying fluctuations on cloud droplet size distributions, j. atmos. sci. 59, 2279 (2002).

[12] m e pereyra, p w lamberti, o a rosso, wavelet jensen-shannon divergence as a tool for studying the dynamics of frequency band components in eeg epileptic seizures, phys. a 379, 122 (2007).

[13] d m mateos, l e riveaud, p w lamberti, detecting dynamical changes in time series by using jensen shannon divergence, chaos 27, 083118 (2017).

[14] s j sheather, density estimation, stat. sci. 19, 588 (2004).

[15] a papoulis, probability, random variables and stochastic processes, mcgraw-hill, new york (1991).
[16] j burbea, c r rao, on the convexity of some divergence measures based on entropy functions, ieee t. inform. theory 28, 489 (1982).

[17] j lin, divergence measures based on the shannon entropy, ieee t. inform. theory 37, 145 (1991).

[18] s kullback, r a leibler, on information and sufficiency, ann. math. stat. 22, 79 (1951).

papers in physics, vol. 12, art. 120005 (2020) received: 25 april 2020, accepted: 30 july 2020 edited by: d. peres menezes reviewed by: e. m. yoshimura, instituto de física da univ. de são paulo, brazil licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.120005 www.papersinphysics.org issn 1852-4249

optimizing the shielding properties of strength-enhanced concrete containing marble

a. abdel-latif m.1,2*, m. i. sayyed3, h. o. tekin4,5, m. m. kassab1

* aam00@fayoum.edu.eg 1 department of mathematics and eng. physics, faculty of engineering, fayoum university, 63514 fayoum, egypt. 2 college of industry and energy technology, new cairo technological university, cairo, egypt. 3 department of physics, faculty of science, university of tabuk, saudi arabia. 4 department of radiotherapy, vocational school of health services, uskudar university, turkey. 5 medical radiation research center (usmera), uskudar university, turkey.

the purpose of this study is to develop a low-cost, locally produced concrete mixture with optimum marble content. the resulting mixture would have enhanced strength properties compared to the non-marble reference concrete, and improved radiation shielding properties. to accomplish these goals, five concrete mixtures were prepared, containing 0, 5, 10, 15 and 20 % marble waste powder as a cement replacement on the basis of weight. these samples were subjected to a compressive strength test. shielding parameters such as the mass attenuation coefficient (µm), mean free path (mfp), effective atomic number (zeff) and exposure build-up factor (ebf) were measured, and the results were compared with those obtained using the winxcom program and the mcnpx code in the photon energy range 0.015–3 mev. moreover, the macroscopic fast neutron removal cross-section (neutron attenuation coefficient) was calculated and the results presented. the results show that the sample containing 10 % marble has the highest compressive strength and potentially good gamma ray and neutron radiation shielding properties.

i. introduction

radiation shielding has recently become an important research topic in nuclear science, and is defined as the ability to reduce radiation effects through interaction with the shielding material. several parameters, such as attenuation effectiveness, strength and thermal properties, influence the selection of radiation shielding materials. concrete is one of the most widely used materials in reactor shielding due to its intrinsic properties, such as cheapness, and the ease of preparation of different compositions and forms.
moreover, its shielding properties depend strongly on the elemental composition of the prepared mixtures. an enormous amount of solid waste is generated annually in egypt as a by-product of mining, agricultural and industrial processes. due to various economic, social, and environmental restraints, the development of a suitable waste disposal method remains a top priority. the non-degradable waste by-products of mining and industry have long been targeted in research on concrete production. several researchers investigated the possible use of industrial by-products such as steel shots [1], steel particulates, used steel ball-bearings [2], electric arc furnace slag [3] and stone slurry [4], either as fine or coarse aggregates in concrete, and their effects on mechanical and radiation shielding properties 120005-1 papers in physics, vol. 12, art. 120005 (2020) / a. abdel-latif m. et al. were evaluated. among these mining and industrial by-products is marble dust powder, generated during the marble cutting process. marble processing plants cannot store large amounts of marble dust powder, so reusing it is of great environmental and economic benefit [5] and [6]. corinaldesi et al. (2010) found that replacing sand with marble powder at a rate of 10 % provides maximum compressive strength [7]. akkurt and altindag (2012) determined, both experimentally and theoretically, the linear attenuation coefficients of concrete containing marble powder in its fine aggregate form. the measured and calculated linear attenuation coefficients showed good agreement. finally, they concluded that marble can be used as an aggregate in the production of shielding concrete [8]. akkurt and el-khayatt (2013) also calculated the photon interaction parameters for concrete containing marble dust for the photon energy range of 1 kev-100 gev [10]. aliabdo et al. (2014) found that using marble powder as a partial replacement for cement or sand improves the physical properties of concrete [9]. ergün (2015) utilized marble powder together with diatomite as a partial replacement for cement. he found that either 5 % marble powder alone or 5 % marble powder along with 10 % diatomite can be used to enhance the mechanical properties of concrete [11]. furthermore, it was found that up to 10 % marble powder enhances the workability of the mixture, while maintaining its compressive strength [12]. in a recent review it was found that as the amount of marble powder fine aggregate increases within the mixture, concrete workability decreases, and the compressive strength of the concrete increases because of its caco3 and sio2 content [13]. moreover, the cement with optimal concrete strength was obtained using 10 % waste marble as a replacement for cement [14] and [15]. the purpose of this study is to develop a low cost, locally produced concrete mixture with optimum marble content, which is stronger than ordinary concrete and has enhanced gamma-ray and neutron shielding properties. ii. theoretical basis and calculations i. the mass attenuation coefficient, µm a mono-energetic gamma ray passing through matter is attenuated due to photoelectric absorption, scattering, and pair-production. attenuation behavior follows beer–lambert’s law [8, 9, 16] i = i0e −µid, (1) where the incident and the transmitted photon intensities are denoted by i0 and i respectively. moreover, d and µi are the thickness and the linear attenuation coefficient, respectively. 
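as a quick illustration of eq. (1) (a sketch with hypothetical numbers, not measured values from this work), the linear attenuation coefficient can be recovered from a regression of ln i against slab thickness d, in the spirit of the regression step in the flowchart of fig. 2:

```python
# illustrative only (numbers are hypothetical, not measured values from this work):
# eq. (1) gives i = i0*exp(-mu_l*d); mu_l is recovered as minus the slope of
# ln(i) versus the cumulative slab thickness d.
import numpy as np

mu_l, i0 = 0.2, 1.0e4                     # assumed coefficient (1/cm) and incident counts
d = np.array([1.0, 2.5, 4.0, 5.5, 7.0])   # cumulative slice thickness, cm
i = i0 * np.exp(-mu_l * d)                # noiseless transmitted intensities
slope, intercept = np.polyfit(d, np.log(i), 1)
print(-slope)                             # -> 0.2, the assumed mu_l
```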
also, the mass attenuation coefficient µm can be calculated by

\[ \mu_m = \sum_i \omega_i\, \frac{\mu_i}{\rho_i}, \qquad (2) \]

where ωi, µi and ρi are the weight fraction, linear attenuation coefficient and density of the ith constituent element. the mean free path (mfp) is the average distance traveled by a moving particle between successive impacts (collisions),

\[ {\rm mfp} = \frac{1}{\mu}, \qquad (3) \]

where µ is the linear attenuation coefficient.

ii. effective atomic number, zeff

the effective atomic number for low-z elements, due to the inelastic scattering of gamma rays with the material atoms, is given by eq. (4) below [16–19]. for high-z elements, such as molybdenum through uranium, the uncertainties at low energies (10 kev to 1 mev) range from 1 to 2 percent far from an absorption edge to 5 to 10 percent in the vicinity of an edge; in the range 1 to 100 mev, uncertainties from pair production estimates are 2 to 3 percent, while above 100 mev they are 1 to 2 percent [20].

\[ z_{\rm eff} = \frac{\displaystyle\sum_i f_i\, a_i\, \frac{\mu_i}{\rho_i}}{\displaystyle\sum_i f_i\, \frac{a_i}{z_i}\, \frac{\mu_i}{\rho_i}}, \qquad (4) \]

where µi is the linear attenuation coefficient, ρi the density, ai the atomic mass, zi the atomic number and fi the mole fraction of the ith constituent element. the mole fraction fi is given by

\[ f_i = \frac{\omega_i/a_i}{\sum_i \left(\omega/a\right)_i}, \qquad (5) \]

where ωi is the weight fraction.

figure 1: schematic diagram of the modeled nai(tl) detector with simulation geometry.

iii. mcnpx code (version 2.6.0)

the monte carlo method is often employed for problems with a probabilistic structure. in this study, mcnpx (monte carlo n-particle transport code system-extended) version 2.6.0 [21] was used to investigate the µm of the different concrete mixtures [22, 23]. the schematic diagram shows the mcnpx gamma ray attenuation setup with five main pieces of simulation equipment: a point isotropic radiation source, a pb collimator for the primary radiation beam, the attenuator concrete sample, pb blocks to prevent scattered radiation, and a nai(tl) detector (see fig. 1) [22, 23]. the relative error observed in the output file was less than 0.1 %.

iv. exposure build-up factor, ebf

during the penetration of gamma photons through any material, they may be either absorbed or scattered by the atoms of the material. secondary radiation may arise due to the build-up of scattered photons inside the material. accordingly, it is necessary to estimate these build-up factors to determine the effective exposure and energy deposition in the shielding material. the build-up of secondary radiation is characterized by the exposure build-up factor (ebf), defined as the ratio of the total gamma photon flux (absorbed and scattered) to the absorbed gamma photons of the incident beam [24, 25]. in this work, the geometrical progression (g-p) fitting method [26] was used due to its high level of accuracy. the ratio r = µcomp/µm, which represents the relative contribution of the compton scattering interaction (µcomp) to the mass attenuation coefficient, was obtained for each sample over the photon energy range 0.015–3 mev. the ratio r at a given energy value was then matched with the corresponding ratios r1 and r2 of known elements whose atomic numbers are z1 and z2, respectively, with r1 < r < r2. the equivalent atomic number (zeq) of each sample was obtained using the following interpolation formula [24–26]:

\[ z_{\rm eq} = \frac{z_1 \log(r_2/r) + z_2 \log(r/r_1)}{\log(r_2/r_1)}. \qquad (6) \]

the build-up factors for each sample were calculated using the geometrical progression fitting functions B(E,x) and K(E,x).
\[ B(E,x) = \begin{cases} 1 + (b-1)\dfrac{K^x - 1}{K - 1}, & K \neq 1 \\[4pt] 1 + (b-1)\,x, & K = 1, \end{cases} \qquad (7) \]

\[ K(E,x) = c\,x^a + d\,\frac{\tanh\!\left(\dfrac{x}{x_k} - 2\right) - \tanh(-2)}{1 - \tanh(-2)}, \qquad (8) \]

where eq. (8) is valid for x ≤ 40 mfp and x is the source-to-detector distance in terms of the mfp. the geometrical progression parameters (b, c, a, xk and d) for the selected samples were obtained in advance using the following interpolation formula:

\[ p = \frac{p_1 \log(z_2/z_{\rm eq}) + p_2 \log(z_{\rm eq}/z_1)}{\log(z_2/z_1)}, \qquad (9) \]

where p stands for the required g-p fitting parameter of the selected sample at a specific energy value, while p1 and p2 represent the values of the g-p fitting parameter corresponding to the atomic numbers z1 and z2, respectively. the g-p fitting parameters p1 and p2 can be obtained from the american national standard database, which contains the exposure build-up g-p fitting parameters for 23 different elements, one compound (water) and one mixture (concrete) at different energies [24–26].

v. the macroscopic effective removal cross-section for fast neutrons, σr

the attenuation of neutrons in matter obeys the following law:

\[ i = i_0 \exp(-\sigma_r\, x). \qquad (10) \]
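as a small numerical illustration of eq. (10) and of the mixture rule that follows (a sketch only, anticipating eq. (11) and using the cm1 values listed later in table 2; not an additional result of this work):

```python
# illustrative sketch only: total fast-neutron removal cross-section of sample cm1
# as the sum of the partial contributions listed in table 2, and the attenuation
# predicted by eq. (10). the numbers are taken from table 2 of this paper.
import numpy as np

partial_sigma_r = {      # cm^-1, per constituent element (cm1 column of table 2)
    "al": 0.0012, "fe": 0.0008, "ca": 0.0093, "si": 0.0246, "o": 0.0536,
    "minor": 0.0002,
}
sigma_r = sum(partial_sigma_r.values())
print(sigma_r)                    # ~0.0897 cm^-1, the cm1 total quoted in table 2
print(1.0 / sigma_r)              # corresponding neutron mean free path, ~11 cm
print(np.exp(-sigma_r * 30.0))    # transmitted fraction through 30 cm, eq. (10)
```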
the linear attenuatuation coefficient is calculated and hence the mass of sample cm1 replace the radioactive source with another source and repeat the previous steps for all isotopes repeat the previous steps for cm2, cm3, cm4 and cm5 group the measured mass attenuation versus the energy lines and compare it with mcnpx and xcom data figure 2: a flowchart describing the steps followed in the experiment to measure the mass attenuation coefficients. used for measuring the linear attenuation coefficient were then obtained by cutting the cubic samples into slices with thicknesses varying from 1 to 1.5 cm, and the flowchart was followed, as shown, (see fig. 2) to calculate the mass attenuation coefficients. the chemical composition of the constituent materials of each sample were analyzed using the xray fluorescence (xrf) technique; these compositions are listed in table 1. 120005-4 papers in physics, vol. 12, art. 120005 (2020) / a. abdel-latif m. et al. 0 5 10 15 20 104 106 108 110 112 114 116 118 120 region ii c om pr es si ve s tr en gt h (k g/ cm 2 ) marble concentration (%) compressive strength polynomial fit of compressive strength region i y = -0.0943x2 + 1.5857x + 110.29 r² = 0.9461 figure 3: compressive strength and marble concentration. iii. compressive strength test for each mixture, three concrete samples were put to a compressive strength test using an adr 2000 standard compression machine (2000 kn/450000 lbf capacity, rated power of 1350 w). the load was applied gradually at the rate of 140 kg/cm 2 per minute until the specimen failed, and the average reading was registered. iv. gamma ray shielding parameters experiment the radiation shielding experiments were carried out with the samples placed between sources 133 ba (0.356 mev), 137cs (0.662 and 0.911 mev), 60co (1.173 and 1.332 mev), and 232th (0.583 and 2.614 mev); a nai(ti) detector was connected to a multi-channel analyzer (mca) with pc to meatable 1: xrf analysis of the prepared mixtures. chemical composition (% by weight) element cm1 cm2 cm3 cm4 cm5 tio2 0.27 0.33 0.37 0.33 0.24 al2o3 2.88 3.27 3.38 2.86 2.53 fe2o3 1.89 2.06 2.38 1.92 1.55 mno 0.04 0.05 0.05 0.04 0.03 mgo 0.14 0.64 0.11 0.03 cao 19.26 18.69 20.28 16.95 15.09 na2 o 0.14 0.24 0.11 0.08 0.15 sio2 balance balance balance balance balance figure 4: the measured mass attenuation coefficient compared with that determined by mcnpx and winxcom. sure the linear gamma ray attenuation coefficient of each sample struck by gamma radiation. iv. results and discussion i. compressive strength the relationship between marble powder content and compressive strength is shown in fig. 3. it can be seen that the compressive strength of concrete containing marble increases as the marble content increases, until it reaches an absolute maximum value at a marble cement-replacement ratio of 10 % (region i), after which it starts to decrease as the marble content increases (region ii). thus, maximum compressive strength is typically obtained with the use of 10 % waste marble powder, in good agreement with the literature [13–15]. this may be attributed to the higher content of fe2o3 and cao in sample cm3 than in the other samples, as well as its higher density. 120005-5 papers in physics, vol. 12, art. 120005 (2020) / a. abdel-latif m. et al. figure 5: the effective atomic number, zeff , as a function of the incident photon energy. ii. 
gamma ray shielding parameters the mass attenuation coefficient, µm, was experimentally measured at different photon energy lines, and these results were then compared with those obtained theoretically using winxcom software and the monte-carlo simulation code mcnpx for a photon energy range of 0.015 3 mev. these results are displayed in fig. 4. it can be clearly seen that there is good agreement between the theoretically calculated µm and that measured experimentally. moreover, for very low photon energy (e < 15 kev), the mass attenuation coefficient µm has a very high value due to dominance of the photo-electric interaction. it then decreases as the incident photon energy increases, until it reaches a minimum value at a photon energy of 3 mev. using eq. (4) together with the calculated mass attenuation coefficient, the effective atomic number is obtained over the photon energy range of 0.015 3 mev and displayed in fig. 5. for very small decreases in energy down to a minimum value of 1.0 mev, then slightly increasing again as the energy increased to 3 mev, it was found that sample cm1 had the highest effective atomic number, followed by cm3. moreover, it is worth noting that the addition of marble led to a decrease in the effective atomic number. the mfp values for the different marble concentrations were calculated using mcnpx simulation code at different energy lines within the range 0.015 3 mev. the values obtained (see fig. 6) show where the mixture cm3 has the minimum numerical value for the mfp. however, the addition of marble did not figure 6: mfp and marble concentration at different energy lines. lead to a significant change in the mfp. iii. the energy exposure build-up factor, ebf variation in the ebf with photon energy at the penetration depths of 1, 5, 10 and 40 mfp is shown in fig. 7 (a-d). it is clear that the ebf value increases as the energy of the incident photon increases, until it reaches a maximum value, after which it decreases as the penetration depth increases. at this peak point, the compton scattering interaction is the dominant mechanism. this is followed by a decrease in the build-up factors with 0.01 0.1 1 1 2 r-cm1 r-cm2 r-cm3 r-cm4 r-cm5 r-cm1 r-cm2 r-cm3 r-cm4 r-cm5 r-cm1 r-cm2 r-cm3 r-cm4 r-cm5 r-cm1 r-cm2 r-cm3 r-cm4 r-cm5 e b f photon energy (mev) (a) 1 mfp 0.01 0.1 1 1 10 photon energy (mev) (b) 5 mfp 0.01 0.1 1 1 10 e b f photon energy (mev) (c) 10 mfp 0.01 0.1 1 100 101 102 photon energy (mev) (d) 40 mfp figure 7: the ebf as a function of photon energy at 1, 5, 10 and 40 mfp depth. 120005-6 papers in physics, vol. 12, art. 120005 (2020) / a. abdel-latif m. et al. 1 10 1.02 1.04 1.06 1.08 1.1 r-cm1 r-cm2 r-cm3 r-cm4 r-cm5 r-cm1 r-cm2 r-cm3 r-cm4 r-cm5 r-cm1 r-cm2 r-cm3 r-cm4 r-cm5 r-cm1 r-cm2 r-cm3 r-cm4 r-cm5 e b f penetration depth (mfp) (a) 0.015mev 1 10 1 10 100 penetration depth (mfp) (b) 0.15mev 1 10 1 10 100 e b f penetration depth (mfp) (c) 1.5mev 1 10 100 101 15 20 25 30 35 40 4 5 6 7 8 9 10 11 12 eb f e (mev) r-cm1 r-cm2 r-cm3 r-cm4 r-cm5 penetration depth (mfp) (d) 3mev figure 8: the ebf and penetration depth at photon energies 0.015, 0.15, 1.5 and 3 mev. any further increase in the energy of the incident photon, due to an increase in the contribution of the pair-production interaction at the expense of the compton scattering [33, 35]. variation in the ebf with penetration depth for the different concrete mixtures at an incident photon energy of 0.015, 0.15, 1.5 and 3 mev is shown in fig. 8 (a-d). 
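the g-p construction of eqs. (7)-(8) is straightforward to evaluate once the fitting parameters are known. the short python sketch below (not the authors' code) implements it; the parameter values shown are placeholders for shape only, since the actual b, c, a, xk and d must be interpolated from the ansi database via eq. (9).

```python
# a minimal sketch (not the authors' code) of the g-p exposure build-up factor of
# eqs. (7)-(8); the fitting parameters b, c, a, xk, d must be interpolated from
# the ansi database via eq. (9) -- the numbers below are placeholders only.
import numpy as np

def gp_buildup(x, b, c, a, xk, d):
    """exposure build-up factor B(E, x) at penetration depth x (in mfp)."""
    k = c * x**a + d * (np.tanh(x / xk - 2.0) - np.tanh(-2.0)) / (1.0 - np.tanh(-2.0))
    if np.isclose(k, 1.0):
        return 1.0 + (b - 1.0) * x
    return 1.0 + (b - 1.0) * (k**x - 1.0) / (k - 1.0)

# placeholder parameter set, for illustration of the functional form only
for depth in (1, 5, 10, 40):
    print(depth, gp_buildup(depth, b=2.0, c=0.5, a=0.2, xk=15.0, d=0.1))
```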
it can be seen that the ebf increases with increased penetration depth for all concrete mixtures. at a photon energy of 0.015 mev, where photoelectric absorption is the dominant mechanism, the ebf shows that cm3 has better gamma ray shielding properties than the other samples. in contrast, at the higher energies of 0.15, 1.5 and 3 mev the ebf is independent of marble concentration. iv. the effective removal cross section for fast neutrons (neutron attenuation coefficient) the calculations of the fast neutron removal crosssection for the prepared samples are listed in table 2. variation in the neutron mean free path with marble powder concentration is illustrated in fig. 9. the results show that sample cm3 has, numerically, the minimum value for the neutron mean free path. similar to the case of gamma ray mfp, the addition of marble did not lead to a significant change in the fast neutron mfp. figure 9: the neutron mfp and marble concentrations. v. conclusions the following conclusions can be drawn: � the replacement of cement by marble waste powder enhances compressive strength, as replacing 10 % of cement with marble powder (cm3) led to an increase of 10 % in compressive strength with respect to the measured value for reference sample cm1. this may be attributed to the higher content of fe2o3 and cao than in the other samples, and is economically beneficial. � based on the mass attenuation coefficients, the effective atomic number, zeff , and the mean free path were calculated for the different mixtures. it was found that sample cm3, which contained 10 % marble, had better gamma and neutron shielding properties (i.e., minimum mfp and maximum effective atomic number zeff than the other mixtures. this may be due to its higher density compared to the other mixtures. � at the photon energy of 0.015 mev, where photoelectric absorption is the dominant mechanism, the ebf shows that cm3 has better gamma ray shielding properties than the other samples, while for e > 0.015 mev it is composition-independent. 120005-7 papers in physics, vol. 12, art. 120005 (2020) / a. abdel-latif m. et al. table 2: calculations of the fast neutron removal cross-section for the prepared samples. element cm1 cm2 cm3 cm4 cm5 part. dens. σrcm −1 part. dens. σrcm −1 part. dens. σrcm −1 part. dens. σrcm −1 part. dens. al 0.0422 0.0012 0.0483 0.0014 0.0502 0.0015 0.0421 0.0012 0.0370 0.0011 fe 0.0366 0.0008 0.0402 0.0009 0.0467 0.0010 0.0374 0.0008 0.0299 0.0006 ca 0.3812 0.0093 0.3731 0.0091 0.4066 0.0099 0.3371 0.0082 0.2978 0.0072 si 0.9775 0.0246 0.9820 0.0247 0.9544 0.0241 1.0109 0.0255 1.0374 0.0261 o 1.3228 0.0536 1.3368 0.0541 1.3370 0.0541 1.3471 0.0546 1.3507 0.0547 minor 0.0087 0.0002 0.0125 0.0003 0.0102 0.0003 0.0083 0.0002 0.0082 0.0002 total 2.7690 0.0897 2.7930 0.0906 2.8050 0.0908 2.7830 0.0905 2.7610 0.0900 [1] o gencel, a bozkurt, e kam, t korkut, determination and calculation of gamma and neutron shielding characteristics of concretes containing different hematite proportions, ann. nucl. energy 38, 1274 (2011). [2] a b azeez, k s mohammed, m m bakri, a hussin, a v sandu, a r razak, the effect of various waste materials’ contents on the attenuation level of anti-radiation shielding concrete, materials 2013, 6, 4836 (2013). [3] m maslehuddin, a naqvi, m ibrahim, z kalakada, radiation shielding properties of concrete with electric arc furnace slag aggregates and steel shots, ann. nucl. energy 53, 192 (2013). 
[4] n almeida, f branco, j r santos, recycling of stone slurry in industrial activities: application to concrete mixtures, build. environ. 42, 810 (2007).

[5] k e alyamaç, a a aydin, concrete properties containing fine aggregate marble powder, ksce j. civ. eng. 19, 2208 (2015).

[6] m r kumar, s k kumar, partial replacement of cement with marble dust powder, int. j. eng. res. appl. 5, 106 (2015).

[7] v corinaldesi, g moriconi, a t naik, characterization of marble powder for its use in mortar and concrete, constr. build. mater. 24, 113 (2010).

[8] i akkurt, r altindag, k gunoglu, h sarıkaya, photon attenuation coefficients of concrete including marble aggregate, ann. nucl. energy 43, 56 (2012).

[9] i akkurt, a m el-khayatt, effective atomic number and electron density of marble concrete, j. radioanal. nucl. chem. 295, 633 (2013).

[10] a a aliabdo, m abd elmoaty, e m auda, reuse of waste marble dust in the production of cement and concrete, constr. build. mater. 50, 28 (2014).

[11] a ergun, effects of the usage of diatomite and waste marble powder as partial replacement of cement on the mechanical properties of concrete, constr. build. mater. 25, 806 (2011).

[12] k vardhan, s goyal, r siddique, a m singh, mechanical properties and microstructural analysis of cement mortar incorporating marble powder as partial replacement of cement, constr. build. mater. 96, 615 (2015).

[13] h s arel, recyclability of waste marble in concrete production, j. clean. prod. 131, 179 (2016).

[14] e t tunc, recycling of marble waste: a review based on strength of concrete containing marble waste, j. environ. manage. 231, 86 (2019).

[15] a khodabakhshian, j de brito, m ghalehnovi, e a shamsabadi, mechanical, environmental and economic performance of structural concrete containing silica fume and marble industry waste powder, constr. build. mater. 169, 237 (2018).

[16] s r manohara, s m hanagodimath, l gerward, studies on effective atomic number, electron density and kerma for some fatty acids and carbohydrates, phys. med. biol. 53, n377 (2008).

[17] s r manohara, s m hanagodimath, l gerward, photon interaction and energy absorption in glass: a transparent gamma-ray shield, j. nucl. mater. 393, 465 (2009).
[18] v p singh, n m badiger, n kucuk, determination of effective atomic numbers using different methods for some low-z materials, j. nucl. chem. 2014, (2014). [19] v p singh, n m badiger, n kucuk, assessment of methods for estimation of effective atomic numbers of common human organ and tissue substitutes: waxes, plastics and polymers, radioprotection 49, 115 (2014). [20] j h hubbell, photon cross-sections, attenuation coefficients, and energy absorption coefficients from 10-kev to 100-gev, nsrds-nbs 29, (1969). [21] oak ridge national laboratory, mcnpx 2.4.0, monte carlo n-particle transport code system for multiparticles and high energy applications., radiation shielding information center, (2004). [22] b o el-bashir, m i sayyed, m h m zaid, k a matori, comprehensive study on physical, elastic and shielding properties of ternary bao-bi2o3-p2o5 glasses as a potent radiation shielding material, j. non-cryst. solids 468, 92 (2017). [23] h o tekin, t manici, simulations of mass attenuation coefficients for shielding materials using the mcnp-x code, nucl. sci. tech. 28, 95 (2017). [24] y harima, y sakamoto, s tanaka, m kawai, validity of the geometric-progression formula in approximating gamma-ray build-up factors, nucl. sci. eng. 94, 24 (1986). [25] j h hubbell, s m seltzer, x-ray mass attenuation coefficients: nist standard reference database 126, national institute of standards and technology (1996). [26] ansi, gamma-ray attenuation coefficients & buildup factors for engineering materials, american national standards institute [ansi] (1991). [27] j i wood, computational methods in reactor shielding, pergamon press, new york (1982). [28] a m el-khayatt, a el-sayed abdo, mercsfn: a program for the calculation of fast neutron removal cross-section in composite shields, ann. nucl. energy 36, 832 (2009). [29] a m el-khayatt, calculation of fast neutron removal cross-sections for some compounds and materials, ann. nucl. energy 37, 218 (2010). [30] a m el-khayatt, nxcom: a program for calculating attenuation coefficients of fast neutrons and gamma-rays, ann. nucl. energy 38, 128 (2011). [31] y elmahroug, b tellili, c souga, k manai, parshield: a computer program for calculating attenuation parameters of the gamma rays and fast neutrons, ann. nucl. energy 76, 94 (2015). [32] a el abd, g mesbah, m a nader, a ellithi, a simple method for determining the effective removal cross section for fast neutrons, j. radiat. nucl. appl. 2, 53 (2016). [33] a m madbouly, a a el-sawy, calculation of gamma and neutron parameters for some concrete materials as radiation shields for nuclear facilities, int. j. emerging trends in eng. develop. 3, 7 (2018). [34] m m kassab, s i el-kameesy, m m eissa, a abdel-latif m, a study of neutron and gamma-ray interaction properties with cobaltfree highly chromium maraging steel, j. mod. phys. 6, 1526 (2015). [35] c suteau, m chiron, an iterative method for calculating gamma-ray build-up factors in multi-layer shields, radiat. prot. dosim. 116, 489 (2005). 
papers in physics, vol. 1, art. 010001 (2009) received: 17 june 2009, accepted: 28 august 2009 edited by: s. a. cannas licence: creative commons attribution 3.0 doi: 10.4279/pip.010001 www.papersinphysics.org issn 1852-4249

correlation between asymmetric profiles in slits and standard prewetting lines

salvador a. sartarelli,1∗ leszek szybisz2−4†

the adsorption of ar on substrates of li is investigated within the framework of a density functional theory which includes an effective pair potential recently proposed. this approach yields good results for the surface tension of the liquid-vapor interface over the entire range of temperatures, t, from the triple point, tt, to the critical point, tc. the behavior of the adsorbate in the cases of a single planar wall and a slit geometry is analyzed as a function of temperature. asymmetric density profiles are found for fluid confined in a slit built up of two identical planar walls, leading to the spontaneous symmetry breaking (ssb) effect. we found that the asymmetric solutions occur even above the wetting temperature tw in a range of average densities ρ∗ssb1 ≤ ρ∗av ≤ ρ∗ssb2, which diminishes with increasing temperatures until its disappearance at the critical prewetting point tcpw.
in this way a correlation between the disappearance of the ssb effect and the end of prewetting lines observed in the adsorption on a one-wall planar substrate is established. in addition, it is shown that a value for tcpw can be precisely determined by analyzing the asymmetry coefficients. i. introduction the study of physisorption of fluids on solid substrates had led to very fascinating phenomena mainly determined by the relative strengths of ∗e-mail: asarta@ungs.edu.ar †e-mail: szybisz@tandar.cnea.gov.ar 1 instituto de desarrollo humano, universidad nacional de general sarmiento, gutierrez 1150, ra–1663 san miguel, argentina. 2 laboratorio tandar, departamento de f́ısica, comisión nacional de enerǵıa atómica, av. del libertador 8250, ra–1429 buenos aires, argentina. 3 departamento de f́ısica, facultad de ciencias exactas y naturales, universidad de buenos aires, ciudad universitaria, ra–1428 buenos aires, argentina. 4 consejo nacional de investigaciones cient́ıficas y técnicas, av. rivadavia 1917, ra–1033 buenos aires, argentina. fluid-fluid (f-f) and substrate-fluid (s-f) attractions. in the present work we shall refer to two of such features. one is the prewetting curve identified in the study of fluids adsorbed on planar surfaces above the wetting temperature tw (see, e.g., pandit, schick, and wortis [1]) and the other is the occurrence of asymmetric profiles of fluids confined in a slit of identical walls found by van leeuwen and collaborators in molecular dynamics calculations [2, 3]. it is known that for a strong substrate (i.e., when the s-f attraction dominates over the f-f one) the adsorbed film builds up continuously showing a complete wetting.in such a case, neither prewetting transitions nor spontaneous symmetry breaking (ssb) of the profiles are observed, both these phenomena appear for substrates of moderate strength. the prewetting has been widely analyzed for adsorption of quantum as well as classical fluids. a 010001-1 papers in physics, vol. 1, art. 010001 (2009) / s. a. sartarelli et al. summary of experimental data and theoretical calculations for 4he may be found in ref. [4]. studies of other fluids are mentioned in ref. [5]. these investigations indicated that prewetting is present in real systems such as 4he, h2, and inert gases adsorbed on alkali metals. on the other hand, after a recent work of berim and ruckenstein [6] there is a renewal of the interest in searching for the ssb effect in real systems. these authors utilized a density functional (df) theory to study the confinement of ar in a slit composed of two identical walls of co2 and concluded that ssb occurs in a certain domain of temperatures. in a revised analysis of this case, reported in ref. [7], we found that the conditions for the ssb were fulfilled because the authors of ref. [6] had diminished the s-f attraction by locating an extra hard-wall repulsion. however, it was found that inert gases adsorbed on alkali metals exhibit ssb. results for ne confined by such substrates were recently reported [8]. the aim of the present investigation is to study the relation between the range of temperatures where the ssb occurs and the temperature dependence of the wetting properties. in this paper we illustrate our findings describing the results for ar adsorbed on li. previous df calculations of ancilotto and toigo [9] as well as grand canonical monte carlo (gcmc) simulations carried out by curtarolo et al. [10] suggest that ar wets li at a temperature significantly below tc. 
so, this system should exhibit a large locus of the prewetting line, and this feature makes it very convenient for our study, as was already communicated during a recent workshop [11].

the paper is organized in the following way. the theoretical background is summarized in sec. ii. the results, together with their analysis, are given in sec. iii. sec. iv. is devoted to the conclusions.

ii. theoretical background

in a df theory, the helmholtz free energy fdf[ρ(r)] of an inhomogeneous fluid embedded in an external potential usf(r) is expressed as a functional of the local density ρ(r) (see, e.g., ref. [12])

fdf[ρ(r)] = νid kb t ∫ dr ρ(r) {ln[λ³ ρ(r)] − 1} + ∫ dr ρ(r) fhs[ρ̄(r); dhs] + (1/2) ∫∫ dr dr′ ρ(r) ρ(r′) φattr(|r − r′|) + ∫ dr ρ(r) usf(r) . (1)

the first term is the ideal gas free energy, where kb is the boltzmann constant and λ = √(2π ħ²/(m kb t)) is the de broglie thermal wavelength of the molecule of mass m. the quantity νid is a parameter introduced in eq. (2) of [13] (in the standard theory it is equal to unity). the second term accounts for the repulsive f-f interaction, approximated by a hard-sphere (hs) functional with a certain choice for the hs diameter dhs. in the present work we have used for fhs[ρ̄(r); dhs] the expression provided by the nonlocal df (nldf) formalism developed by kierlik and rosinberg [14] (kr), where ρ̄(r) is a properly averaged density. the third term is the attractive f-f interaction, treated in a mean field approximation (mfa). finally, the last integral represents the effect of the external potential usf(r) exerted on the fluid. in the present work, for the analysis of physisorption we adopted the ab initio potential of chizmeshya, cole, and zaremba (ccz) [15] with the parameters listed in table 1 therein.

i. effective pair attraction

the attractive part of the f-f interaction was described by an effective pair interaction devised in ref. [5], where the separation of the lennard-jones (lj) potential introduced by weeks, chandler and andersen (wca) [16] is adopted:

φ^wca_attr(r) = −ε̃ff for r ≤ rm ,
φ^wca_attr(r) = 4 ε̃ff [(σ̃ff/r)¹² − (σ̃ff/r)⁶] for r > rm . (2)

here rm = 2^(1/6) σ̃ff is the position of the lj minimum. no cutoff for the pair potential was introduced. the well depth ε̃ff and the interaction size σ̃ff are considered as free parameters because the use of the bare values εff/kb = 119.76 k and σff = 3.405 å overestimates tc. so, the complete df formalism has three adjustable parameters (namely, νid, ε̃ff, and σ̃ff), which were determined by imposing that at l-v coexistence the pressure as well as the chemical potential of the bulk l and v phases should be equal [i.e., p(ρl) = p(ρv) and µ(ρl) = µ(ρv)]. the procedure is described in ref. [5]. in practice, we set dhs = σ̃ff and imposed the coexistence data of ρl, ρv, and p(ρl) = p(ρv) = p0 for ar quoted in table x of ref. [17] to be reproduced in the entire range of temperatures t between tt = 83.78 k and tc = 150.86 k.

ii. euler-lagrange equation

the equilibrium density profile ρ(r) of the adsorbed fluid is determined by a minimization of the free energy with respect to density variations, with the constraint of a fixed number of particles n,

(δ/δρ(r)) [ fdf[ρ(r)] − µ ∫ dr ρ(r) ] = 0 . (3)

here the lagrange multiplier µ is the chemical potential of the system.
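as a concrete illustration of the wca separation in eq. (2), the short python sketch below evaluates the attractive part of the lj potential. the well depth and size used here are the bare ar values quoted in the text, whereas the paper refits the effective parameters ε̃ff and σ̃ff to the coexistence data, so these numbers are only placeholders.

```python
import numpy as np

def wca_attraction(r, eps=119.76, sigma=3.405):
    """attractive part of the lj potential in the wca separation, eq. (2).
    r and sigma in angstrom, eps in kelvin (i.e., eps/kB); the bare ar
    values are used here only as placeholders for the fitted parameters."""
    r = np.asarray(r, dtype=float)
    r_m = 2.0 ** (1.0 / 6.0) * sigma                     # position of the lj minimum
    lj = 4.0 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)
    return np.where(r <= r_m, -eps, lj)                  # flat well inside r_m

# quick check: continuous at r_m and tending to zero at large separations
r = np.array([3.0, 2.0 ** (1.0 / 6.0) * 3.405, 10.0])
print(wca_attraction(r))
```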
in the case of a planar symmetry, where the flat walls exhibit an infinite extent in the x and y directions, the profile depends only on the coordinate z perpendicular to the substrate. for this geometry, the variation of eq. (3) yields the following euler-lagrange (e-l) equation

δ[(fid + fhs)/a]/δρ(z) + ∫_0^l dz′ ρ(z′) φ̄attr(|z − z′|) + usf(z) = µ , (4)

where

δ(fid/a)/δρ(z) = νid kb t ln[λ³ ρ(z)] , (5)

and

δ(fhs/a)/δρ(z) = fhs[ρ̄(z); dhs] + ∫_0^l dz′ ρ(z′) (δfhs[ρ̄(z′); dhs]/δρ̄(z′)) (δρ̄(z′)/δρ(z)) . (6)

here fid/a and fhs/a are free energies per unit area a of one wall. l is the size of the box adopted for solving the e-l equations. the boundary conditions for the one-wall and slit systems are different and will be given below. the final e-l equation may be cast into the form

νid kb t ln[λ³ ρ(z)] + q(z) = µ , (7)

where

q(z) = fhs[ρ̄(z); dhs] + ∫_0^l dz′ ρ(z′) (δfhs[ρ̄(z′); dhs]/δρ̄(z′)) (δρ̄(z′)/δρ(z)) + ∫_0^l dz′ ρ(z′) φ̄attr(|z − z′|) + usf(z) . (8)

the number of particles ns per unit area a of the wall is

ns = n/a = ∫_0^l ρ(z) dz . (9)

in order to get solutions for ρ(z), it is useful to rewrite eq. (7) as

ρ(z) = ρ0 exp[−q(z)/(νid kb t)] , (10)

with

ρ0 = (1/λ³) exp[µ/(νid kb t)] . (11)

the relation between µ and ns is obtained by substituting eq. (10) into the constraint of eq. (9),

µ = −νid kb t ln[ (1/(ns λ³)) ∫_0^l dz exp(−q(z)/(νid kb t)) ] . (12)

when solving this kind of system, it is usual to define dimensionless variables, z∗ = z/σ̃ff for the distance and ρ∗ = ρ σ̃ff³ for the densities. in these units the box size becomes l∗ = l/σ̃ff.
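a minimal numerical sketch of how eqs. (10)-(12) are used in practice is given below (python/numpy, purely illustrative): given the current estimate of q(z) on a grid and the imposed surface density ns, it returns the chemical potential of eq. (12) and the updated profile of eq. (10). the gaussian q(z) used in the example is a placeholder, not the actual kr hard-sphere plus mean-field functional of the paper.

```python
import numpy as np

def update_profile(q, z, n_s, nu_id=1.0, kT=1.0, lam=1.0):
    """one fixed-point step of eqs. (10)-(12): from q(z) and the imposed
    surface density n_s, obtain mu (eq. 12) and the new rho(z) (eq. 10).
    all quantities are in reduced units; lam is the thermal wavelength."""
    boltz = np.exp(-q / (nu_id * kT))                        # exp[-q(z)/(nu_id kB T)]
    integral = np.trapz(boltz, z)                            # int_0^L dz exp(...)
    mu = -nu_id * kT * np.log(integral / (n_s * lam**3))     # eq. (12)
    rho0 = np.exp(mu / (nu_id * kT)) / lam**3                # eq. (11)
    rho = rho0 * boltz                                       # eq. (10)
    return mu, rho

# illustrative placeholder: a smooth attractive well near one wall
z = np.linspace(0.0, 40.0, 4001)
q_placeholder = -2.0 * np.exp(-((z - 2.0) / 1.5) ** 2)
mu, rho = update_profile(q_placeholder, z, n_s=1.0)
print(mu, np.trapz(rho, z))   # the integral of rho reproduces n_s by construction
```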
iii. results and analysis

in order to quantitatively study the adsorption of fluids within any theoretical approach, one must require the experimental surface tension of the bulk liquid-vapor interface, γlv, to be reproduced satisfactorily over the entire tt ≤ t ≤ tc temperature range. therefore, we shall first examine the prediction for this observable before studying the adsorption phenomena.

i. surface tension of the bulk liquid-vapor interface

figure 1 shows the experimental data of γlv taken from table ii of ref. [18]. in order to theoretically evaluate this quantity, the e-l equations for free slabs of ar, i.e. setting

usf(z) = 0 , (13)

were solved imposing periodic boundary conditions ρ(z = 0) = ρ(z = l). at a given temperature t, for a sufficiently large system one must obtain a wide central region with ρ(z ≈ l/2) = ρl(t) and tails with density ρv(t), where the values of ρl(t) and ρv(t) should be those of the liquid-vapor coexistence curve. the surface tension of the liquid-vapor interface is calculated according to the thermodynamic definition

γlv = (ω + p0 v)/a = ω/a + p0 l , (14)

where ω = fdf − µ n is the grand potential of the system and p0 the pressure at liquid-vapor coexistence previously introduced. we solved a box with l∗ = 40. the obtained results are plotted in fig. 1 together with the prediction of the fluctuation theory of critical phenomena, γlv = γ⁰lv (1 − t/tc)^1.26 with γ⁰lv = 17.4 k/å² (see, e.g., [19]). one may realize that our values are in satisfactory agreement with the experimental data and the renormalization theory over the entire range of temperatures tt ≤ t ≤ tc, showing a small deviation near tt.

figure 1: surface tension of ar as a function of temperature. squares are experimental data taken from table ii of ref. [18]. the solid curve corresponds to the fluctuation theory of critical phenomena and the circles are present df results.

ii. adsorption on one planar wall

it is assumed that the physisorption of ar on a one-wall substrate of li is driven by the ccz potential, i.e.,

usf(z) = uccz(z) . (15)

the e-l equations were solved in a box of size l∗ = 40 by imposing ρ(z > l) = ρ(z = l). the solution gives a density profile ρ(z) and the corresponding chemical potential µ. adsorption isotherms at a given temperature were calculated as a function of the excess surface density. this quantity, also termed coverage, is often expressed in nominal layers ℓ,

γℓ = (1/ρl^(2/3)) ∫_0^∞ dz [ρ(z) − ρb] , (16)

where ρb = ρ(z → ∞) is the asymptotic bulk density and ρl the liquid density at saturation for a given temperature. by utilizing the results for µ obtained from the e-l equation and the value µ0 corresponding to saturation at a given temperature t, the difference ∆µ = µ − µ0 was evaluated.

figure 2 shows the adsorption isotherms for temperatures above tw, where an equal-area maxwell construction is feasible. this is just the prewetting region, characterized by a jump in coverage γℓ. the size of this jump depends on temperature. the largest jump occurs at tw and diminishes for increasing t until its disappearance at tcpw. density profiles just below and above the coverage jump for t = 114 k are displayed in fig. 3; in that case γℓ jumps from 0.5 to 3.6. therefore, the formation of the fourth layer may be observed in the plot.

figure 2: adsorption isotherms for the ar/li system, i.e., ∆µ as a function of coverage γℓ. up-triangles correspond to t = 119 k; circles to t = 118 k; diamonds to t = 117 k; squares to t = 116 k; down-triangles to t = 114 k and stars to t = 112 k.

figure 3: examples of density profiles of ar adsorbed on a surface of li at t = 114 k displayed as a function of the distance from the wall located at z∗ = 0. dashed curves are profiles for γℓ below the coverage jump, while solid curves are stable films above this jump.

the wetting temperature tw can be obtained from the analysis of the values of ∆µ/kb at which the jump in coverage occurs at each considered temperature. the behavior of ∆µpw/kb vs t is displayed in fig. 4. a useful form for determining the temperature tw was derived from thermodynamic arguments [20],

∆µpw(t) = µpw(t) − µ0(t) = apw (t − tw)^(3/2) . (17)

here apw is a model parameter and the exponent 3/2 is fixed by the power of the van der waals tail of the adsorption potential, usf(z) ≈ −c3/z³. the fit of the data of ∆µ/kb to eq. (17) yielded tw = 110.1 k and apw/kb = −0.16 k^(−1/2). on the other hand, according to fig. 2, the critical prewetting point tcpw lies between t = 118 and 119 k; at the latter temperature, the film already presents a continuous growth.

figure 4: prewetting line for ar adsorbed on li. the solid curve is the fit to eq. (17) and reaches the ∆µpw/kb = 0 line at tw = 110.1 k.

our values of tw and tcpw are smaller than those obtained from prior df calculations [9] (tw = 123 k and tcpw ≈ 130 k) and gcmc simulations [10] (tw = 130 k). the difference with the df evaluation of ref. [9] is due to the use of different effective pair potentials, as we explain in ref. [5], where the adsorption of ne is studied: the present approach gives a reasonable γlv, while that of ref. [9] fails dramatically close to tt. the difference with the gcmc results cannot be interpreted in a straightforward way.
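the extraction of tw from eq. (17) amounts to a one-dimensional least-squares fit of a 3/2 power law. a possible sketch is given below (python with scipy); the synthetic input points are generated from the fitted values quoted in the text (apw/kb = −0.16 k^(−1/2), tw = 110.1 k) purely to make the example self-contained, and are not the paper's actual data.

```python
import numpy as np
from scipy.optimize import curve_fit

def prewetting_law(T, a_pw, T_w):
    """eq. (17): delta_mu_pw / kB = a_pw * (T - T_w)**1.5, valid for T >= T_w."""
    return a_pw * np.clip(T - T_w, 0.0, None) ** 1.5

# synthetic, noise-free example points (K); the real input would be the
# temperatures of the coverage jump and the corresponding delta_mu/kB values.
T_data = np.array([112.0, 114.0, 115.0, 116.0, 117.0, 118.0])
dmu_data = -0.16 * (T_data - 110.1) ** 1.5

popt, _ = curve_fit(prewetting_law, T_data, dmu_data, p0=(-0.1, 109.0))
a_pw_fit, T_w_fit = popt
print(a_pw_fit, T_w_fit)   # recovers the parameters used to generate the data
```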
iii. confinement in a planar slit

in the slit geometry, where the ar atoms are confined by two identical walls of li, the s-f potential becomes

usf(z) = uccz(z) + uccz(l − z) . (18)

the walls were located at a distance l∗ = 40; this width guarantees that the pair interaction between two atoms located at different walls is negligible. in fact, this width is wider than l∗ = 29.1, which was utilized in the pioneering molecular dynamics calculations [2, 3]. accordingly, the e-l equations were solved in a box of size l∗ = 40. in this geometry, the repulsion at the walls causes the profiles ρ(z = 0) and ρ(z = l) to be equal to zero. the solutions were obtained at a fixed dimensionless average density defined in terms of n, a, and l as ρ∗av = n σ̃ff³/(a l) = n∗s/l∗.

figure 5: free energy per particle (in units of kb t) for ar confined in a slit of li with l∗ = 40 at t = 115 k displayed as a function of the average density. the curve labeled by circles corresponds to symmetric solutions, while that labeled by triangles corresponds to asymmetric ones. the ssb occurs in a certain range of average density ρ∗ssb1 ≤ ρ∗av ≤ ρ∗ssb2.

for temperatures below tw = 110.1 k, we obtained large ranges of ρ∗av where the asymmetric solutions exhibit a lower free energy than the corresponding symmetric ones. although there is a general idea that a connection exists between the ssb effect and nonwetting, we have found, by contrast, that the ssb behavior extends above the wetting temperature. furthermore, we have also found a relation between prewetting and ssb. figure 5 shows the free energy per particle, fdf/n, for both symmetric and asymmetric solutions for the ar/li system at t = 115 k > tw as a function of the average density. according to this picture, the ground state (g-s) exhibits asymmetric profiles between a lower and an upper limit, ρ∗ssb1 = 0.057 ≤ ρ∗av ≤ ρ∗ssb2 = 0.192. out of this range no asymmetric solutions were obtained from the set of eqs. (7)-(12). similar features were obtained for higher temperatures up to t = 118 k; above this value the profiles corresponding to the g-s are always symmetric.

figure 6 shows three examples of solutions determined at t = 115 k. the result labeled 1 is a small asymmetric profile, while that labeled 2 is the largest asymmetric solution at this temperature. so, by further increasing ρ∗av, the ssb effect disappears and the g-s becomes symmetric, as indicated by the curve labeled 3. when the asymmetric profiles occur, the situation is denoted as partial (or one-wall) wetting. the symmetric solutions account for a complete (two-wall) wetting. these different situations can be interpreted in terms of the balance of the γsl, γsv and γlv surface tensions, carefully discussed in previous works [2, 3, 7]. here we shall restrict ourselves to briefly outline the main features. when the liquid is adsorbed symmetrically, like in the case of profile 3 in fig. 6, there are two s-l and two l-v interfaces. hence, the total surface excess energy may be written as

γ^sym_tot = 2 γsl + 2 γlv . (19)

on the other hand, for an asymmetric profile γ^asy_tot becomes

γ^asy_tot = γsl + γlv + γsv . (20)

the three quantities on the r.h.s. of this equation are related by young's law (see, e.g., eq. (2.1) in ref. [21])

γsv = γsl + γlv cos θ , (21)

where θ is the contact angle, defined as the angle between the wall and the interface between the liquid and the vapor (see fig. 1 in ref. [21]). by using young's law, eq. (20) may be rewritten as

γ^asy_tot = 2 γsl + γlv (1 + cos θ) , (22)

with cos θ = (γsv − γsl)/γlv < 1. if one changes γsl by increasing ns enough (as shown in fig. 5), and/or t, and/or the strength of usf(z), eventually the equality γsv − γsl = γlv may be reached, yielding cos θ = 1. then, the system would undergo a transition to a symmetric profile where both walls of the slit are wet.
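a tiny numerical illustration of eqs. (19)-(22) is sketched below (python, with arbitrary placeholder surface tensions rather than values computed in the paper): the asymmetric configuration is favoured as long as cos θ < 1, and the two total surface energies coincide exactly at the wetting condition γsv − γsl = γlv.

```python
def surface_energies(gamma_sl, gamma_lv, gamma_sv):
    """total surface excess energies of eqs. (19) and (22); young's law,
    eq. (21), fixes cos(theta) = (gamma_sv - gamma_sl) / gamma_lv."""
    cos_theta = (gamma_sv - gamma_sl) / gamma_lv
    g_sym = 2.0 * gamma_sl + 2.0 * gamma_lv                 # eq. (19)
    g_asy = 2.0 * gamma_sl + gamma_lv * (1.0 + cos_theta)   # eq. (22)
    return g_sym, g_asy, cos_theta

# placeholder numbers only: partial wetting (cos_theta < 1) favours the
# asymmetric profile, g_asy < g_sym; the two coincide when cos_theta = 1.
print(surface_energies(gamma_sl=0.5, gamma_lv=1.0, gamma_sv=1.2))
```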
figure 6: density profiles of ar confined in a slit of li with l∗ = 40 at t = 115 k. the displayed profiles denoted by 1, 2 and 3 correspond to average densities ρ∗av = 0.074, 0.192 and 0.218, respectively.

it is important to remark that, indeed, there are two degenerate asymmetric solutions. besides the one shown in fig. 6, where the profiles exhibit the thicker film adsorbed on the left wall (left asymmetric solutions, las), there is an asymmetric solution with exactly the same free energy but where the thicker film is located near the right wall (right asymmetric solutions, ras). the asymmetry of the density profiles may be measured by the quantity

∆n = (1/ns) ∫_0^(l/2) dz [ρ(z) − ρ(l − z)] . (23)

according to this definition, if the profile is completely asymmetrical about the middle of the slit, i.e. for (i) ρ(z < l/2) ≠ 0 and ρ(z ≥ l/2) = 0, or (ii) ρ(z < l/2) = 0 and ρ(z ≥ l/2) ≠ 0, this quantity becomes +1 or −1, respectively, while for symmetric solutions it vanishes.

we evaluated the asymmetry coefficients of solutions obtained for increasing temperatures up to t = 118 k. the results for las profiles at temperatures larger than tw are displayed in fig. 7 as a function of the average density. one may observe how the range ρ∗ssb1 ≤ ρ∗av ≤ ρ∗ssb2 diminishes under increasing temperatures. the ssb effect persists at most for the critical ρ∗av(crit) = (17/24) σ̃ff² × 10⁻² ≈ 0.074, with σ̃ff expressed in å.

figure 7: asymmetry parameter for ar confined by two li walls separated by a distance of l∗ = 40 as a function of average density. from outside to inside, the curves correspond to temperatures t = 112, 114, 115, 116, 117 and 118 k. the asymmetric solutions occur for different ranges ρ∗ssb1 ≤ ρ∗av ≤ ρ∗ssb2.

we shall demonstrate that by analyzing the data of ∆n for ρ∗av(crit) it is possible to determine the critical prewetting point. figure 8 shows these values for both the las and ras profiles, calculated at different temperatures, suggesting a rather parabolic shape. so, we propose a fit to the following quartic polynomial,

t = tcpw + a2 ∆n² + a4 ∆n⁴ . (24)

this procedure yielded tcpw = 118.4 k, a2 = −14.14 k, and a4 = −16.63 k. the obtained value of tcpw is in agreement with the limits established when analyzing the adsorption isotherms of the one-wall systems displayed in fig. 2. these results indicate that the disappearance of the ssb effect coincides with the end of the prewetting line.

figure 8: circles stand for both branches of the asymmetry parameter for ar confined in an l∗ = 40 slit of li walls for temperatures between tw and tcpw. the solid curve is the fit to eq. (24) used to determine tcpw.

iv. conclusions

we have performed a consistent study, within the same df approach, of free slabs of ar, the adsorption of these atoms on a single planar wall of li, and their confinement in slits of this alkali metal. good results were obtained for the surface tension of the liquid-vapor interface.
the analysis of the physisorption on a planar surface indicates that ar wets surfaces of li in agreement with previous investigations. the isotherms for the adsorption on one planar wall exhibit a locus of prewetting in the µ − t plane. a fit of such data yielded a wetting temperature tw = 110.1 k. in addition, these isotherms also show that the critical prewetting point tcpw lies between t = 118 and 119 k. these results for tw and tcpw are slightly below the values obtained in refs. [9, 10], the discrepancy is discussed in the text. on the other hand, this investigation shows that the profiles of ar confined in a slit of li present ssb. this effect occurs in a certain range of average densities ρ∗ssb1 ≤ ρ ∗ av ≤ ρ∗ssb2, which diminishes for increasing temperatures. the main output of this work is the finding that above the wetting temperature the ssb occurs until tcpw is reached. to the best of our knowledge this is the first time that such a correlation is reported. furthermore, it is shown that by examining the evolution of the asymmetry coefficient one can precisely determine tcpw. the obtained value tcpw = 118.4 k lies in the interval established when analyzing the adsorption on a single wall. acknowledgements this work was supported in part by the grants pict 31980/5 from agencia nacional de promoción cient́ıfica y tecnológica, and x099 from universidad de buenos aires, argentina. [1] r pandit, m schick, m wortis, systematics of multilayer adsorption phenomena on attractive substrates phys. rev. b 26, 5112 (1982). [2] j h sikkenk, j o indekeu, j m j van leeuwen, e o vossnack, molecular-dynamics simulation of wetting and drying at solid-fluid interfaces phys. rev. lett. 59, 98 (1987). [3] m j p nijmeijer, c bruin, a f bakker, j m j van leeuwen, wetting and drying of an inert wall by a fluid in a molecular-dynamics simulation, phys. rev. a 42, 6052 (1990). [4] l szybisz, adsorption of superfluid 4he films on planar heavy-alkali metals studied with the orsay-trento density functional, phys. rev. b 67, 132505 (2003). [5] s a sartarelli, l szybisz, i urrutia, adsorption of ne on alkali surfaces studied with a density functional theory, phys. rev. e 79, 011603 (2009). [6] g o berim, e ruckenstein, symmetry breaking of the fluid density profiles in closed nanoslits, j. chem. phys. 126, 124503 (2007). [7] l szybisz, s a sartarelli, density profiles of ar adsorbed in slits of co2: spontaneous symmetry breaking revisited, j. chem. phys. 128, 124702 (2008). 010001-8 papers in physics, vol. 1, art. 010001 (2009) / s. a. sartarelli et al. [8] s a sartarelli, l szybisz, i urrutia, spontaneous symmetry breaking and first-order phase transitions of adsorbed fluids, int. j. bifurcation chaos (in press). [9] f ancilotto, f toigo, prewetting transitions of ar and ne on alkali-metal surfaces surface, phys. rev. b 60, 9019 (1999). [10] s curtarolo, g stan, m j bojan, m w cole, w a steele, threshold criterion for wetting at the triple point, phys. rev. e 61, 1670 (2000). [11] l szybisz and s a sartarelli, adsorción de gases nobles sobre sustratos planos de metales alcalinos, communication at the workshop trefemac09 held at the univerisidad nacional de la pampa, santa rosa, argentina, may 4-6 (2009). [12] p i ravikovitch, a vishnyakov, a v neimark, density functional theories and molecular simulations of adsorption and phase transitions in nanopores, phys. rev. e 64, 011602 (2001). [13] f ancilotto, s curtarolo, f toigo, m w cole, evidence concerning drying behavior of ne near a ce surface, phys. rev. 
lett. 87, 206103 (2001). [14] e kierlik, m l rosinberg, free-energy density functional for the inhomogeneous hardsphere fluid: application to interfacial adsorption, phys. rev. a 42, 3382 (1990). [15] a chizmeshya, m w cole, e. zaremba, weak biding potentials and wetting transitions, j. low temp. phys. 110, 677 (1998). [16] j d weeks, d chandler, h c andersen, role of repulsive forces in determining the equilibrium structure of simple fluids, j. chem. phys. 54, 5237 (1971). [17] v a rabinovich, a a vasserman, v i nedostup, l s veksler, thermophysical properties of neon, argon, krypton and xenon, hemisphere, washington dc (1988). [18] s-t wu, g-s yan, surface tensions of simple liquids, j. chem. phys. 77, 5799 (1982). [19] j vrabec, g k kedia, g fuchs, h hasse, vapour-liquid coexistence of the truncated and shifted lennard-jones fluid, mol. phys. 104, 1509 (2006). [20] e cheng, g mistura, h c lee, m h w chan, m w cole, c carraro, w f saam, f toigo, wetting transitions of liquid hydrogen films, phys. rev. lett. 70, 1854 (1993). [21] p g de gennes, wetting: statics and dynamics, rev. mod. phys. 57, 827 (1985). 010001-9 papers in physics, vol. 14, art. 140013 (2022) received: 11 august 2021, accepted: 05 august 2022 edited by: g. nicora reviewed by: j. tocho, universidad nacional de la plata, argentina d. g. perez, pontificia universidad católica de valparáıso, chile licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.140013 www.papersinphysics.org issn 1852-4249 observation of atmospheric scintillation during the 2020 total eclipse in northern patagonia christian t. schmiegelow1∗, mart́ın drechsler1, lautaro e. filgueira1, nicolás a. nuñez barreto1 and franco meconi2 during the december 2020 total eclipse we registered time resolved light measurements of incident solar light with vis-nir photodiodes in northern patagonia. signals compatible with the observation of shadow bands in the 200 s before and after totality were observed. a strong increase in the normalized noise spectral densities of recorded incident radiation near totality suggests the presence of shadow bands. originally high-altitude balloon measurements and spatial correlation ground measurements were planned, but the harsh climate conditions limited the campaign’s results. i introduction total solar eclipses have long fascinated, intrigued and haunted humanity. these events, being so rare, are difficult to record and study. moreover, uncontrollable meteorological conditions have a long-standing tendency of spoiling planned groundbased observations. a quite common, but sometimes elusive, phenomenon called shadow bands occurs near the totality of solar eclipses. minutes before and after totality, bands of randomly moving shadows may appear projected over the earth’s surface. over the past century many efforts were made to model [1–5] and to measure [6–11] these bands, and there are still open questions. the most accepted explanation for shadow bands bases its analysis on different versions of atmospheric scintillation theories. for example, an early model based on inversion layers in the lower atmosphere was used by gaviola to predict band size as ∗ schmiegelow@df.uba.ar 1 departamento de f́ısica e instituto de f́ısica de buenos aires, fceyn-uba y conicet, argentina. 2 “terraza al cosmos”, balboa 334, caba, argentina. a function of the height of these layers [1]. more recent studies include a formal description of turbulence. 
commonly, the models used are based on analysing propagation of light from an extended thin source (the sun almost covered by the moon) through air layers with different indices of refraction, with changing velocities and directions [2–4]. shadow bands have typically been observed in the minute or two before and after totality, with characteristic wavelengths of a few centimeters, moving chaotically at average velocities in the low m/s range. two principal observation methods have been implemented: video/photography and time resolved photovoltaic measurements of the light intensity at one or various correlated positions. if shadow bands are indeed an atmospheric phenomenon, then they should not be observed at high altitudes above the troposphere. these altitudes are easily reached by low cost high-altitude balloons. indeed, in several eclipses measurements were carried out from such balloons and from equivalent ground stations. the correlation between these measurements can help prove or debunk different proposed mechanisms and underlying models. 140013-1 papers in physics, vol. 14, art. 140013 (2022) / schmiegelow et al. during the 2017 total eclipse observed in north america, such a correlated measurement was carried out by a team from the university of pittsburgh [5]. there, a clear 4.5 hz signal was observed during approximately 2 minutes before and after totality on all detectors, both ground and airborne (at a height of ≈ 25 km). the omnipresence of this signal may suggest that the origin of shadow bands is not exclusively due to atmospheric scintillations. we designed a campaign to try and answer some of these questions during the 2020 total eclipse in northern patagonia, spanning argentina and chile. unfortunately, the campaign was strongly threatened by the weather conditions. sparse clouds and high speed wind gusts over 60 km/h ruined some of the ground-based detectors and crashed the high-altitude balloon. however, with three working ground-based detectors, we were able to measure the characteristic chaotic spectrum. we did not detect any coherent signal at 4.5 hz. also, we were able to determine that the relative amplitude of the chaotic signal grows notoriously (at least ten times) in the minutes before and after totality, showing it is indeed an effect of the eclipse. ii campaign and objectives a campaign was designed to observe and analyse shadow bands during the 2020 eclipse in patagonia. time-resolved light intensity measurements, with both ground-based as well as airborne detectors, were planned. due to climate inclemency, only some of the objectives were met. in this section, we describe the planned campaign, its objectives and the difficulties encountered. on the 14th of december 2020, a total solar eclipse was to be seen across northern patagonia at noon. we chose the city of valcheta as base of observation site. this location was favorable as it was situated in the totality region, with a totality time of over two minutes, and the city administration was very enthusiastic and supportive to receive and host scientists during this event. two missions were planned. one aimed at comparing the differences between groundand high-altitude measurements, and another to test coherence at distances above 10 cm. for the mission involving the high-altitude balloon, two identical detectors were built (see next section for details). 
one of the detectors was placed in the balloon payload, while the other was placed 1 m over ground, near the city of valcheta (40◦ 41’ 12.7” s, 66◦ 9’ 43.3” w). the balloon was set to fly from the vicinity of the city of sierra colorada (40◦ 33’ 59.1”, s 67◦ 51’ 29.3” w) which is roughly 140 km west of valcheta. this election was made such that, at totality, the balloon would be flying roughly over valcheta. the launch location was also chosen so that, when considering the winds during ascent and descent, the payload would land on firm land and not at sea, which is a demanding requirement in windy patagonia. the harsh weather conditions with gusts of over 60 km/h made one balloon burst before flight, and a spare one broke while ascending. this impeded the high-altitude measurements. fortunately, the detector was recovered before the eclipse and was used as a ground station in sierra colorada (detector pdb). while windy, the sky over sierra colorada was clear, so this detector recorded the whole eclipse from 20 minutes before and up to 20 minutes after with no appreciable problems. meanwhile, in valcheta, where the original ground station (detector pda) was placed, the sky was partially clouded, only clearing up somewhat during a short minute or two before totality, and 10 minutes later. data from these two detectors was recovered successfully and is presented in the following sections, see fig. 1. at the sierra colorada location the total eclipse lasted 1 min 50.4 s; it occurred at an azimuth of 9.2◦ and was observed at an umbral depth of 47.88% (21.6 km), the path of totality having a width of 90.1 km. the moon-sun size ratio was 1.025 and the umbral velocity was 0.67 km/s. correspondingly, at the valcheta location, the umbral depth was 96.59% (43.4 km), with totality occurring at an azimuth of 1.6◦ and lasting 2 min 9 s. this data, which was used to plan the mission, was collected from the very complete web page on eclipses by x.m. jubier [12]. a second, ground-based experiment was aimed at studying the spatial coherence of shadow bands beyond a few centimeters. a single acquisition system with two synchronized detectors was built for this purpose. the detectors (pdc and pdd) were placed 3 meters apart and 1 meter above the ground near the city of valcheta. these detectors were also shadowed by the clouds before and after totality. the windy conditions partially disconnected 140013-2 papers in physics, vol. 14, art. 140013 (2022) / schmiegelow et al. the ground cable of one of the detectors (pdd) rendering its data useless. the data from the remaining detector (pdc) was recovered but is not presented in the following sections, since its behavior is similar to pda. all data recorded is publicly available1. iii hardware detection units consisted of a non-polarized photodiode with a trans-impedance amplifier, an analogto-digital converter (adc) and a data storage system. time tagging and storage was handled by an arduino mega microcontroller board. all systems were powered by two 4.2 v lithium batteries in series, with an autonomy of more than 24 hours. all peripherals were embedded into the arduino board via a shield, specifically designed for this purpose, and were supplied from the regulated 5 v source of the main board. two identical units with only one photodiode were produced. one was meant to fly on the balloon (pdb), the other to stay on ground (pda). 
for these units we used a pair of hamamatsu s 1226-bk photodiodes, which were kindly provided by our colleagues at the university of pittsburgh to match what they used for the 2017 observation [5]. these photodiodes were directly soldered to the shield board. a third unit, with two photodiodes, was also produced to measure time-correlated signals at separate detectors. for these units we used two fast photodiodes, model vpw24r. the detectors were placed 2 m apart from each other and connected to the board by a 1 m long bipolar telephone cable.

the trans-impedance amplifier was designed to work optimally with a positive power supply and for the light levels expected at ≈ ±15 min from totality. for this purpose an lm354 operational amplifier was chosen. to avoid problems at low light levels, a 90 mv bias voltage was applied to the non-inverting terminal of the amplifier such that the output of the amplifier is never near the ground rail [13]. the negative feedback network was composed of a 50 kω resistor in parallel with a 100 pf capacitor, giving a cutoff frequency of 31 khz. the gain was chosen so that the amplifier would roughly desaturate at the light level expected ≈ 15 minutes before totality.

as an adc we used an ads1115 chip, which has 16 bit resolution and was set to work at a sampling frequency of 250 hz. timing of measurements was handled independently by the adc chip, which has an internal oscillator that handles timing when running in continuous mode. time stamping of the measured readings was performed by the micro-controller with microsecond resolution following the system clock. as an extra absolute time reference, the readings of a real time clock with second resolution were recorded. initially a higher sampling frequency was chosen (the chip's maximum is 860 hz). however, at high sampling rates we found the micro-controller would occasionally miss reading some samples; this happened when it was busy performing other tasks such as saving data to memory. at 250 hz we found almost no missed measurements and a low jitter in recorded times. the unit with two detectors (pdc and pdd) worked with two adcs, each at a sampling frequency of 125 hz, to ensure stable reading of both. in all cases, the programmable gain amplifier of the adc was set to unity, which gives a range of ± 4.096 v and a resolution of 125 µv. although this amplifier could have been dynamically adjusted to have better resolution at low light levels, we opted for a more robust approach, which proved to work well.

the code used for the acquisition on the arduino mega board as well as the schematics of the circuits have been publicly released2. future measurements should consider the following modifications or upgrades to the system: optimizing code or changing the microprocessor to allow for higher sampling rates, and matching the sampling rate with the filter of the trans-impedance amplifier to avoid aliasing.

1 http://users.df.uba.ar/schmiegelow/eclipse2020
2 https://code.df.uba.ar/schmiegelow/photodiode-data-logger

iv results

i time series

we begin the analysis of the recorded data by identifying characteristic behavior in the recorded time series of light intensity. fig. 1 shows the recovered values from the two relevant photodiodes, pda and pdb.

figure 1: time series of the photodiodes signals. (a) whole time series of the balloon photodiode pdb (top) and a ground photodiode pda (bottom); the grey shade marks the totality of the eclipse. (b) zoom of one of the moments when a cloud passed above the ground photodiodes, highlighted in green in the main plot. (c) three zooms of three different moments of each signal, highlighted in red in the main plot.
in column a) we show a global view of the data recorded from ≈ 15 min before totality to ≈ 15 minutes after. the top plot (pdb) corresponds to the detector near sierra colorada, where there were no clouds. one sees the expected progressive light intensity reduction until totality, and then a smooth uncovering of the sun. additionally, we see some short peaks, which we attribute mainly to dust and dirt flying by due to the hard winds. the second row shows the results from the detector near valcheta (pda). there, intermittent cloud coverage hindered most of the observation except for the few minutes before totality, when the clouds partially cleared out. cloud coverage can be seen as sudden drops of the measured light intensity. a detail of one of these events is shown in column b), which corresponds to the area shaded green in column a).

it is interesting to look at the fluctuations of the intensity, because these carry the information on the shadow bands. to do so, we start by observing the time resolved fluctuations within 1 second at different times t0 from totality. such details are shown in column c) of fig. 1 for t0 = {0, 97, 200} s. all detectors show a clear reduction in absolute noise amplitude approaching totality. in both detectors pda and pdb we observe that during totality the noise is at the level of the digitisation step (125 µv). a detailed study of the time dependence of the noise densities is presented in the next section.

ii spectrograms and noise spectral density

we analyze the noise spectral density of the data acquired by the photodiodes at different time frames, in a fashion similar to previous work [5, 6]. to produce reliable and informative spectrograms, we normalize the signal with the following procedure: the time series of each photodiode is divided into chunks of 2048 points (this way, all chunks represent a time interval of ≈ 9 s). then, each chunk is fitted by a linear function, which is used to normalize that group of data. by computing the discrete fourier transform and taking the square of its absolute value, the noise spectral density of each chunk is obtained. in fig. 2a-b we show the corresponding non-normalized and normalized spectrograms for the pdb data. the normalization plays two roles: 1) it removes the constant frequency component at ≈ 0.3 hz, which is produced by the constant change in light intensity and is not associated with shadow bands, and 2) it rescales the noise relative to the overall light intensity near totality. as seen in fig. 2b, the normalized data show a strong increase of noise spectral density in the ≈ 200 s before and after totality, with a sharp drop when the sun is fully covered.

figure 2: spectrograms and power spectra at three different moments of the eclipse. a) spectrogram of the pdb signal, without any normalization. b) spectrogram of pdb, with the signal normalized. c) normalized power spectra in log-log scale calculated for n = 2048 points of pda (blue) and pdb (black) at 550 s, 150 s and 0 s before totality. linear fits to the data are shown (red dashed), used to calculate the exponents for noise decay as in fig. 3b. the noise power is calculated for each normalized chunk by computing the discrete fourier transform and taking the square of its absolute value.
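the chunked, detrended power-spectrum procedure described above can be sketched in a few lines of python/numpy. this is an illustrative reconstruction of the steps as described (chunking into 2048 samples, linear detrending/normalisation, fft, and a power-law fit over 1-100 hz), not the authors' released acquisition or analysis code, and the division by the linear trend is one possible reading of the normalisation step.

```python
import numpy as np

def chunk_noise_spectrum(signal, fs=250.0, n=2048):
    """split a light-intensity time series into chunks of n samples,
    normalize each chunk by a linear fit, and return the one-sided
    noise power spectra |fft|^2 together with the frequency axis."""
    n_chunks = len(signal) // n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectra = []
    t = np.arange(n)
    for k in range(n_chunks):
        chunk = signal[k * n:(k + 1) * n]
        trend = np.polyval(np.polyfit(t, chunk, 1), t)    # linear fit of the chunk
        normalized = chunk / trend - 1.0                  # relative fluctuations
        spectra.append(np.abs(np.fft.rfft(normalized)) ** 2)
    return freqs, np.array(spectra)

def power_law_fit(freqs, spectrum, fmin=1.0, fmax=100.0):
    """fit spectrum ~ A * f**alpha over [fmin, fmax] in log-log scale;
    returns the exponent alpha and the noise density at 1 hz."""
    sel = (freqs >= fmin) & (freqs <= fmax)
    alpha, log_a = np.polyfit(np.log10(freqs[sel]), np.log10(spectrum[sel]), 1)
    return alpha, 10.0 ** log_a       # value of the fit at 1 hz

# illustrative use on synthetic data (a noisy, slowly dimming signal):
rng = np.random.default_rng(0)
fake = 3.0e4 * np.linspace(1.0, 0.5, 250 * 600) + rng.normal(0.0, 5.0, 250 * 600)
freqs, spectra = chunk_noise_spectrum(fake)
print(power_law_fit(freqs, spectra[0]))
```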
this noise increase and sudden drop is the strongest evidence that the observed signal is a direct and unique consequence of the eclipse.

to quantify the change in noise figures, we fit the spectra of each chunk with a power function. we find that all spectra are well represented by a simple power function, from which we can extract two characteristic numbers: the exponent and the power spectral density at a given frequency (we choose 1 hz as a reference). a snippet of such fits is seen in fig. 2c, where the noise spectral density is plotted in log-log scale for three characteristic moments of the normalized signal of both photodiodes: one before the shadow bands appear (t = −550 s), one while the phenomenon is occurring (t = −150 s), and the last one during totality (t = 0 s). comparing the first two cases, we see a clear increase of the noise intensity from around 10⁻⁸ to above 10⁻⁷ at 1 hz. during totality, or during cloudy moments in pda, the noise exponent drops to zero and the noise intensity drops a few orders of magnitude. this is consistent with the presence of shadow bands before and after totality, dominating the noise spectra during these time frames, while electronic noise appears to be the leading source of noise during totality.

we repeat this process for all chunks and plot both the exponents and relative powers as a function of time. this analysis is condensed in fig. 3 for all data sets. as a reference, the first row a) shows the time-amplitude dependence. the noise exponents as a function of time are shown in fig. 3b. we see a characteristic exponent between −1 and −1.5 for pdb, and between −1.5 and −1.8 for pda, which is observed at all moments except when there is strong coverage by clouds or during totality, when the exponent tends to zero. we attribute the difference in exponents between detectors to a slight hardware difference, as we recorded compatible exponents for each detector on calibration runs with direct sunlight. apart from this technical difference, detector pdb shows an interesting behavior approaching totality: the exponent seems to converge from a value in the vicinity of −1 away from totality to the lower −1.5 value near totality. also, this behavior seems to be antisymmetric. we do not have a clear explanation for this. background checks with both detectors under direct sunlight and on cloudy days showed fairly reproducible exponents, with values compatible with each photodiode. finally, we note that the exact value of this exponent depended on the frequency fit range and could vary between −1.0(1) and −1.3(1) for pdb, and between −1.6(1) and −1.8(1) for pda, when varying the lower limit of the fit range from 1 to 10 hz. for all the analysis here we used a frequency range of 1-100 hz.

we now discuss the behavior of the amplitude of the noise spectral density as a function of time. these results are shown in fig. 3c. as a reference amplitude, we show the noise density at 1 hz, which is extracted from the fit to the individual normalized spectra.

figure 3: spectral properties of the measured time series. a) the time series for each photodiode. each signal is normalized in chunks of n = 2048 points, and the noise exponent of the power spectrum for each of these chunks is plotted in b). the exponents are calculated by performing linear fits of each chunk in log-log scale. the value of this fit at 1 hz is shown in c), giving the noise at that frequency as a function of time.
the clearest result is seen for detector pdb, where the absence of clouds allowed for a clean signal, and the increase in noise near totality can be seen. during totality, the noise amplitude drops to values compatible with zero. this increase and drop can also be guessed at in the moments before totality for detector pda; however, in this case, the intermittent presence of clouds made the reading unreliable.

to confirm that the increase in noise spectral density is eclipse related, we tested our devices under similar lighting conditions. we placed the detectors under different sunlight conditions and controlled the light intensity over the detector using a combination of crossed plastic polarizers and neutral density filters. this allowed us to generate signals over the whole range, from 1000 up to 30000 adc units. in all cases, when lighting was on average stable (no strong clouds crossing by), we observed noise amplitudes below 10⁻⁸, independent of signal amplitude. this confirms that the increase of noise density in the 200 s before and after totality is indeed a signature of scintillation occurring because of the eclipse.

we close our analysis noting that none of the stations could reliably identify shadow bands with the eye or video cameras. it is not clear to us whether they were difficult to observe because the weather conditions made observations difficult, or whether these conditions indeed made the bands too fast, chaotic or small to catch with the eye or camera.

v conclusions

by using time resolved measurements of light intensity on single photodiodes, we were able to identify a clear signal compatible with the appearance of shadow bands with detectors at near sea level. the signal showed a strong increase in the noise spectral density of the recorded signal in the ± 200 s around totality, with a strong drop during totality. the signals observed were all of chaotic nature, and no leading tone in the low frequency range was detected. the use of specially designed acquisition systems with sufficient time resolution, adequate calibration, memory and autonomy allowed the recording of the onset and culmination of this effect with unprecedented completeness. future eclipses will provide new opportunities to repeat these measurements, accumulating evidence which will help us continue developing the understanding of this elusive phenomenon, which only a few have been able to observe.

a final remark: as it is with field work, sometimes the great power of natural forces affects plans, which, if caught with time and wit, might be used to the experiment's favour. in this case, the best data was obtained by the detector that could not fly, and which, after balloon burst and payload recovery, was reconfigured and set to measure from ground.
acknowledgements the authors acknowledge: the government of the city of valcheta for providing housing and accommodations for the placement of scientific equipment during the event; russell johnson clark from the university of pittsburgh who kindly sent a couple of photodiodes which were used in detectors pda and pdb; the team of terraza al cosmos, who participated in the organization and helped enormously during the event, in particular alex sly and his family, lola banfi, mikel aboitiz and mateo ingouville; the civil association amsat argentina, with whom the balloon launch was planned and executed; the asociación argentina de f́ısica that provided funds to cover some of the costs of this mission; laura morales’ and gabriela nicora’s disinterested help in various aspects of the overall organization of the scientific event. [1] e. gaviola, on shadow bands at total eclipses of the sun, popular astronomy 56, 353 (1948). [2] j. quann and c. daly, the shadow band phenomenon, journal of atmospheric and terrestrial physics 34, 577 (1972). [3] j. codona, the scintillation theory of eclipse shadow bands, astronomy and astrophysics 164, 415 (1986). [4] h. zhan and d. voelz, wave optics modeling of solar eclipse shadow bands, in 2019 ieee aerospace conference, 1, ieee (2019). [5] j. p. madhani, g. e. chu, c. v. gomez, s. bartel, r. j. clark, l. w. coban, m. hartman, e. m. potosky, s. m. rao, and d. a. turnshek, observation of eclipse shadow bands using high altitude balloon and ground based photodiode arrays, journal of atmospheric and solar-terrestrial physics, 211, 105420 (2020). [6] l. a. marschall, r. mahon, and r. c. henry, observations of shadow bands at the total solar eclipse of 16 february 1980, applied optics, 23, 4390 (1984). [7] b. jones and c. jones, shadow bands during the total solar eclipse of 11 july 1991 journal of atmospheric and terrestrial physics, 56, 1535 (1994). [8] d. georgobiani, j. kuhn, and j. beckers, using eclipse observations to test scintillation models, solar physics 156, 1 (1995). [9] b. w. jones, shadow bands during the total solar eclipse of 3 november 1994, journal of atmospheric and terrestrial physics, 58, 1309 (1996). [10] b. w. jones, shadow bands during the total solar eclipse of 26 february 1998, journal of atmospheric and solar-terrestrial physics, 61, 965 (1999). [11] s. gladysz, m. redfern, and b. w. jones, shadow bands observed during the total solar eclipse of 4 december 2002, by high-resolution imaging, journal of atmospheric and solarterrestrial physics 67, 899 (2005). [12] x. m. jubier, solar and lunar eclipses, http://xjubier.free.fr/en/ accessed: 2020-1130. [13] j. caldwell, 1 mhz single-supply photodiode amplifier reference design, ti designs precision: verified design, tidu535-november, 1 (2014). 
papers in physics, vol. 3, art. 030003 (2011) received: 6 june 2011, accepted: 13 july 2011 edited by: d. restrepo reviewed by: j. h. muñoz, universidad del tolima, ibagué, colombia; and centro brasileiro de pesquisas fisica licence: creative commons attribution 3.0 doi: 10.4279/pip.030003 www.papersinphysics.org issn 1852-4249 calculation of almost all energy levels of baryons mario everaldo de souza 1∗ it is considered that the effective interaction between any two quarks of a baryon can be approximately described by a simple harmonic potential. the problem is first solved in cartesian coordinates in order to find the energy levels irrespective of their angular momenta. then, the problem is also solved in polar cylindrical coordinates in order to take into account the angular momenta of the levels. comparing the two solutions, a correspondence is made between the angular momenta and parities for almost all experimentally determined levels. the agreement with the experimental data is quite impressive and, in general, the discrepancy between calculated and experimental values is below 5%. several levels of ∆, n, σ±, λ and ω present discrepancies between 6.7% and 12.5% [n(1655), n(1440), n(1675), n(1685), n(1700), n(1710), n(1720), n(1990), n(2600), ∆(1700), ∆(2000), ∆(2300), σ±(1189), λ(1520), ω(1672) and ω(2250)]. i. introduction there are several important works that deal with the calculation of the energy levels of baryons. one of the most important is the pioneering work of gasiorowicz and rosner [1], which presents calculations of baryon energy levels and magnetic moments using approximate wavefunctions. another important work is that of isgur and karl [2], which strongly suggests that non-relativistic quantum mechanics can be used in the calculation of baryon spectra. other very important attempts towards the understanding of baryon spectra are the works of capstick and isgur [3], bhaduri et al. [4], murthy et al. [5], murthy et al. [6] and stassat et al. [7]. still another important work that attempts to describe baryon spectra is the recent work of hosaka, toki and takayama [8], which makes use of a non-central harmonic potential (called by the authors the deformed oscillator) and is able to describe many levels.
this present work describes many more levels and is more consistent in the characterization of angular momenta and parities of levels. it is an updated version of the pre-print of ref. [9]. ii. the approximation for the effective potential the effective potential between any two quarks of a baryon is not known and thus a couple of different potentials can be found in the literature. of course, the effective potential is the result of the attractive and repulsive forces of qcd and is completely justified because, as it is well known that the strong ∗e-mail: mariodesouza.ufs@gmail.com 1 universidade federal de sergipe, departamento de f́ısica, av. marechal rondon, s/n, campus universitário, jardim rosa elze 49100-000, são cristovão, brazil. 030003-1 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza force becomes repulsive for very short distances, and thus repulsion and attraction can form a potential well that can be approximated with a harmonic potential about the equilibrium point. taking into consideration the work of isgur and karl [2] about the use of non-relativistic quantum mechanics, and considering that the three quarks of a baryon are always on a plane, we consider that the system can be approximately described by three non-central and non-relativistic linear harmonic potentials. this is a calculation quite different from those found in the literature and explains almost all energy levels of baryons. iii. calculation in cartesian coordinates and comparison with experimental data the initial calculation, in which we have used cartesian coordinates, does not, of course, consider the angular momentum of the system, that is, it does not take into account the symmetries of the system. this calculation is important for the identification of the energy levels given by the experimental data, and for the assignment of the angular momenta later on. also, it allows the prediction of many energy levels. since each oscillator has two degrees of freedom, the energy of the system of 3 quarks is given by [10] en,m,k =hν1(n + 1) + hν2(m + 1) + hν3(k + 1) (1) where n,m,k = 0, 1, 2, 3, 4, . . . of course, we identify hν1, hν2, hν3 with the ground states of the corresponding energy levels of baryons, and thus hν1, hν2, hν3 are equal to the masses of constituent quarks. since we do not take isospin into account, we cannot distinguish between n and ∆ states, or between σ and λ states. the experimental values for the baryon levels were taken from particle data group (nakamura et al. [11]). the masses of constituent quarks are taken as mu = md = 0.31 gev, ms = 0.5 gev, mc = 1.7 gev, mb = 5 gev, and mt = 174 gev. we have, thus, the following formulas (see table 1) for the energy levels of all known baryons up to now: baryons formulas for the energy levels (in gev) n, ∆−, ∆++ en,m,k = 0.31(n + m + k + 3) λ0, σ+, σ0, σ− en,m,k = 0.31(n + m + 2) + 0.5(k + 1) ξ0, ξ− en,m,k = 0.31(n + 1) + 0.5(m + k + 2) ω− en,m,k = 0.5(n + m + k + 3) λ+c , σ + c , σ ++ c , σ 0 c en,m,k = 0.31(n + m + 2) + 1.7(k + 1) ξ0c, ξ + c en,m,k = 0.31(n + 1) + 0.5(m + 1) + 1.7(k + 1) ω0c en,m,k = 0.5(n + m + 2) + 1.7(k + 1) xcc en,m,k = 0.31(n + 1) + 1.7(m + k + 2) λ0b en,m,k = 0.31(n + m + 2) + 5(k + 1) ξ0b , ξ − b en,m,k = 0.31(n + 1) + 0.5(m + 1) + 5(k + 1) ω−b en,m,k = 0.5(n + m + 2) + 5(k + 1) table 1: formulas for most energy levels of all baryons. in tables 1 to 11, ec is the calculated value by the above formulas, em is the measured value and the error is given by error = 100% ×|em − ec|/ec . 
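as a quick aid to reproducing the tables that follow, eq. (1) and the error definition above can be evaluated with a few lines of code. the sketch below is illustrative only; the function names are not from the paper, and the quark masses are the constituent values quoted in the text.

```python
# constituent quark masses in gev, as quoted in the text
MASS = {"u": 0.31, "d": 0.31, "s": 0.5, "c": 1.7, "b": 5.0, "t": 174.0}

def level(quarks, n, m, k):
    """eq. (1): e_{n,m,k} = h*nu_1 (n+1) + h*nu_2 (m+1) + h*nu_3 (k+1),
    with each h*nu_i identified with the constituent mass of one quark."""
    m1, m2, m3 = (MASS[q] for q in quarks)
    return m1 * (n + 1) + m2 * (m + 1) + m3 * (k + 1)

def error_percent(em, ec):
    """error = 100% * |em - ec| / ec, as used in tables 1 to 11."""
    return 100.0 * abs(em - ec) / ec

# examples reproducing the first rows of table 2 (n and delta states, u/d quarks)
print(level("uud", 0, 0, 0), error_percent(0.938, level("uud", 0, 0, 0)))  # ~0.93 gev, ~0.9 %
print(level("uud", 2, 0, 0), error_percent(1.44, level("uud", 2, 0, 0)))   # ~1.55 gev, ~7.1 %
```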
within the scope of our simple calculation, many levels are degenerate, of course. further calculations, taking into account spin-orbit and spin-spin effects, should lift part of the degeneracy. we notice that these effects are quite complex. states such as 1.70(n)d13 and 1.70(∆)d33 clearly show that isospin does not play an important role in the splitting of the levels. in general, the error is below 5%. 030003-2 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza state(n,m,k) ec (gev) em (gev) error(%) l2i,2j parity 0, 0, 0 0.93 0.938(n) 0.9 p11 + n + m + k = 1 1.24 1.232(∆) 0.6 p33 + n + m + k = 2 1.55 1.44(n) 7.1 p11 + n + m + k = 2 1.55 1.52(n) 1.9 d13 − n + m + k = 2 1.55 1.535(n) 1.0 s11 − n + m + k = 2 1.55 1.6(∆) 3.1 p33 + n + m + k = 2 1.55 1.62(∆) 4.5 s31 − n + m + k = 2 1.55 1.655(n) 6.7 s11 − n + m + k = 2 1.55 1.675(n) 8.1 d15 − n + m + k = 2 1.55 1.685(n) 8.7 f15 + n + m + k = 2 1.55 1.70(n) 9.7 d13 − n + m + k = 2 1.55 1.70(∆) 9.7 d33 − n + m + k = 2 1.55 1.72(n) 11.0 p13 + n + m + k = 3 1.86 1.71(n) 8.1 p11 + n + m + k = 3 1.86 1.90(n) 2.2 p13 + n + m + k = 3 1.86 1.90(∆) 2.2 s31 − n + m + k = 3 1.86 1.905(∆) 2.4 f35 + n + m + k = 3 1.86 1.91(∆) 2.7 p31 + n + m + k = 3 1.86 1.92(∆) 3.2 p33 + n + m + k = 3 1.86 1.93(∆) 3.8 d35 − n + m + k = 3 1.86 1.94(∆) 4.3 d33 − n + m + k = 3 1.86 2.0(n) 7.5 f15 + n + m + k = 4 2.17 1.95(∆) 10.1 f37 + n + m + k = 4 2.17 1.99(n) 8.3 f17 + n + m + k = 4 2.17 2.00(∆) 7.8 f35 + n + m + k = 4 2.17 2.08(n) 4.1 d13 − n + m + k = 4 2.17 2.09(n) 3.7 s11 − n + m + k = 4 2.17 2.10(n) 3.2 p11 + n + m + k = 4 2.17 2.15(∆) 0.9 s31 − n + m + k = 4 2.17 2.19(n) 0.9 g17 − n + m + k = 4 2.17 2.20(n) 1.4 d15 − n + m + k = 4 2.17 2.20(∆) 1.4 g37 − n + m + k = 4 2.17 2.22(n) 2.3 h19 + n + m + k = 4 2.17 2.225(n) 2.5 g19 − n + m + k = 4 2.17 2.3(∆) 6.0 h39 + n + m + k = 5 2.48 2.35(∆) 5.2 d35 − n + m + k = 5 2.48 2.39(∆) 3.6 f37 + n + m + k = 5 2.48 2.40(∆) 3.2 g39 − n + m + k = 5 2.48 2.42(∆) 2.4 h3,11 + n + m + k = 6 2.79 2.60(n) 6.8 i1,11 − n + m + k = 6 2.79 2.70(n) 3.2 k1,13 + n + m + k = 6 2.79 2.75(∆) 1.4 i3,13 − n + m + k = 7 3.10 2.95(∆) 4.8 k3,15 + n + m + k = 7 3.10 3.10(n) 0 l1,15 − n + m + k = 8 3.21 ? ? ? ? table 2: energy levels of baryons n and ∆. 030003-3 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza state (n,m,k) ec (gev) em (gev) error (%) l2i,2j parity 0, 0, 0 1.12 1.116(λ) 0.4 p01 + 0, 0, 0 1.12 1.189(σ±) 6.2 p11 + 0, 0, 0 1.12 1.193(σ0) 6.5 p11 + n + m = 1,k = 0 1.43 1.385(σ) 3.2 p13 + n + m = 1,k = 0 1.43 1.405(λ) 1.7 s01 − n + m = 1,k = 0 1.43 1.48(σ) 3.5 ? ? 0, 0, 1 1.62 1.52(λ) 6.2 d03 − 0, 0, 1 1.62 1.56(σ) 3.7 ? + 0, 0, 1 1.62 1.58(σ) 2.5 d13 − 0, 0, 1 1.62 1.60(λ) 1.2 p01 + 0, 0, 1 1.62 1.62(σ) 0 s11 − 0, 0, 1 1.62 1.66(σ) 2.5 p11 + 0, 0, 1 1.62 1.67(λ) 3.1 s01 − n + m = 2,k = 0 1.74 1.67(σ) 4.0 d13 − n + m = 2,k = 0 1.74 1.69(λ) 2.9 d03 − n + m = 2,k = 0 1.74 1.69(σ) 2.9 ? ? n + m = 2,k = 0 1.74 1.75(σ) 0.6 s11 − n + m = 2,k = 0 1.74 1.77(σ) 1.7 p11 + n + m = 2,k = 0 1.74 1.775(σ) 2.0 d15 − n + m = 2,k = 0 1.74 1.80(λ) 3.4 s01 − n + m = 2,k = 0 1.74 1.81(λ) 4.0 p01 + n + m = 2,k = 0 1.74 1.82(λ) 4.6 f05 + n + m = 2,k = 0 1.74 1.83(λ) 5.2 d05 − n + m = 1,k = 1 1.93 1.84(σ) 4.7 p13 + n + m = 1,k = 1 1.93 1.88(σ) 2.6 p11 + n + m = 1,k = 1 1.93 1.89(λ) 2.1 p03 + n + m = 1,k = 1 1.93 1.915(σ) 0.8 f15 + n + m = 1,k = 1 1.93 1.94(σ) 0.5 d13 − n + m = 3,k = 0 2.05 2.00(λ) 2.5 ? ? 
n + m = 3,k = 0 2.05 2.00(σ) 2.5 s11 − n + m = 3,k = 0 2.05 2.02(λ) 1.5 f07 + n + m = 3,k = 0 2.05 2.03(σ) 1.0 f17 + n + m = 3,k = 0 2.05 2.07(σ) 1.0 f15 + n + m = 3,k = 0 2.05 2.08(σ) 1.5 p13 + 0, 0, 2 2.12 2.10(σ) 0.9 g17 − 0, 0, 2 2.12 2.10(λ) 0.9 g07 − 0, 0, 2 2.12 2.11(λ) 0.5 f05 + n + m = 2,k = 1 2.24 2.25(σ) 0.5 ? ? n + m = 4,k = 0 2.36 2.325(λ) 1.5 d03 − n + m = 4,k = 0 2.36 2.35(λ) 0.4 h09 + n + m = 1,k = 2 2.43 2.455(σ) 1.0 ? ? n + m = 3,k = 1 2.55 2.585(λ) 1.4 ? ? table 3: energy levels of σ and λ. 030003-4 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza state (n,m,k) ec (gev) em (gev) error (%) l2i,2j parity 0, 0, 3 2.62 2.62(σ) 0 ? ? n + m = 5,k = 0 2.67 ? ? ? ? n + m = 2,k = 2 2.74 ? ? ? ? n + m = 4,k = 1 2.86 ? ? ? ? n + m = 1,k = 3 2.93 ? ? ? ? n + m = 6,k = 0 2.98 3.00(σ) 0.7 ? ? n + m = 3,k = 2 3.05 ? ? ? ? n + m = 0,k = 4 3.12 ? ? ? ? n + m = 5,k = 1 3.17 3.17(σ) 0 ? ? n + m = 2,k = 3 3.24 ? ? ? ? n + m = 2,k = 3 3.29 ? ? ? ? table 3 (cont.): energy levels of σ and λ. state (n,m,k) ec (gev) em (gev) error (%) l2i,2j parity 0, 0, 0 1.31 1.315(ξ0) 0.5 p11 + 0, 0, 0 1.31 1.321(ξ−) 0.8 p11 + 1, 0, 0 1.62 1.53 5.6 p13 + 1, 0, 0 1.62 1.62 0 ? ? 1, 0, 0 1.62 1.69 4.3 ? ? n = 0,m + k = 1 1.81 1.82 0.6 d13 − 2, 0, 0 1.93 1.95 1.0 ? ? n = 1,m + k = 1 2.12 2.03 4.2 ? ? n = 1,m + k = 1 2.12 2.12 0 ? ? n = 3,m = k = 0 2.24 2.25 0.5 ? ? n = 0,m + k = 2 2.31 2.37 2.6 ? ? n = 2,m + k = 1 2.43 ? ? ? ? n = 4,m = k = 0 2.55 2.5 2.0 ? ? n = 1,m + k = 2 2.62 ? ? ? ? table 4: energy levels of ξ. state (n,m,k) ec (gev) em (gev) error (%) 0, 0, 0 1.5 1.672 11.17 n + m + k = 1 2.0 2.25 12.5 n + m + k = 2 2.5 2.38 4.8 n + m + k = 2 2.5 2.47 1.2 n + m + k = 3 3.0 ? ? table 5: energy levels of ω. state (n,m,k) ec (gev) em (gev) error (%) 0, 0, 0 2.32 2.285 1.5 n + m = 1,k = 0 2.63 2.594 0.1 n + m = 1,k = 0 2.63 2.625 0.2 n + m = 2,k = 0 2.94 ? ? table 6: energy levels of λc. 030003-5 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza state (n,m,k) ec (gev) em (gev) error (%) 0, 0, 0 2.51 2.46(ξ+c ) 2.0 0, 0, 0 2.51 2.47(ξ0c) 1.6 1, 0, 0 2.82 2.79 1.1 1, 0, 0 2.82 2.815 0.2 0, 1, 0 3.01 2.93 2.7 0, 1, 0 3.01 2.98 1.0 0, 1, 0 3.01 3.055 1.5 2, 0, 0 3.13 3.08 1.6 2, 0, 0 3.13 3.123 0.2 1, 1, 0 3.32 ? ? 3, 0, 0 3.44 ? ? table 7: energy levels of ξc. state (n,m,k) ec (gev) em (gev) error (%) 0, 0, 0 2.7 2.704 0.2 n + m = 1,k = 0 3.2 ? ? n + m = 2,k = 0 3.7 ? ? table 8: energy levels of ωc. state (n,m,k) ec (gev) em (gev) error (%) 0, 0, 0 5.62 5.6202 0.004 n + m = 1,k = 0 5.93 ? ? n + m = 2,k = 0 6.24 ? ? table 9: energy levels of λ0b . state (n,m,k) ec (gev) em (gev) error (%) 0, 0, 0 5.81 5.79(ξ0b ) 0.2 0, 0, 0 5.81 5.79(ξ−b ) 0.2 1, 0, 0 6.12 ? ? 0, 1, 0 6.31 ? ? table 10: energy levels of ξb. state (n,m,k) ec (gev) em (gev) error (%) 0, 0, 0 6.0 6.071 1.2 n + m = 1,k = 0 6.5 ? ? n + m = 2,k = 0 7.0 ? ? table 11: energy levels of ωb. 030003-6 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza we can predict the energy levels of many heavy baryons, probably already found by the lhc or to be found in the near future. there are, for example, the baryon levels (in gev): • scc, en,m,k = 0.5(n + 1) + 1.7(m + k + 2); • ccc, en,m,k = 1.7(n + m + k + 3); • ccb, en,m,k = 1.7(n + m + 2) + 5(k + 1); • cbb, en,m,k = 1.7(n + 1) + 5(m + k + 2); • etc. iv. calculation in polar cylindrical coordinates and comparison with experimental data in order to take into account angular momentum and parity, we have to use spherical or polar coordinates. 
since the 3 quarks of a baryon are always in a plane, we can use polar coordinates and choose the z axis perpendicular to this plane. now the eigenfunctions are eigenfunctions of the orbital angular momentum. thus, we have three oscillators in a plane and we consider them to be independent. using again the non-relativistic approximation, the radial schrödinger equation for the stationary states of each oscillator is given by [12, 13] [ − h̄2 2µ ( ∂2 ∂ρ2 + 1 ρ ∂ ∂ρ − mz ρ2 ) + 1 2 µω2ρ2 ] rem(ρ) = erem(ρ) (2) where mz is the quantum number associated with lz, µ is the reduced mass of the oscillator, and ω is the oscillator frequency. therefore, we have three independent oscillators with orbital angular momenta ~l1, ~l2 and ~l3 whose z components are lz1, lz2 and lz3. of course, the system has total orbital angular momentum ~l = ~l1 + ~l2 + ~l3 and each ~li has a quantum number li associated with it. the eigenvalues of the energy levels are given by [12, 13] e =(2r1 + |mz1| + 1)hν1 + (2r2 + |mz2| + 1)hν2 + (2r3 + |mz3| + 1)hν3 (3) in which ri = 0, 1, 2, . . . and it is a radial quantum number, and |mzi| = 0, 1, . . . , li. comparing equation (3) with equation (1), we have n = 2r1 +|mz1|; m = 2r2 +|mz2|; k = 2r3 +|mz3|. let us recall that if we have three angular momenta ~l1, ~l2 and ~l3 associated to the quantum numbers l1, l2 and l3, the total orbital angular momentum ~l is described by the quantum number l given by l1 + l2 + l3 ≥ l ≥ ||l1 − l2|− l3| (4) where l1 ≥ |mz1|; l2 ≥ |mz2|; l3 ≥ |mz3|. because the three quarks are on a plane, only ri and mzi are good quantum numbers, that is, li are not good quantum numbers and their possible values are found indirectly by means of mzi due to the condition li ≥ mzi. this means that the upper values of li cannot be found from the model, and as a consequence, the upper value of l cannot be found either. we only determine the values of l comparing the experimental results of the energies of the baryon states with the energy values calculated by enmk. this is a limitation of the model. the other models have many limitations too. for example, in the deformed oscillator model some quantum numbers are not good either and are only approximate and there is not a direct relation between n and l where n is the total quantum number. in a certain way, a baryon is a tri-atomic molecule of three quarks and thus some features of molecules may show up and that is indeed the case. taking into account spin, we form the total angular momentum ~j = ~l+ ~s whose quantum numbers are j = l± s where s = 1/2, 3/2. as we will see, we will be able to describe almost all baryon levels. as in the case of the rotational spectra of triatomic molecules [14], due to the couplings of the different angular momenta, it is expected that there should exist a minimum value of j = k for the total angular momentum and, thus, j should have the possible values j = k,k + 1,k + 2,k + 3, . . .. but in the case of baryons, this feature does not always appear to happen. i. baryons n and λ we will classify the levels according to table 1 and take j = l± 1/2 or j = l± 3/2(∆). 030003-7 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza a. level (n = m = k = 0; 0.93 gev) the first state of n is the state (n = m = k = 0) with energy 0.93 gev. therefore, in this case l1 = l2 = l3 = 0 and thus l = 0. this is the positive parity state p11 l n ∆ parity 0 0.938p11 + b. level (n = m = k = 1; 1.24 gev) this is the first state of λ. 
as n + m + k = 1, we have 2r1 +|mz1|+ 2r2 +|mz2|+ 2r3 +|mz3| = 1, and thus |mz1|+ |mz2|+ |mz3| = 1, and l1 + l2 + l3 ≥ 1, and we can choose the sets |mz1| = 1, |mz2| = |mz3| = 0; |mz2| = 1, |mz1| = |mz3| = 0; |mz3| = 1, |mz1| = |mz2| = 0, and l1 = 1, l2 = l3 = 0 or l2 = 1, l1 = l3 = 0 or still l3 = 1, l1 = l2 = 0 which produce l ≥ 0 (ground state) and thus the level l n ∆ parity 0 1.232p33 + c. level (n = m = k = 2; 1.55 gev) in this case n = m = k = 2 = 2r1 + |mz1| + 2r2 + |mz2|+2r3 +|mz3|. this means that |mz1|+|mz2|+ |mz3| = 2, 0 and we have the sets of possible values of l1, l2, l3 l1, l2, l3 2, 0, 0 0, 2, 0 0, 0, 2 1, 1, 0 l 2 2 2 0, 1, 2 l1, l2, l3 1, 0, 1 0, 1, 1 0, 0, 0 l 0, 1, 2 0, 1, 2 0 in which the second row presents the values of l that satisfy the condition l1 + l2 + l3 ≥ 2, 0. as 2 is a lower bound, we can also have l = 3. there are, therefore, the following possible states l n ∆ parity 0 1.44p11 1.6p33 + 1 1.535s11; 1.655s11 1.62s31 − 1.52d13; 1.7d13 − 2 1.72p13 ? + 1.685f15 3 1.675d15 1.70d33 − d. level (n = m = k = 3; 1.86 gev) since n = m = k = 3 = 2r1 + |mz1|+ 2r2 + |mz2|+ 2r3 + |mz3|, |mz1| + |mz2| + |mz3| = 3, 1, and thus l1 + l2 + l3 ≥ 3, 1. we have, therefore, the possibilities l = 4, 3, 2, 1, 0 because of the condition l ≥ ||l1 − l2|− l3| and we can arrange the levels in the form l n ∆ parity 0 1.71p11 1.91p31 + 1 1.90s31 − 1.93d35 − 2 1.90p13 ? + 2.0f15 + 3 1.92p33, 1.94d33 − 4 1.905f35 + e. level (n = m = k = 4; 2.17 gev) this energy level is split in many close levels. following what we have done above n = m = k = 4 = 2r1 + |mz1| + 2r2 + |mz2| + 2r3 + |mz3|, which yields |mz1| + |mz2| + |mz3| = 4, 2, 0, and thus l1 + l2 + l3 ≥ 4, 2, 0. we have therefore for l the possible values l = 6, 5, 4, 3, 2, 1, 0 and the following assignments l n ∆ parity 0 2.10p11 ? + 1 2.09s11 2.15s31 − 2.08d13 2 1.95f37 + 3 2.20d15 ? − 2.19g17 − 4 2.22h19 2.00f35 + 5 2.225g19 2.20g37 − 6 2.30h39 + f. level (n = m = k = 5; 2.48 gev) doing as above n = m = k = 5 = 2r1+|mz1|+2r2+ |mz2|+2r3 +|mz3|, and thus |mz1|+|mz2|+|mz3| = 5, 3, 1. that is, l1 + l2 + l3 ≥ 5, 3, 1, and so we may have l = 5, 4, 3, 2, 1, 0 because of the conditions l ≥ ||l1 − l2|− l3| and li ≥ |mzi|. experimentally, though, we note that k = 1, and hence we have the possible arrangement of levels 030003-8 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza l n ∆ parity 1 2.35d35 − 2 2.39f37 + 3 2.40g39 − 4 2.42h3,11 + 5 ? − g. level (n = m = k = 6; 2.79 gev) we have n = m = k = 6 = 2r1 + |mz1| + 2r2 + |mz2| + 2r3 + |mz3| and so |mz1| + |mz2| + |mz3| = 6, 4, 2, 0 and thus l1 + l2 + l3 ≥ 6, 4, 2, 0, and l can be l = 6, 5, 4, 3, 2, 1, 0, but, from the experimental values, we note that k = 5 and so there are the possible states l n ∆ parity 5 2.60i1,11 2.75i3,13 − 6 2.70k1,13 ? + h. level (n = m = k = 7; 3.10 gev) from n = m = k = 7 = 2r1 + |mz1| + 2r2 + |mz2|+2r3 +|mz3| we obtain |mz1|+|mz2|+|mz3| = 7, 5, 3, 1 and hence l1 + l2 + l3 ≥ 7, 5, 3, 1, and thus the possible values for l are l = 7, 6, 5, 4, 3, 2, 1, 0, but we note that k = 7. therefore, we have the list of states l n ∆ parity 6 2.95k3,15 + 7 3.10l1,15 ? − ii. baryons σ and λ we will classify the levels according to table 2. again j = l± 1/2. a. level (n = m = k = 0; 1.12 gev) in this state l1 = l2 = l3 = 0 and thus l = 0 and we have the state l σ λ parity 0 1.189(σ±)p11; 1.116p01 + 1.193(σ0)p11 b. level (n = m = k = 1; 1.43 gev) from n = m = 1,k = 0 we obtain 2r1 + |mz1| + 2r2 + |mz2| = 1 and 2r3 + |mz3| = 0 which make |mz1| + |mz2| = 1 and |mz3| = 0. 
that is, we have the condition l1 + l2 ≥ 1, l3 ≥ 0 which allows us to have the possibilities l = 0, 1, 2 and the states l σ λ parity 0 + 1 ? 1.405s01 − 2 1.358p13 ? + thus, the most probable values of l for the state 1.48(σ) are l = 0, 1. maybe k = 1 in this case and thus l = 0 may be suppressed. c. level (0, 0, 1; 1.62 gev) for n = m = 0 and k = 1 we have |mz1| = |mz2| = 0 and |mz3| = 1. that is, we have the condition l1 ≥ 0, l2 ≥ 0, l3 ≥ 1 which allows us to choose l1 = l2 = 0, l3 = 1; l1 = l3 = 1, l2 = 0; l1 = 0, l2 = l3 = 1, and thus l ≥ 0, 1, 2, and the states l σ λ parity 0 1.66p11 1.60p01 + 1 1.62s11 1.67s01 − 1.58d13 1.52d03 − 2 ? ? + d. level (n + m = 2,k = 0; 1.74 gev) in this case n + m = 2 = 2r1 + |mz1| + 2r2 + |mz2| and k = 2r3 + |mz3| = 0, and thus we obtain |mz1| + |mz2| = 2, 0 and |mz3| = 0. thus, we have the conditions l1 + l2 ≥ 2, 0 and l3 ≥ 0. we can then choose l1 = l2 = l3 = 0; l1 = l2 = 1, l3 = 0; l1 = 2, l2 = l3 = 0; l2 = 2, l1 = l3 = 0; l1 = 3, l2 = l3 = 0, and we may have thus l = 0, 1, 2, 3 and the assignments l σ λ parity 0 1.77p11 1.81p01 + 1 1.75s11 1.80s01 − 1.67d13 1.69d03 − 2 ? 1.82f05 + 3 1.775d15 1.83d05 − we can then say that the state 1.69(σ) is probably a f15 state. 030003-9 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza e. level (n + m = 1,k = 1; 1.93 gev) we have n + m = 1 = 2r1 + |mz1| + 2r2 + |mz2| and k = 2r3 + |mz3| = 1, from which we obtain |mz1| + |mz2| = 1 and |mz3| = 1. hence, we have the condition l1 + l2 ≥ 1 and l3 ≥ 1. we can then have the sets l1 = 1, l2 = 0, l3 = 1; l1 = 0, l2 = 1, l3 = 1. both yield l ≥ 2, 1, 0 and thus we identify the states l σ λ parity 0 1.88p11 ? + 1 1.94d13 ? − 2 1.84p13 1.89p03 + 1.915f15 + f. level (n + m = 3,k = 0; 2.05 gev) with n + m = 3 = 2r1 + |mz1| + 2r2 + |mz2| and k = 2r3 + |mz3| = 0 we obtain |mz1| + |mz2| = 3, 1 and |mz3| = 0, and thus the conditions l1 +l2 ≥ 3, 1 and l3 ≥ 0 which yield l ≥ 4, 3, 2, 1, 0, and the possible identification taking into account that maybe k = 1 l σ λ parity 1 2.00s11 ? − 2 2.08p13 ? + 2.07f15 3 ? ? − 4 2.03f17 2.02f07 + g. level (0, 0, 2; 2.12 gev) in this case n = 0 = 2r1 + |mz1|, m = 0 = 2r2 + |mz2| and k = 2 = 2r3 + |mz3| and thus |mz1| = |mz2| = 0 and |mz3| = 2, 0. hence, we have the condition l1 ≥ 0, l2 ≥ 0 and l3 ≥ 2, 0. we can then choose the sets l1 = l2 = l3 = 0; l1 = 0, l2 = 0, l3 = 2; l1 = 0, l2 = 0, l3 = 3; l1 = l2 = l3 = 1; l1 = l2 = 1, l3 = 0 which make l ≥ 3, 2, 1, 0, and probably k = 2. hence, we have the possible states l σ λ parity 2 ? 2.11f05 + 3 2.10g17 2.10g07 − h. level (n + m = 4,k = 0; 2.36 gev) from n+m = 4 = 2r1 +|mz1|+2r2 +|mz2| and k = 2r3 +|mz3| = 0 we obtain |mz1|+|mz2| = 4, 2, 0 and |mz3| = 0, and thus the conditions l1 + l2 ≥ 4, 2, 0 and l3 ≥ 0 which produce l ≥ 4, 3, 2, 1, 0, and the possible identification l σ λ parity 0 1 2.325d03 − 2 ? 3 ? 4 2.35h09 + probably in this case k = 1 and the levels with l = 2, 3 are missing just because of a lack of experimental data. it appears that there is no state of σ. v. discussion and conclusion one can immediately ask about the spin degrees of freedom of the three quarks since the spin-spin interaction makes a contribution to the mass. 
we can say that we took care of part of it because the formulas of the energy levels depend on the three parameters hν1, hν2 and hν3 which are assigned according to the masses of the constituent quarks which have already taken into account the spinspin interaction because the masses of constituent quarks are in perfect agreement with the ground state levels of baryons. of course, the spin-spin interaction contribution depends on the energy level as is well known from the bottomonium spectrum, for example. but, as it is seen in the spectrum of bottomonium, the spin-spin contribution diminishes as the energy of the level increases. in bottomonium, the difference between the energies of ηb(1s) and υ(1s) is about 69.4 mev, while between ηb(2s) and υ(2s) it is about 36.3 mev, and between ηb(3s) and υ(3s) it is about 25.2 mev, where we have used, for the energies of ηb(2s) and ηb(3s), the predicted values from reference [15], 9987.0 mev and 10330 mev, respectively. in the case of baryons, the spin-spin interaction varies from 15 mev to 30 mev for levels of n, σ, ξ and λ [16]. therefore, we observe that the spin-spin interaction is of the order of magnitude of the splitting beween neighboring levels. for example, the measured energy of the d13 level of n is 1.52 mev, while our calculated value is 1.55 mev, and thus the difference is 0.03 gev= 30 mev which is of the order of the spin-spin interaction. and that is why 030003-10 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza there are large discrepancies in the calculation of the lowest levels of ω because in this case all quark spins are parallel and thus, the total spin-spin contribution is larger than in other baryons in which two spins are up and the other spin is down. for the lowest state of ω, the discrepancy is about 1.672 gev − 1.5 gev = 0.172 gev = 172 mev. this is actually the worse calculation. but we either consider the mass of constituent quarks or we try to find tentative values for the masses of quarks like is done in qcd models which use a quite different range of arbitrary quark masses. the use of the constituent quark mass is completely justifiable in our case because we do not attempt to calculate at all the splitting between neighboring baryon levels. such calculation can be made in the future upon improving the present model. we only addressed the angular momenta of n, ∆, σ and λ due to a lack of experimental data for the other baryons. of course, the state n = m = l = 0 is missing for the ∆ particle because this corresponds to the ground state of the nucleon. we notice that the simple model above describes almost all energy levels of baryons. the splitting for a certain l is quite complex. sometimes, there is almost no dependence on spin, such as, for example, the states of σ with l = 2, 2.08p13 and 2.07f15. on the other hand, the states of σ with the same l = 2, 1.84p13 and 1.915f15, present a strong spin-orbit dependence. it can just be a matter of obtaining more accurate experimental results. it is important to observe that part of the splitting is primarily caused by the spin-orbit interaction and is very complex because, in some cases, it appears to be the normal spin-orbit and, in other cases, it appears to be the inverted (negative) spin-orbit. in the simple model above, the oscillators were considered approximately independent but there may exist some coupling among them and this can contribute to the splitting of levels. 
as we have discussed in the first paragraph, part of the splitting should be attributed to the spinspin interaction which was not taken into account in a detailed way. of course, part of it was considered inside the values of the three parameters hν1, hν2 and hν3 which are taken as the masses of the three constituent quarks of a given baryon. it is important to observe that the discrepancy between calculated and measured values diminishes as the energy increases. this fact shows that the splitting is mainly caused by the spin-spin interaction. another important conclusion is that with the simple model above we cannot calculate the values of k, and from the above results we note that it is a quite difficult task because there appears to exist no pattern with respect to this. as in the case of triatomic molecules, the values of k are found from the experimental data. as we notice, in the above tables the increase in the energy of levels allows the existence of higher values of l (and j). this is an old fact and is so because equation (3) has a linear dependence on |lzi|. for experimentalists, the classifications above are very important and can help them in the prediction of energies and angular momenta of levels. an old version of this work that appeared in ref. [9] predicted the energies of all levels which have lately been reported, and this is a very important fact. for example, for ξc it predicted the levels (on page 8 of [9]) with energies 2.82, 3.01 and 3.13, and since 2002 the following corresponding levels of ξc have been found: 2.815, (2.93; 2.98; 3.055); (3.08; 3.123). as it is well known, the first order correction term of anharmonicity in an oscillator for each degree of freedom is of the form ∆e = a ( p + 1 2 )2 (5) where a is a constant and p is a non-negative integer (p = 0, 1, 2, 3, . . .). therefore, the calculated energies of levels with high quantum numbers would be away from the experimental values. this is not observed above and, thus, the anharmonicity should be quite low. for example, for n+m+k = 7 of n we obtain that the experimental and calculated values are the same (3.10 gev). in the case of σ, we have the same kind of behavior because for (n+m = 5,k = 1) we also have the same calculated and experimental value for σ (3.17 gev). the assignments of the angular momenta for some few levels are only reasonable attempts. it is the case, for example, of the level 2.0f15 of n which can belong to either the (n + m + k = 3) or to the (n + m + k = 4) levels. we chose the former because 2.0 is closer to 1.86 than to 2.17. for the level 1.99f17 we chose the (n + m + k = 4) level because it appears that the highest value of j 030003-11 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza for the level (n + m + k = 3) is 5. it is a strange feature that the level (n + m + k = 5) only contains ∆′s . having in mind what has been justified above, we chose the 2.35d35 level of ∆ belonging to (n+m+k = 5) as 2.35 is closer to 2.48 than to 2.17. the level 2.60i1,11 of n was assigned as belonging to (n + m + k = 6) because its energy is between 2.55 gev and 2.75 gev. the level 1.74d13 of σ was chosen as belonging to (n + m = 2,k = 0) because (0, 0, 1) already has a d13 level for σ(1.58). we made similar considerations in the choice of the levels 1.83(λ)d05, 1.84(σ)p13 and 1.94(σ)d13. these ambiguities will be settled either with data with smaller widths or with a more improved model. some levels are not described by the simple approximation above. 
it is the case, for instance, of ξ(1530)p13 which is probably a composite of ξ(0, 0, 0) ≡ ξ(1.31) with a pion excitation (that is, it is a hadronic molecule). its decay is actually ξ(1.31)π. in the same way, the state σc(2455) appears to be a composite state of λ+c (2285) and a pion excitation. the same appears to hold for the other known states of λc. as a whole, the model describes quite well the baryonic spectra but it is far from describing the detailed splitting which appears to be quite complex and may depend on the spin-spin interaction. it does not provide a way of calculating the values of k. with the acquisition of more data from other baryons, we may be able to find more patterns and to improve the model. due to the complexity of the problem, we will probably have to go back and forth several times in the improvements of the model as it has been done in the description of the molecular spectrum of molecules. but it is still the only model that describes almost all levels of baryons in a consistent way and is able to predict the energies of levels yet to be found experimentally. acknowledgements i thank the comments of the referee prof. josé muñoz. [1] s gasiorowicz, j l rosner, hadron spectra and quarks, am. j. phys. 49, 954 (1981). [2] n isgur, g karl, p-wave baryons in the quark model, phys. rev. d 18, 4187 (1978). [3] s capstick, n isgur, baryons in a relativized quark model with chromodynamic, phys. rev. d 34, 2809 (1986). [4] r k bhaduri, b k jennings, j c waddington, rotational bands in the baryon spectrum, phys. rev. d 29, 2051 (1984). [5] m v n murthy, m dey, r k bhaduri, rotational bands in the baryon spectrum. ii, phys. rev. d 30, 152 (1984). [6] m v n murthy, m brack, r k bhaduri, b k jennings, the spin-orbit puzzle in the spectra and deformed baryon model, z. phys. c 29, 385 (1985). [7] p stassart, f stancu, j-m richard, l theuβl, on the scalar meson exchange in the baryon spectra, j. phys. g: nucl. partic. 26, 397 (2000). [8] a hosaka, h toki, m takayama, baryon spectra in deformed oscillator quark model, mod. phys. lett. 13, 1699 (1998). [9] m e de souza, calculation of the energy levels and sizes of baryons with a noncentral harmonic potential, arxiv:hep-ph/0209064v1 (2002). [10] m e de souza, the energies of baryons, in: proceedings of the xiv brazilian national meeting of the physics of particles and fields, eds. a j da silva, a suzuki, c dobrigkeit, c z de vasconcelos, c wotzacek, r f ribeiro, s r de oliveira, s a dias, pag. 331, sociedade brasileira de f́ısica, são paulo (1993). [11] k nakamura et al.(particle data group), review of particle physics, j. phys. g: nucl. partic. 37, 075021 (2010). [12] r shankar, principles of quantum mechanics, plenum press, new york (1994). [13] w greiner, j a maruhn, nuclear models, springer, berlin (1996). [14] g herzberg, infrared and raman spectra, van nostrand heinhold company, new york (1945). [15] l. bai-qing and c. kuang-ta, bottomonium spectrum with screeened potential, commun. theor. phys. 52, 653 (2009). 030003-12 papers in physics, vol. 3, art. 030003 (2011) / m. e. de souza [16] h hassanabadi, a a rajabi, determination of the potential coefficients of the baryons and the effect of spin and isospin potential on their energy, in: 11th internation conference on meson-nucleon physics and structure of the nucleon, eds. h machner, s krewald, pag. 128, ikp, forschungzentrum jülich (2007). 030003-13 papers in physics, vol. 1, art. 010005 (2009) received: 9 july 2009, accepted: 6 november 2009 edited by: j. a. 
bertolotto licence: creative commons attribution 3.0 doi: 10.4279/pip.010005 www.papersinphysics.org issn 1852-4249 the effect of the lateral interactions on the critical behavior of long straight rigid rods on two-dimensional lattices p. longone,1 d. h. linares,1 a. j. ramirez-pastor1∗ using monte carlo simulations and finite-size scaling analysis, the critical behavior of attractive rigid rods of length k (k-mers) on square lattices has been studied. an ordered state, with the majority of k-mers being horizontally or vertically aligned, was found. this ordered phase is separated from the disordered state by a continuous transition occurring at a critical density θc, which increases linearly with the magnitude of the lateral interactions. i. introduction the study of systems of hard non-spherical colloidal particles has, for many years, been attracting a great deal of interest and the activity in this field is still growing [1–14]. an early seminal contribution to this subject was made by onsager [1] with his paper on the isotropic-nematic (i-n) phase transition in liquid crystals. the onsager’s theory predicted that very long and thin rods interacting with only excluded volume interaction can lead to long-range orientational (nematic) order. thus, at low densities, the molecules are typically far from each other and the resulting state is an isotropic gas. however, at large densities, it is more favorable for the molecules to align spontaneously (there are many more ways of placing nearly aligned rods than randomly oriented ones), and a nematic phase is present at equilibrium. interestingly, a number of papers have appeared recently, in which the i-n transition was studied in two dimensions [10–14]. in ref. [10], the au∗e-mail: antorami@unsl.edu.ar 1 departamento de f́ısica, instituto de f́ısica aplicada, universidad nacional de san luis-conicet, chacabuco 917, d5700bws san luis, argentina. thors gathered strong numerical evidence to suggest that a system of square geometry, with two allowed orientations, shows nematic order at intermediate densities for k ≥ 7 and provided a qualitative description of a second phase transition (from a nematic order to a non-nematic state) occurring at a density close to 1. however, the authors were not able to determine the critical quantities (critical point and critical exponents) characterizing the i-n phase transition occurring in the system. this problem was resolved in refs. [11, 12], where an accurate determination of the critical exponents, along with the behavior of binder cumulants, showed that the transition from the low-density disordered phase to the intermediate-density ordered phase belongs to the 2d ising universality class for square lattices and the three-state potts universality class for honeycomb and triangular lattices. later, the i-n phase transition was analyzed by combining monte carlo (mc) simulations and theoretical analysis [13, 14]. the study in refs. [13, 14] allowed (1) to obtain θc as a function of k for square, triangular and honeycomb lattices, being θc(k) ∝ k−1 (this dependence was already noted in ref. [10]); and (2) to determine the minimum value of k (kmin), which allows the formation of a nematic phase on triangular (kmin = 7) and honeycomb (kmin = 11) lattices. 010005-1 papers in physics, vol. 1, art. 010005 (2009) / p. longone et al. in a recent paper, fischer and vink [15] indicated that the transition studied in refs. [10–14] corresponds to a liquid-gas transition, rather than i-n. 
this interpretation is consistent with the 2dising critical behavior observed for monodisperse rigid rods on square lattices [11]. this point will be discussed in more detail in sec. iii. in contrast to the systems studied in refs. [10– 14], many rod-like biological polymers are formed by monomers reversibly self-assembling into chains of arbitrary length. consequently, these systems exhibit a broad equilibrium distribution of filament lengths. a model of self-assembled rigid rods has been recently considered by tavares et al. [16]. the authors focused on a system composed of monomers with two attractive (sticky) poles that polymerize reversibly into polydisperse chains and, at the same time, undergo a continuous i-n phase transition. the obtained results revealed that nematic ordering enhances bonding. in addition, the average rod length was described quantitatively in both phases, while the location of the ordering transition, which was found to be continuous, was predicted semiquantitatively by the theory. beyond the differences between lattice geometry and the characteristics of the rods (self-assembled or not), one fundamental feature is preserved in all the studies mentioned above. this is the assumption that only excluded volume interactions between the rods are considered (except in ref. [16], where monomers with two attractive bonding sites polymerize into polydisperse rods). moreover, one often encounters phrases in the literature, such as “this theory [onsager’s theory] shows that repulsive interactions [excluding volume interactions] alone can lead to long-range orientational nematic order, disproving the notion that attractive interactions are a prerequisite” [17], which could be ambiguous with respect to the role that attractive lateral interactions between the rods should play in reinforcing (or not) the nematic order. in this context, it is of interest and of value to inquire how the existence of lateral interactions between the rods influences the phase transition occurring in the system. the objective of this paper is to provide a thorough analysis in this direction. for this purpose, an exhaustive study of the phase transition occurring in a system of attractive rigid rods deposited on square lattices was performed. the results revealed that (i) the orientational order survives in the presence of attractive lateral interactions; (ii) the critical density shifts to higher values as the magnitude of the lateral interactions is increased; and (iii) the continuous transition becomes first order for interaction strength w > wc (in absolute values). the outline of the paper is as follows. in sec. ii we describe the lattice-gas model and the simulation scheme. in sec. iii we present the mc results. finally, the general conclusions are given in sec. iv. ii. lattice-gas model and monte carlo simulation scheme we address the general case of adsorbates assumed to be linear rigid particles containing k identical units (k-mers), with each one occupying a lattice site. small adsorbates would correspond to the monomer limit (k = 1). the distance between kmer units is assumed to be equal to the lattice constant; hence exactly k sites are occupied by a k-mer when adsorbed (see fig. 1). the surface is represented as an array of m = l × l adsorptive sites in a square lattice arrangement, where l denotes the linear size of the array. 
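as an illustration of this lattice representation, the sketch below (illustrative only; the variable names, the random-deposition loop and the seed are not taken from the paper) places linear k-mers on an l × l square array with periodic boundaries and enforces the hard-core exclusion, i.e., a k-mer is deposited only onto k consecutive empty sites, either horizontally or vertically. the grand canonical monte carlo scheme actually used in the study, with metropolis acceptance and diffusional relaxation, is described below.

```python
import numpy as np

def try_deposit_kmer(lattice, row, col, k, horizontal):
    """attempt to deposit a linear k-mer starting at (row, col) on an l x l
    square lattice with periodic boundaries. sites hold 0 (empty) or 1
    (occupied); the k-mer is placed only if all k sites are empty."""
    L = lattice.shape[0]
    sites = [(row, (col + i) % L) if horizontal else ((row + i) % L, col)
             for i in range(k)]
    if any(lattice[r, c] for r, c in sites):
        return False                      # excluded volume: overlaps not allowed
    for r, c in sites:
        lattice[r, c] = 1
    return True

# random sequential deposition of 10-mers on a 100 x 100 lattice, for illustration
L, k = 100, 10
lattice = np.zeros((L, L), dtype=int)
rng = np.random.default_rng(0)
for _ in range(20000):
    try_deposit_kmer(lattice, rng.integers(L), rng.integers(L), k,
                     horizontal=bool(rng.integers(2)))
theta = lattice.sum() / lattice.size      # surface coverage
```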
in order to describe the system of n k-mers adsorbed on m sites at a given temperature t , let us introduce the occupation variable ci which can take the values ci = 0 if the corresponding site is empty and ci = 1 if the site is occupied. on the other hand, molecules adsorb or desorb as one unit, neglecting any possible dissociation. under these considerations, the hamiltonian of the system is given by h = w ∑ 〈i,j〉 cicj − n (k − 1)w + �o ∑ i ci (1) where w is the nearest-neighbor (nn) interaction constant which is assumed to be attractive (negative), 〈i, j〉 represents pairs of nn sites and �o is the energy of adsorption of one given surface site. the term n (k − 1)w is subtracted in eq. (1) since the summation over all the pairs of nn sites overestimates the total energy by including n (k − 1) bonds belonging to the n adsorbed k-mers. because the surface was assumed to be homogeneous, the interaction energy between the adsorbed dimer and the atoms of the substrate �o was neglected for the sake of simplicity. 010005-2 papers in physics, vol. 1, art. 010005 (2009) / p. longone et al. figure 1: linear tetramers adsorbed on square lattices. full and empty circles represent tetramer units and empty sites, respectively. in order to characterize the phase transition, we use the order parameter defined in ref. [11], which in this case can be written as δ = |nh − nv| nh + nv (2) where nh(nv) is the number of rods aligned along the horizontal (vertical) direction. when the system is disordered (θ < θc), all orientations are equivalents and δ is zero. as the density is increased above θc, the k-mers align along one direction and δ is different from zero. thus, δ appears as a proper order parameter to elucidate the phase transition. the problem has been studied by grand canonical mc simulations using a typical adsorptiondesorption algorithm. the procedure is as follows. once the value of the chemical potential µ is set, a linear k-uple of nearest-neighbor sites is chosen at random and an attempt is made to change its occupancy state with probability w = min {1, exp (−∆h/kbt )}, where ∆h = hf −hi is the difference between the hamiltonians of the final and initial states and kb is the boltzmann constant. in addition, displacement (diffusional relaxation) of adparticles to nearest-neighbor positions, by either jumps along the k-mer axis or reptation by rotation around the k-mer end, must be allowed in order to reach equilibrium in a reasonable time. a mc step figure 2: adsorption isotherms (coverage versus chemical potential) for k = 10, l = 100 different w/kbt ’s as indicated. inset: adsorption phase diagram of attractive 10-mers on square lattices. (mcs) is achieved when m k-uples of sites have been tested to change its occupancy state. typically, the equilibrium state can be well reproduced after discarding the first r′ = 107 mcs. then, the next r = 2×107 mcs are used to compute averages. in our mc simulations, we varied the chemical potential µ and monitored the density θ and the order parameter δ, which can be calculated as simple averages. the reduced fourth-order cumulant ul introduced by binder [18] was calculated as: ul = 1 − 〈δ4〉 3〈δ2〉2 , (3) where 〈· · ·〉 means the average over the mc simulation runs. all calculations were carried out using the baco parallel cluster (composed by 60 pcs each with a 3.0 ghz pentium-4 processor and 90 pcs each with a 2.4 ghz core 2 quad processor) located at instituto de f́ısica aplicada, universidad nacional de san luis-conicet, san luis, argentina. iii. 
results the calculations were developed for linear 10-mers (k = 10). with this value of k and for non010005-3 papers in physics, vol. 1, art. 010005 (2009) / p. longone et al. interacting rods, it is expected the existence of a nematic phase at intermediate densities [10]. the surface was represented as an array of adsorptive sites in a square lattice arrangement with conventional periodic boundary conditions. the effect of finite size was investigated by examining lattices with l = 50, 100, 150, 200. in order to understand the basic phenomenology, we consider, in the first place, the behavior of the adsorption isotherms in presence of attractive lateral interactions between the k-mers. fig. 2 shows typical adsorption isotherms (coverage versus µ/kbt ) for linear 10-mers with different values of the lateral interaction (the solid circles represent the langmuir case, w/kbt = 0). the isotherms shift to lower values of chemical potential, and their slopes increase as the ratio w/kbt increases (in absolute value). for interaction strength above the critical value (w > wc, in absolute values) the system undergoes a firstorder phase transition, which is observed in the clear discontinuity in the adsorption isotherms1. in the case studied, this critical value is approximately wc/kbt ≈ −0.80 (or kbtc/w ≈ −1.25). the behavior of the adsorption isotherms also allows us to calculate the phase diagram of the adsorbed monolayer in “temperature-coverage” coordinates. in fact, once obtained the real value of the chemical potential (or critical chemical potential µc) in the two-phase region, the corresponding critical densities can be easily calculated. by repeating this procedure for different temperatures ranging between 0 and tc, the coexistence curve can be built [20]. a typical phase diagram, obtained in this case for attractive 10-mers, is shown in the inset of fig. 2. on the basis of the study in fig. 2, our next objective is to obtain evidence for the existence of nematic order in the range −0.80 ≤ w/kbt < 0 of attractive interactions. for this purpose, the behavior of the order parameter δ as a function of coverage was analyzed for k = 10, l = 100 and dif1in this situation, which has been observed experimentally in numerous systems, the only phase which one expects is a lattice-gas phase at low coverage, separated by a two-phase coexistence region from a “lattice-fluid” phase at higher coverage. this condensation of a two-dimensional gas to a two-dimensional liquid is similar to that of a lattice-gas of attractive monomers. however, the symmetry particlevacancy (valid for monoatomic particles) is broken for kmers and the isotherms are asymmetric with respect to θ = 0.5. figure 3: surface coverage dependence of the nematic order parameter for k = 10, l = 100 different w/kbt ’s as indicated. ferent values of the lateral interaction. the results are shown in fig. 3, revealing that (i) the orientational order survives in the presence of attractive lateral interactions and (ii) the critical density shifts to higher values as the magnitude of the lateral interactions is increased. in order to corroborate the results obtained in the last figure, we now study the dependence of θc on w/kbt . in the case of the standard theory of fss [18, 19], when the phase transition is temperature driven, the technique allows for various efficient routes to estimate tc from mc data. 
one of these methods, which will be used in this case, is from the temperature dependence of ul(t ), which is independent of the system size for t = tc. in other words, tc is found from the intersection of the curve ul(t ) for different values of l, since ul(tc) =const. in our study, we modified the conventional fss analysis by replacing temperature by density [11]. under this condition, the critical density has been estimated from the plots of the reduced four-order cumulants ul(θ) plotted versus θ for several lattice sizes. as an example, fig. 4 shows the results for w/kbt = −0.125. in this case, the value obtained was θc = 0.542(2). in the inset, the data are plotted over a wider range of temperatures, exhibiting the typical behavior of the cumulants in the presence of a continuous phase 010005-4 papers in physics, vol. 1, art. 010005 (2009) / p. longone et al. figure 4: curves of ul(θ) vs θ for k = 10, w/kbt = −0.125 and square lattices of different sizes. from their intersections one obtained θc. in the inset, the data are plotted over a wider range of densities. transition. the procedure of fig. 4 was repeated for −0.80 ≤ w/kbt < 0, showing that the values of θc increase linearly with the magnitude of the lateral couplings (see solid squares in fig. 5). the critical line (dotted line in the figure) was obtained from the linear fit of the numerical data. as it is possible to observe, the range of coverage at which the transition occurs diminishes as w/kbt is increased (in absolute value). this finding indicates that the presence of attractive lateral interactions between the rods does not favor the formation of nematic order in the adlayer. the phenomenon can be understood from the behavior of the second virial coefficient, which will initially decrease on introducing attractive w. this decrease implies that the isotherms shift to lower values of chemical potential, and consequently, the critical point shifts to higher densities. we did not assume any particular universality class for the transitions analyzed here in order to calculate their critical densities, since the analysis relied on the order parameter cumulant’s properties. however, the fixed value of the cumulants, u ∗ = 0.617(9), is consistent with the extremely precise transfer matrix calculation of u ∗ = figure 5: temperature-coverage phase diagram corresponding to attractive k-mers with k = 10. the inset in the upper-left (lower-right) corner shows a typical configuration in the nematic (isotropic) phase. 0.6106901(5) [21] for the 2d ising model. this finding may be taken as an indication that the phase transition belongs to the 2d ising universality class. with respect to the behavior of the system for w/kbt < −0.80, the adsorbed layer “jumps” from a low-coverage phase to a high-coverage phase. this effect, which has been discussed in fig. 2, is represented in fig. 5 by the dashed coexistence line. the low-coverage phase is an isotropic state, similar to that observed for w/kbt > −0.80 and low density (see inset in the lower-right corner of fig. 5). on the other hand, the high-coverage phase is also an isotropic state, but characterized by the presence of local orientational order (domains of parallel k-mers). a typical configuration in this regime is shown in fig. 6). finally, it is worth pointing out that: (1) the behavior of the order parameter in fig. 
3 clearly indicates that the transition from the low-density disordered phase to the intermediate-density ordered phase is an isotropic to nematic phase transition (when all the words have the usual meaning). in this case, the transition under study belongs to the 2d ising universality class. it can also be thought of as an unmixing or liquid-gas transition [15]. for this reason we have called gas and liquid to the 010005-5 papers in physics, vol. 1, art. 010005 (2009) / p. longone et al. figure 6: typical configuration of the adlayer in the high-coverage phase and w/kbt < −0.80. phases reported in fig. 5; and (2) even though it has not been rigorously proved yet, a second phase transition for non-interacting rods at high densities has been theoretically predicted [10] and numerically confirmed [13]. this result has not been confirmed for the case of attractive rods. an exhaustive study on this subject will be the object of future work. iv. conclusions we have addressed the critical properties of attractive rigid rods on square lattices with two allowed orientations, and shown the dependence of the critical density on the magnitude of the lateral interactions w/kbt . the results were obtained by using mc simulations and fss theory. several conclusions can be drawn from the present work. on the one hand, we found that even though the presence of attractive lateral interactions between the rods does not favor the formation of nematic order in the adlayer, the orientational order survives in a range that goes from w/kbt = 0 up to wc/kbt ≈ −0.80 (wc/kbt represents the critical value at which occurs a typical transition of condensation in the adlayer). in this region of w/kbt , the critical density increases linearly with the magnitude of the lateral couplings. on the other hand, the evaluation of the fixed point value of the cumulants u ∗ = 0.617(9) indicates that, as in the case of non-interacting rods, the observed phase transition belongs to the universality class of the two-dimensional ising model. with respect to the behavior of the system for w/kbt < −0.80, the continuous transition becomes first order. thus, the adsorbed layer jumps from a low-coverage phase, similar to that observed for w/kbt > −0.80 and low density, to an isotropic phase at high coverage, characterized by the presence of local orientational order (domains of parallel k-mers) future efforts will be directed to (1) extend the study to repulsive lateral interaction between the k-mers; (2) obtain the whole phase diagram in the space (temperature-coverage-rod’s size); (3) develop an exhaustive study on critical exponents and universality and (4) characterize the second phase transition from a nematic order to a non-nematic state occurring at high density. acknowledgements this work was supported in part by conicet (argentina) under project number pip 112-200801-01332; universidad nacional de san luis (argentina) under project 322000 and the national agency of scientific and technological promotion (argentina) under project 33328 pict 2005. [1] l onsager, the effects of shape on the interaction of colloidal particles, ann. n. y. acad. sci. 51, 627 (1949). [2] p j flory, thermodynamics of high polymer solutions, j. chem. phys. 10, 51 (1942); p j flory, principles of polymers chemistry, cornell university press, ithaca, ny (1953). [3] m l huggins, some properties of solutions of long-chain compounds, j. phys. chem. 46, 151 (1942); m l huggins, thermodynamic properties of solutions of long-chain compounds, ann. n. y. acad. 
sci. 43, 1 (1942); m l huggins, theory of solutions of high polymers, j. am. chem. soc. 64, 1712 (1942). [4] j p straley, liquid crystals in two dimensions, phys. rev. a 4, 675 (1971). 010005-6 papers in physics, vol. 1, art. 010005 (2009) / p. longone et al. [5] j vieillard-baron, phase transitions of the classical hard-ellipse system, j. chem. phys. 56, 4729 (1972). [6] d frenkel, r eppenga, evidence for algebraic orientational order in a two-dimensional hardcore nematic, phys. rev. a 31, 1776 (1985). [7] k j strandburg, two-dimensional melting, rev. mod. phys. 60, 161 (1988). [8] a j phares, f j wunderlich, thermodynamics and molecular freedom of dimers on plane triangular lattices, j. math. phys. 27, 1099 (1986). [9] a j phares, f j wunderlich, j d curley, d w grumbine jr, structural ordering of interacting dimers on a square lattice, j. phys. a: math. gen. 26, 6847 (1993). [10] a ghosh, d dhar, on the orientational ordering of long rods on a lattice, eur. phys. lett. 78, 20003 (2007). [11] d a matoz-fernandez, d h linares, a j ramirez-pastor, determination of the critical exponents for the isotropic-nematic phase transition in a system of long rods on twodimensional lattices: universality of the transition, europhys. lett. 82, 50007 (2008). [12] d a matoz-fernandez, d h linares, a j ramirez-pastor, critical behavior of long linear k-mers on honeycomb lattices, physica a 387, 6513 (2008). [13] d h linares, f romá, a j ramirez-pastor, entropy-driven phase transition in a system of long rods on a square lattice, j. stat. mech. p03013 (2008). [14] d a matoz-fernandez, d h linares, a j ramirez-pastor, critical behavior of long straight rigid rods on two-dimensional lattices: theory and monte carlo simulations, j. chem. phys. 128, 214902 (2008). [15] t fischer, r l c vink, restricted orientation “liquid crystal” in two dimensions: isotropicnematic transition or liquid-gas one(?), europhys. lett. 85, 56003 (2009). [16] j m tavares, b holder, m m telo da gama, structure and phase diagram of self-assembled rigid rods: equilibrium polydispersity and nematic ordering in two dimensions, phys. rev. e 79, 021505 (2009). [17] h h wensink, columnar versus smectic order in systems of charged colloidal rods, j. chem. phys. 126, 194901 (2007). [18] k binder, applications of the monte carlo method in statistical physics. topics in current physics, springer, berlin (1984). [19] v privman, finite size scaling and numerical simulation of statistical systems, world scientific, singapore (1990). [20] t l hill, an introduction to statistical thermodynamics, addison wesley publishing company, reading, ma (1960). [21] g kamieniarz, h w j blöte, universal ratio of magnetization moments in two-dimensional ising models, j. phys. a: math. gen. 26, 201 (1993). 010005-7 papers in physics, vol. 13, art. 130003 (2021) received: 17 february 2021, accepted: 17 april 2021 edited by: o. fojón reviewed by: d. k. nguyen, thu dau mot university, vietnam p. f. weck, university of nevada, las vegas, usa licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.130003 www.papersinphysics.org issn 1852-4249 an ab initio study of small gas molecule adsorption on the edge of n-doped sawtooth penta-graphene nanoribbons nguyen thanh tien1*, le vo phuong thuan1, tran yen mi1 adsorption of the toxic gas molecules carbon monoxide (co), carbon dioxide (co2) and ammonia (nh3) on the edge of n-doped sawtooth penta-graphene nanoribbons (n:sspgnrs) was studied using first-principles methods. 
basing our study on density functional theory (dft), we investigated adsorption configurations, adsorption energy, charge transfer, and the electronic properties of co-, co2and nh3adsorbed onto n:sspgnrs. we found that co and co2 are chemisorbed on the edge of n:sspgnr, while nh3 is physisorbed. current-voltage (i–v) characteristics were also investigated using the non-equilibrium green’s function (negf) approach. gas molecules can modify the current of a device based on n:sspgnrs. the results indicate the potential of using n:sspgnrs for detection of these toxic gas molecules. i introduction detecting gas molecules using semiconductor gas sensors is important for agriculture, chemical controls, environmental monitoring, and medical applications [1, 2]. low-dimensional systems, especially material systems based on graphene, have for many years demonstrated outstanding developments in sensors and transistor applications [3–8]. however, since the band gap of graphene is almost equal to 0, it has not been fully exploited in the semiconductor industry [9]. the immense success of graphene [10–12] was followed by in-depth studies and encouraging efforts to find other twodimensional (2d) nanostructures, such as silicene [13], phosphorene [14] single-layer graphitic zinc oxide [15], h-boron nitride [16], and transition-metal dichalcogenides [17]. penta-graphene (pg), a novel wide band gap carbon allotrope, and pg-like mate*nttien@ctu.edu.vn 1 college of natural sciences, can tho university, 3-2 road, can tho city 94115, vietnam rials were discovered in early 2015 [18–20]. it was found that pg is ultra-strong, and can sustain a temperature of 1000 k with grain boundaries. it displays a quasi-direct intrinsic band gap of 3.2 ev [18, 21]. additionally, using the g0w0 approximation, the pg band gap was calculated as 4.2 ev [22]. in contrast to hydrogenated graphene, hydrogenation of pg leads to a notable increase (76% rise) in thermal conductivity instead of the 63% reduction expected due to heat dispersal in device operation [23]. furthermore, pg is noteworthy due to its unique mechanical properties and anisotropic mechanical behavior [24]. unlike graphene, pg has a buckle structure, which allows it to adsorb gas molecules in rich configurations. thus, pg is considered an excellent base for the development of gas sensors which can detect harmful gases such as co, co2 and nh3 [25–27]. cutting 2d-pg sheets in various crystallography directions obtains more different penta-graphene nanoribbons (pgnrs). z. q. fan et al. found that the sawtooth-sawtooth pgnr (sspgnr) is a more stable structure than the other three pgnr struc130003-1 papers in physics, vol. 13, art. 130003 (2021) / n. t. tien et al. figure 1: schematic of possible n:sspgnr adsorbent sites for gas molecules, consisting of a) c-hn:sspgnr, b) h-n-n:sspgnr, c) c-n-n:sspgnr. tures [28]. pgnr receives major attention in semiconductor material science since its energy band gap can be controlled effectively in many ways, such as by applying an electric field and bending [29], doping [30, 31], and edge termination [32, 33]. in our previous studies [31] we discovered that the current intensity of n:sspgnr increases to about 108 times that of pure sspgnr. this is convenient for determining current strength in electronic devices, including sensors. furthermore, we studied the adsorption feature of the gas molecules (co, co2 and nh3) on the sspgnr surface [34]. 
the results confirm that sspgnr is sensitive to co and nh3 molecules, but less sensitive to the co2 molecule [26, 27]. however, it is important to study the absorption configurations at the edges of nanoribbons [8, 35]. sspgnr can provide prior adsorption sites at its edges, which can serve as ideal gas sensor materials. nitrogen doping at the carbon edge of pgnrs allows prior adsorption at this edge without hydrogen passivation. the carbon at the edge without hydogen passivation presents the dangling effect that favorably adsorbs co, co2 and nh3. in this study, using dft and negf methods we investigate theoretically the electronic and transport properties of n:sspgnr when gas molecules (co, co2 and nh3) are adsorbed on their edge. the paper is organized as follows: the subject and research objectives are presented in the introduction section; in section ii the computational methods are discussed; section iii contains the results and discussion of the electronic and transport properties of n:sspgnr with the adsorbed figure 2: adsorption configurations of c-hn:sspgnr consist of: a) c-h-n:sspgnr, b) isolated gas molecules and c) configurations of c-h-n:sspgnr after adsorption gas molecules. gas molecules on the edge; in section iv the conclusions are presented. ii computational methods the electronic and transport properties of n:sspgnrs which adsorb gas molecules were explored by first-principles calculations based on dft and negf, using the atomistix toolkit (atk) software package (version 2017.1) [36, 37]. the width of the studied structure was six sawtooth chains. a 10 å vacuum space was introduced along non periodic (i.e., x and y) directions to ensure the isolation of n:sspgnrs from their periodic replicas. the samples were optimized using dft calculations within the generalized gradient approximation (gga) of perdew burke ernzerhof (pbe) [40], with the following similar conditions: 1 x 1 x 9 kpoint sampling, a cut-off energy of 790 ev and electron temperature of 300 k. considering the electric polarization effect, a double-zeta-polarized basis set was used to expand the electron wave function. the self-consistent field tolerance was set as 10−6 ha. furthermore, the ground state configuration was obtained at the convergence precision of energy for the maximum energy change, the maximum force, the maximum stress and the maximum displacement of 2.10−5 ev/atom, 0.05 ha/å, 0.05 ev å−3 and 0.005 å, respectively. 130003-2 papers in physics, vol. 13, art. 130003 (2021) / n. t. tien et al. 
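as an aside, the four relaxation criteria quoted above are easy to encode when post-processing optimization logs; the helper below is a generic sketch written for this text (it is not part of the atk workflow used by the authors), and the force threshold is kept as a plain parameter because the unit printed above ("0.05 ha/å") may be a misprint of ev/å.

```python
# illustrative only (not ATK code): check one ionic step of a geometry
# optimization against the four thresholds quoted in the text. the force
# threshold is left generic because the printed unit may be a misprint.
from dataclasses import dataclass

@dataclass
class Criteria:
    d_energy_per_atom: float = 2e-5   # eV/atom
    max_force: float = 0.05           # force threshold (unit as quoted)
    max_stress: float = 0.05          # eV/Å^3
    max_displacement: float = 0.005   # Å

def is_converged(d_e_per_atom: float, f_max: float, s_max: float,
                 dx_max: float, c: Criteria = Criteria()) -> bool:
    """True only if all four criteria are met simultaneously."""
    return (abs(d_e_per_atom) <= c.d_energy_per_atom and f_max <= c.max_force
            and s_max <= c.max_stress and dx_max <= c.max_displacement)

print(is_converged(1.5e-5, 0.03, 0.02, 0.004))   # True for this hypothetical step
print(is_converged(1.5e-5, 0.08, 0.02, 0.004))   # False: force above threshold
```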
table 1: adsorption orientations and corresponding numbers for the adsorption structures of co, co2 and nh3 on the edge of n:sspgnr gas configurations adsorption orientations notation of n:sspgnr co c-h-n:sspgnr vertical to c and o is downward coch1 vertical to c and o is upward coch2 h-n-n:sspgnr vertical to n and o is downward cohn1 vertical to nintro and o is upward cohn2 c-n-n:sspgnr vertical to c and o is downward cocn1 vertical to c and o is upward cocn2 vertical to n and o is downward cocn3 vertical to n and o is upward cocn4 co2 c-h-n:sspgnr vertical to c co2ch1 horizontal to c and co2 perpendicular to pg co2ch2 horizontal to c and co2 parallel to pg co2ch3 h-n-n:sspgnr vertical to n co2hn1 horizontal to n and co2 perpendicular to pg co2hn2 horizontal to n and co2 parallel to pg co2hn3 c-n-n:sspgnr vertical to c co2cn1 horizontal to c and co2 perpendicular to pg co2cn2 horizontal to c and co2 parallel to pg co2cn3 vertical to n co2cn4 horizontal to n and co2 perpendicular to pg co2cn5 horizontal to n and co2 parallel to pg co2cn6 nh3 c-h-n:sspgnr gas molecule is downward nh3ch1 gas molecule is upward nh3ch2 h-n-n:sspgnr gas molecule is downward nh3hn1 gas molecule is upward nh3hn2 c-n-n:sspgnr gas molecule is downward nh3cn1 gas molecule is upward nh3cn2 gas molecule is downward nh3cn3 gas molecule is upward nh3cn4 iii results and discussion i structure stability to gauge the capacity of n:sspgnrs to detect gas molecules (co, co2 and nh3), the adsorption of gas molecules was investigated on their edges, as the edge is the most reactive site on the ribbon due to the presence of dangling bonds. figure 1 depicts three possible adsorbent sites on the edge of an n:sspgnr, including: a) removing a passive h atom at the top of the c atom (ch-n:sspgnr), b) removing a passive h atom at the top of the n atom (h-n-n:sspgnr), and c) removing both passive h atoms on the edge of the ribbon (c-n-n:sspgnr). to explore the preferred configuration, we had to identify the model which the guest molecules would approach uninhibitedly. we in turn determined the most appropriate configuration for each gas molecule: co, co2 and nh3. we first considered possible adsorption configurations of the co molecule on the edge of an n:sspgnr; co can be adsorbed vertically on the edge with the o atom either upward or downward. therefore, there are 8 possible adsorption configurations with the co molecule. similarly, there are 12 possible adsorption configurations for the co2 molecule and 8 for the nh3 molecule. these configurations are listed in table 1. to determine the preferred configuration, we calculated the adsorption energy (ead) of all the configurations considered, as follows [38, 39]: ead = etotal − eribbon − egas, (1) where etotal, eribbon and egas are the total energies of a considered configuration after gas molecule adsorption, removing a passive h atom nanoribbon and isolated gas molecules, respectively. as per the definition adopted here, negative adsorp130003-3 papers in physics, vol. 13, art. 130003 (2021) / n. t. tien et al. tion energy shows that the process is exothermic in nature while the magnitude signifies thermodynamic stability. computed results indicated that coch2, co2ch2 and nh3ch2 (fig. 2) were the preferred configurations of co, co2 and nh3 on n:sspgnrs, respectively. the adsorption energy of these samples decreased gradually from -0.26 to -2.96 ev in the following order: ead(nh3) > ead(co2) > ead(co). it is obvious that co adsorption is the most stable. 
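as a minimal numerical illustration of eq. (1), the sketch below evaluates ead = etotal − eribbon − egas for a set of hypothetical total energies (placeholders invented for this illustration, not values from the paper) and picks the most negative, i.e. the most stable, configuration:

```python
# minimal sketch of eq. (1): E_ad = E_total - E_ribbon - E_gas.
# all energies below are hypothetical placeholders, not data from the paper.
def adsorption_energy(e_total: float, e_ribbon: float, e_gas: float) -> float:
    return e_total - e_ribbon - e_gas

e_ribbon = -1250.00                                              # H-removed ribbon (eV)
e_gas    = {"CO": -14.80, "CO2": -22.90, "NH3": -19.50}          # isolated molecules (eV)
e_total  = {"CO": -1266.80, "CO2": -1274.20, "NH3": -1269.80}    # adsorbed systems (eV)

e_ad = {g: adsorption_energy(e_total[g], e_ribbon, e_gas[g]) for g in e_gas}
most_stable = min(e_ad, key=e_ad.get)   # most negative E_ad = most exothermic
print(e_ad)                             # roughly {'CO': -2.0, 'CO2': -1.3, 'NH3': -0.3}
print("preferred:", most_stable)        # CO in this made-up example
```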
in all three adsorption cases, the n:sspgnr edge atom closest to gas molecules is a c atom rather than an n atom. this suggests that the most effective n:sspgnr configuration for adsorption of gas molecules is the c-h-n:sspgnr configuration. the adsorption distances of co, co2, and nh3 are 1.291 å, 1.487 å, and 3.317 å, respectively, as shown in table 2. therefore, coand co2adsorbed n:sspgnrs are more capable of chemical adsorption than physical adsorption, while an nh3adsorbed n:sspgnr is more likely to adsorb physically [41]. ii parameters of structures table 3 presents the parameters of three adsorption structures after relaxation. it can be clearly seen that the bond lengths of the gas molecules vary slightly. namely, after the adsorption, the bond lengths of two adsorbed gas molecules (co, co2) are longer than those of isolated gas molecules. the opposite is true of nh3 gas molecules. in addition, the bond angle of the co2 molecule is reduced from 180◦ to 122◦. the bond angles of the nh3 gas molecules change negligibly. in particular, the bond length of the co adsorption sample changes the most. this is also one reason the adsorption energy in this sample is the largest. on the other hand, the bond lengths (d1, d2, d3) and table 2: adsorption energy (ead), adsorption distance (h), band gap (eg) and charge transfer (q) from gas molecules to c-h-n:sspgnr. samples ead h eg q (ev) (å) (ev) (e) coch2 1.96 1.291 2.41 0.14 co2ch2 1.31 1.487 2.21 0.47 nh3ch2 -0.26 3.317 1.60 0.03 figure 3: density of states of systems: a) c-h-n:sspgnr(co) and c-h-n:sspgnr; b) ch-n:sspgnr(co2) and c-h-n:sspgnr; c) c-hn:sspgnr(nh3) and c-h-n:sspgnr. the bond angles (β1, β2, β3) close to the edge of c-h-n:sspgnr substrate were also changed after the gas molecule adsorption. the changes in the structural parameters due to this interaction is the basis of the change in electronic properties, which will be analyzed in the next section. iii electronic properties the electronic properties of c-h-n:sspgnr were studied to understand its capacity to detect co, co2 and nh3. we first investigated the density of states (dos) of co, co2 and nh3-adsorbed c-hn:sspgnr, as shown in fig. 3. we also see that after the adsorption of gas molecules, the dos of systems changed, and the band gaps of all three samples expanded. specifically, the band gap of co-adsorbed c-h-n:sspgnr showed the largest increase; its band gap increased to 2.41 ev while the band gap of c-h-n:sspgnr was only 1.44 ev. on the other hand, the sample with the smallest band gap change was the nh3-adsorbed c-h-n:sspgnr sample (table 2). the trend seen in the calculated charge transfer (q) in table 2 can be understood as capacity for relative electron donation or electron withdrawal of the adsorbed molecules. the positive q values mean that charge was transferred from the adsorbed molecules to the c-h-n:sspgnr in all three cases, the charge transferred from the co2 to the c-h-n:sspgnr being the largest (q = 0.47e). 130003-4 papers in physics, vol. 13, art. 130003 (2021) / n. t. tien et al. table 3: bond lengths of gas molecules before and after adsorption (l1 to l6), bond angles of gas molecules before and after adsorption (α1 to α4), bond lengths at edge of c-h-n:sspgnr before and after gas molecule adsorption (d1 to d4), and bond angles of c-h-n:sspgnr before and after adsorption for gas molecules ( β1 to β4 ). 
samples | c-h-n:sspgnr | co    | co2   | nh3   | coch2 | co2ch2 | nh3ch2
l1 (å)  |              | 1.128 |       |       | 1.184 |        |
l2 (å)  |              |       | 1.128 |       |       | 1.213  |
l3 (å)  |              |       | 1.128 |       |       | 1.213  |
l4 (å)  |              |       |       | 1.163 |       |        | 1.022
l5 (å)  |              |       |       | 1.163 |       |        | 1.026
l6 (å)  |              |       |       | 1.163 |       |        | 1.029
α1 (◦)  |              |       | 180   |       |       | 122    |
α2 (◦)  |              |       |       | 107.8 |       |        | 106.0
α3 (◦)  |              |       |       | 107.8 |       |        | 106.1
α4 (◦)  |              |       |       | 107.8 |       |        | 106.5
d1 (å)  | 1.338        |       |       |       | 1.339 | 1.366  | 1.338
d2 (å)  | 1.540        |       |       |       | 1.397 | 1.425  | 1.536
d3 (å)  | 1.508        |       |       |       | 1.414 | 1.409  | 1.507
β1 (◦)  | 122.5        |       |       |       | 125.4 | 132.6  | 123.1
β2 (◦)  | 108.5        |       |       |       | 102.9 | 108.6  | 108.8
β3 (◦)  | 102.9        |       |       |       | 108.8 | 106.3  | 102.8
β4 (◦)  | 120.4        |       |       |       | 131.2 | 111.4  | 120.2

to better understand the cause of the changes in band gaps after adsorption, we analyzed the contribution of the gas molecules by drawing the total density of states (tdos) and the local density of states (ldos), shown in fig. 4. we also see that the main contribution to the changes in dos is not due to the gas molecules but to changes in the substrate. in particular, the co molecule has the most influence on the dos and the nh3 molecule has the least. in all cases, the major contribution to the changes in band gap width comes from the p orbitals of the atoms. on the other hand, as can be seen from fig. 3 and fig. 4, in all three cases there is an overlap between the dos lines, which confirms the connection between the gas molecules and the substrate. however, in the case of co and co2 adsorption, there is an overlap and interlacing between tdos and ldos, so we can confirm that there is chemical bond formation; the adsorption in these two cases is chemical. in contrast, in the case of nh3 there is only an overlap between tdos and ldos, without interlacing; nh3 can only physically adsorb on the edge of n:sspgnr.

figure 5 shows the electron density difference (edd) for all three adsorbed samples. the edd was calculated using the following formula:

∆ρ = ρ(total) − ρ(ribbon) − ρ(gas). (2)

here, ρ(total) and ρ(ribbon) represent the total electron densities of the n:sspgnr with and without adsorbed gas molecules, respectively, and ρ(gas) is the electron density of the isolated gas molecules.

figure 4: tdos of systems: a) c-h-n:sspgnr(co); b) c-h-n:sspgnr(co2); c) c-h-n:sspgnr(nh3) and ldos of the gas molecules (the filled area under the dos curve).

figure 5: the electron density difference for co, co2 and nh3-adsorbed n:sspgnr. the isosurface value is taken as 0.009 ev å−3.

the electron density difference is defined as the valence electron density minus the neutral atom electron density. it can be seen from the edd plots that the charges were accumulated over the adsorbed gas molecules. the formation of chemical bonds is evident in the case of co and co2 adsorption. the electron density at the interface region between co, co2 and n:sspgnr indicates that the adsorbed gas molecule does form covalent bonds with the n:sspgnr after the adsorption process. in contrast, in the case of nh3 adsorption, no formation of covalent bonds was found, because there is little electron density difference at the interface between the nh3 gas molecule and n:sspgnr.

iv transport properties

in the previous section, we showed that the adsorption of gas molecules causes changes in the electronic band gaps of n:sspgnr. for further verification of co, co2 and nh3 detection by n:sspgnr, we studied the transport properties of c-h-n:sspgnr before and after the adsorption of gas molecules. the current-voltage (i-v) characteristics were obtained using a two-probe model. figure 6 shows the correlation between the current and the bias voltage of the adsorbed samples and the pure sample.
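the i-v curves discussed next follow, within negf, from integrating the bias-dependent transmission over the bias window (the landauer-büttiker formula). the sketch below integrates a made-up transmission function with a symmetric bias window; it only illustrates the bookkeeping and is not the authors' atk/negf calculation.

```python
import numpy as np

def fermi(E, mu, kT=0.025):
    """fermi-dirac occupation at chemical potential mu (energies in eV)."""
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)

def landauer_current(T, E, V, kT=0.025):
    """I(V) = (2e/h) * integral of T(E,V) [f_L - f_R] dE, symmetric bias window."""
    two_e_over_h = 7.748e-5                       # A per eV
    window = fermi(E, +V / 2.0, kT) - fermi(E, -V / 2.0, kT)
    return two_e_over_h * np.trapz(T * window, E)

# hypothetical, weak transmission resonance 0.9 eV above the fermi level,
# so the current only switches on once the bias window reaches about 1.8 V.
E = np.linspace(-2.0, 2.0, 2001)
T = 1.0e-4 / (1.0 + ((E - 0.9) / 0.05) ** 2)

for V in (1.4, 1.6, 1.8, 2.0):
    print(f"V = {V:.1f} V  ->  I = {landauer_current(T, E, V) * 1e9:.2f} nA")
```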
we can see that all of these lines have the same shape. on the graph there is a conduction pause although the bias voltage increases. when the bias voltage reaches a certain limit value (1.6 v), the current begins to increase, then decreases until saturation at 2.0 v. specifically, except for the co2 adsorption sample, the current of the remaining samples starts to increase at the bias voltage of 1.6 v (threshold bias). for semiconductors, when the polarizing voltage is not figure 6: calculated i-v curves for four structures: c-h-n:sspgnr, c-h-n:sspgnr(co), c-hn:sspgnr(co2) and c-h-n:sspgnr(nh3). large enough the device stops conducting, but when it is large enough for an electron to cross the barrier, the device begins to conduct electricity. thus, we can conclude that the structures under consideration are semiconductors. the maximum current obtained for pure c-h-n:sspgnr, co-adsorbed c-h-n:sspgnr, co2-adsorbed c-h-n:sspgnr, and nh3-adsorbed c-h-n:sspgnr is 0.3 na, 0.7 na, 13.7 na, and 0.4 na, respectively. in all four cases, the maximum currents occur at the bias voltage of 1.8 v. the maximum current of the co2 adsorption sample is the highest, 45 times greater than that of the pure sample. the maximum current of the nh3 adsorption sample is the lowest, only 1.3 times greater than that of the pure sample. all these findings suggest that the n:sspgnr current can be distinguished before and after molecule adsorption [42]. to explain the change in the trend of the i-v curves in fig. 6, we considered the bias dependent transmission spectra, t(e,vb), of the four studied structures via fig. 7 (left-hand panels). we see that the t(e,vb) of the pure c-h-n:sspgnr, the co-adsorbed c-h-n:sspgnr, and the nh3adsorbed c-h-n:sspgnr differ very little. however, the t(e,vb) spectrum of the co2-adsorbed c-h-n:sspgnr is very different from the three other samples. this is also related to the charge transfer phenomenon mentioned in table 2. furthermore, the various maximum currents occur at 130003-6 papers in physics, vol. 13, art. 130003 (2021) / n. t. tien et al. figure 7: left-hand panels: the contour of the bias-dependent transmission t(e,vb); right-hand panels: transmission coefficient at 1.8 v bias of four structures: a) c-h-n:sspgnr, b) c-h-n:sspgnr(co), c) c-hn:sspgnr(co2) and d) c-h-n:sspgnr(nh3). the bias voltage of 1.8 v; the transmission coefficients at 1.8 v bias of the four structures were calculated and shown in the right-hand panels of fig. 7. the values and filled zones bounded by the horizontal axis and t(e, vb = 1.8 v) curves help us to explain the changing tendency of the maximum currents of the four structures. iv conclusions in summary, using first-principle calculations we studied the adsorption geometry, adsorption energy, charge transfer, density of states, partial density of states and i-v curves of n:sspgnr with gas molecule (co, co2 and nh3) adsorption. our calculated results show that edge adsorption of co and co2 molecules is more energetically favorable than edge adsorption of nh3. moreover, the adsorption of co and co2 on the edge of n:sspgnr is chemical, while the adsorption of nh3 is physical. the current voltage (i–v) characteristics were also investigated using the non-equilibrium green’s function (negf) approach. the results indicate that conductance of the co, co2, and nh3 adsorption n:sspgnr can be distinguished at the bias voltage of 1.8 v. these changes in electronic and transport properties make n:sspgnr a promising candidate for gas detector development. 
acknowledgements this research was funded in part by the can tho university improvement project vn14-p6, supported by a japanese oda loan. we are grateful to the information center and network administrator of can tho university (ctu) for computational support. we also thank prof. yoshitada morikawa (osaka university) for discussing research ideas. 130003-7 papers in physics, vol. 13, art. 130003 (2021) / n. t. tien et al. [1] n haleh, j aashish, p jaewoo, e arezoo, advanced micro-and nano-gas sensor technology: a review, sensors 19, 1285 (2019). [2] s yang, c jiang, s h wei, gas sensing in 2d materials, appl. phys. rev. 4, 021304 (2017). [3] c tan, x cao, x j wu, q he, j yang, h zhang, et. al, recent advances in ultrathin two-dimensional nanomaterials, chem. rev. 117, 6225 (2017). [4] j chen, l xu, w li, x gou, α-fe2o3 nanotubes in gas sensor and lithium-ion battery applications, adv. mater. 17, 582 (2005). [5] v singh, d joung, l zhai, s das, s i khondaker, s seal, graphene based materials: past, present and future, prog. mater. sci. 56, 1178 (2011). [6] p sun, z kunlin, h wang, recent developments in graphene-based membranes: structure, mass-transport mechanism and potential applications, adv. mater. 28, 2287 (2016). [7] d n quang, l tuan, n t tien, electron mobility in gaussian heavily doped zno surface quantum wells, phys. rev. b 77, 125326 (2008). [8] b huang, z li, z liu, g zhou, s hao, j wu, b l gu, w duan, adsorption of gas molecules on graphene nanoribbons and its implication for nanoscale molecule sensor, j. phys. chem. c 112, 13442 (2008). [9] f schwierz, graphene transistors, nature nanotech. 5, 487 (2010). [10] a h castro neto, f guinea, n m r peres, k s novoselov, a k geim, the electronic properties of graphene, rev. mod. phys. 81, 109 (2009). [11] f bonaccorso, z sun, t hasan, a ferrari, graphene photonics and optoelectronics, nature photon. 4, 611 (2010). [12] p avouris, graphene: electronic and photonic properties and devices, nano lett. 10, 4285 (2010). [13] p vogt, p d padova, c quaresima, j avila, e frantzeskakis, m c asensio, a resta, b ealet, g l lay, silicene: compelling experimental evidence for graphenelike two-dimensional silicon, phys. rev. lett. 108, 155501 (2012). [14] a ziletti, a carvalho, d k campbell, d f coker, a h castro neto, oxygen defects in phosphorene, phys. rev. lett. 114, 046801 (2015). [15] x deng, k yao, k sun, w x li, j lee, c matranga, growth of single-and bilayer zno on au(111) and interaction with copper, j. phys. chem. c 117, 11211 (2013). [16] x blase, a rubio, s g louie, m l cohen, quasiparticle band structure of bulk hexagonal boron nitride and related systems, phys. rev. b 51, 6868 (1995). [17] d w latzke, w zhang, a suslu, t r chang, h lin, h t jeng, s tongay, j wu, a bansil, a lanzara, electronic structure, spin-orbit coupling, and interlayer interaction in bulk mos2 and ws2, phys. rev. b 91, 235202 (2015). [18] s zhang, j zhou, q wang, x chen, y kawazoe, p jena, penta-graphene: a new carbon allotrope, p. natl. acad. sci. u.s.a. 112, 2372 (2015). [19] t stauber, j i beltran, j schliemann, tightbinding approach to penta-graphene, sci. rep. 6, 1 (2016). [20] t y mi, n d khanh, r ahuja, n t tien, diverse structural and electronic properties of pentagonal sic2 nanoribbons: a firstprinciples study, mater. today comm. 26, 102047 (2021). [21] j sun, y guo, q wang, y kawazoe, thermal transport properties of penta-graphene with grain boundaries, carbon 145, 445 (2019). 
[22] h einollahzadeh, r dariani, s fazeli, computing the band structure and energy gap of pentagraphene by using dft and g0w0 approximations, solid state comm. 229, 1 (2016). 130003-8 https://doi.org/10.3390/s19061285 https://doi.org/10.1063/1.4983310 https://doi.org/10.1021/acs.chemrev.6b00558 https://doi.org/10.1021/acs.chemrev.6b00558 https://doi.org/10.1002/adma.200401101 https://doi.org/10.1016/j.pmatsci.2011.03.003 https://doi.org/10.1016/j.pmatsci.2011.03.003 https://doi.org/10.1002/adma.201502595 https://doi.org/10.1103/physrevb.77.125326 https://doi.org/10.1103/physrevb.77.125326 https://doi.org/10.1021/jp8021024 https://doi.org/10.1021/jp8021024 https://doi.org/10.1038/nnano.2010.89 https://doi.org/10.1038/nnano.2010.89 https://doi.org/10.1103/revmodphys.81.109 https://doi.org/10.1103/revmodphys.81.109 https://doi.org/10.1038/nphoton.2010.186 https://doi.org/10.1038/nphoton.2010.186 https://doi.org/10.1021/nl102824h https://doi.org/10.1021/nl102824h https://doi.org/10.1103/physrevlett.108.155501 https://doi.org/10.1103/physrevlett.114.046801 https://doi.org/10.1103/physrevlett.114.046801 https://doi.org/10.1021/jp402008w https://doi.org/10.1021/jp402008w https://doi.org/10.1103/physrevb.51.6868 https://doi.org/10.1103/physrevb.51.6868 https://doi.org/10.1103/physrevb.91.235202 https://doi.org/10.1073/pnas.1416591112 https://doi.org/10.1073/pnas.1416591112 https://doi.org/10.1038/srep22672 https://doi.org/10.1038/srep22672 https://doi.org/10.1016/j.mtcomm.2021.102047 https://doi.org/10.1016/j.mtcomm.2021.102047 https://doi.org/10.1016/j.carbon.2019.01.015 https://doi.org/10.1016/j.ssc.2015.12.012 papers in physics, vol. 13, art. 130003 (2021) / n. t. tien et al. [23] x wu, v varshney, j lee, t zhang, j l wohlwend, a k roy, t luo, hydrogenation of penta-graphene leads to unexpected large improvement in thermal conductivity, nano lett. 16, 3925 (2016). [24] s winczewski, j rybicki, anisotropic mechanical behavior and auxeticity of penta-graphene: molecular statics/molecular dynamics studies, carbon 146, 572 (2019). [25] r krishnan, w s su, h t chen, a new carbon allotrope: penta-graphene as a metal-free catalyst for co oxidation, carbon 114, 465 (2017). [26] h qin, c feng, x luan, d yang, firstprinciples investigation of adsorption behaviors of small molecules on penta-graphene, nanoscale res. lett. 13, 1 (2018). [27] c p zhang, b li, z g shao, first-principle investigation of co and co2 adsorption on fe-doped penta-graphene, appl. surf. sci. 469, 641 (2019). [28] p f yuan, z h zhang, z q fan, m qiu, electronic structure and magnetic properties of penta-graphene nanoribbons, phys. chem. chem. phys. 19, 9528 (2017). [29] c he, x f wang, w x zhang, coupling effects of the electric field and bending on the electronic and magnetic properties of pentagraphene nanoribbons, phys. chem. chem. phys. 19, 18426 (2017). [30] n t tien, v t phuc, r ahuja, tuning electronic transport properties of zigzag graphene nanoribbons with silicon doping and phosphorus passivation, aip adv. 8, 085123 (2018). [31] n t tien, p t b thao, v t phuc, r ahuja, electronic and transport features of sawtooth penta-graphene nanoribbons via substitutional doping, physica e: low dimens. syst. nanostruct. 114, 113572 (2019). [32] n t tien, p t b thao, v t phuc, r ahuja, influence of edge termination on the electronic and transport properties of sawtooth pentagraphene nanoribbons, j. phys. chem. solids 146, 109528 (2020). 
[33] y h li, p f yuan, z q fan, z h zhang, electronic properties and carrier mobility for penta-graphene nanoribbons with nonmetallicatom-terminations, org. electron. 59, 306 (2018). [34] t y mi, d m triet, n t tien, adsorption of gas molecules on penta-graphene nanoribbon and its implication for nanoscale gas sensor, physics open 2, 100014 (2020). [35] a saffarzadeh, modeling of gas adsorption on graphene nanoribbons, j. appl. phys. 107, 114309 (2010). [36] j taylor, h guo, j wang, ab initio modeling of quantum transport properties of molecular electronic devices, phys. rev. b 63, 245407 (2001). [37] m brandbyge, j l mozos, p ordejón, j taylor, k stokbro, density-functional method for nonequilibrium electron transport, phys. rev. b 65, 165401 (2002). [38] j zhao, a buldum, j han, j p lu, gas molecule adsorption in carbon nanotubes and nanotube bundles, nanotechnology 13, 195 (2002). [39] j w feng, y j liu, h x wang, j x zhao, q h cai, x z wang, gas adsorption on silicene: a theoretical study, comp. mater. sci. 87, 218 (2014). [40] p j perdew, k burke, m ernzerhof, generalized gradient approximation made simple, physical rev. lett. 77, 3865 (1996). [41] p pyykkö, m atsumi, molecular single-bond covalent radii for elements 1–118, chem. eur. j. 15, 186 (2009). [42] l tang, m q cheng, q chen, t huang, k yang, w q huang, w hu, g f huang, ultrahigh sensitivity and selectivity of pentagonal sic2 monolayer gas sensors: the synergistic effect of composition and structural topology, phys. status solidi b 257, 1900445 (2020). 130003-9 https://doi.org/10.1021/acs.nanolett.6b01536 https://doi.org/10.1021/acs.nanolett.6b01536 https://doi.org/10.1016/j.carbon.2019.02.042 https://doi.org/10.1016/j.carbon.2016.12.054 https://doi.org/10.1016/j.carbon.2016.12.054 https://doi.org/10.1186/s11671-018-2687-y https://doi.org/10.1016/j.apsusc.2018.11.072 https://doi.org/10.1016/j.apsusc.2018.11.072 https://doi.org/10.1039/c7cp00029d https://doi.org/10.1039/c7cp00029d https://doi.org/10.1039/c7cp03404k https://doi.org/10.1039/c7cp03404k https://doi.org/10.1063/1.5035385 https://doi.org/10.1016/j.physe.2019.113572 https://doi.org/10.1016/j.physe.2019.113572 https://doi.org/10.1016/j.jpcs.2020.109528 https://doi.org/10.1016/j.jpcs.2020.109528 https://doi.org/10.1016/j.orgel.2018.05.039 https://doi.org/10.1016/j.orgel.2018.05.039 https://doi.org/10.1016/j.physo.2020.100014 https://doi.org/10.1063/1.3409870 https://doi.org/10.1063/1.3409870 https://doi.org/10.1103/physrevb.63.245407 https://doi.org/10.1103/physrevb.63.245407 https://doi.org/10.1103/physrevb.65.165401 https://doi.org/10.1103/physrevb.65.165401 https://doi.org/10.1088/0957-4484/13/2/312 https://doi.org/10.1088/0957-4484/13/2/312 https://doi.org/10.1016/j.commatsci.2014.02.025 https://doi.org/10.1016/j.commatsci.2014.02.025 https://doi.org/10.1103/physrevlett.77.3865 https://doi.org/10.1002/chem.200800987 https://doi.org/10.1002/chem.200800987 https://doi.org/10.1002/pssb.201900445 introduction computational methods results and discussion structure stability parameters of structures electronic properties transport properties conclusions papers in physics, vol. 13, art. 130002 (2021) received: 17 october 2020, accepted: 23 february 2021 edited by: r. cerbino reviewed by: f. 
giavazzi, università degli studi di milano, italy licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.130002 www.papersinphysics.org issn 1852-4249 radial percolation reveals that cancer stem cells are trapped in the core of colonies lucas barberis1* using geometrical arguments, it is shown that cancer stem cells (cscs) must be confined inside solid tumors under natural conditions. aided by an agent-based model and percolation theory, the probability of a csc being positioned at the border of a colony is estimated. this probability is estimated as a function of the csc self-renewal probability ps; i.e., the chance that a csc remains undifferentiated after mitosis. in the most common situations ps is low, and most cscs produce differentiated cells at a very low rate. the results presented here show that cscs form a small core in the center of a cancer cell colony; they become quiescent due to the lack of space to proliferate, which stabilizes their population size. this result provides a simple explanation for the csc niche size, dispensing with the need for quorum sensing or other proposed signaling mechanisms. it also supports the hypothesis that metastases are likely to start at the very beginning of tumor development. i introduction cancer stem cells (cscs) are responsible for driving tumor growth due to their ability to make copies of themselves (self-renewal) and differentiate into cells with more specific functions [1]. differentiated cancer cells (dccs) maintain a limited ability to proliferate, but can only generate cells of their specific lineage [2]. like the tissues from which they arise, solid tumors are composed of a heterogeneous population of cells; many properties of normal stem cells are shared by at least a subset of cancer cells [3, 4]. in many tissues, normal stem cells must be able to migrate to different regions of an organ, where they give rise to the specifically differentiated cells required by the organism. these features *lbarberis@unc.edu.ar 1 instituto de f́ısica enrique gaviola (ifeg) and facultad de matemática, astronomı́a, f́ısica y computación, conicet and unc, x5000hue córdoba, argentina. are reminiscent of invasion and immortality, two hallmark properties of cancer cells [5, 6]. resistance to chemo/radio therapies gives the cscs a high chance of survival and of forming new tumors, even after treatment [7]. hence, based on the concept that destroying or incapacitating cscs would be an efficient method of cancer containment and control, new therapeutic paradigms are the focus of current research. a tumorsphere assay is an experimental biological model used for the study of csc features. a tumorsphere is a clonal aggregate of cancer cells grown in vitro from a single cell. it has been shown experimentally that dccs can not generate a tumorsphere because they are unable to form compact long-term aggregates. as a consequence, the current experimental convention for the definition of cscs, from a functional point of view, is based on the capacity of a single cell, the seed, to grow a more or less spherical aggregate in a gel suspension. this is why cscs are said to “drive” tumor progression. 130002-1 papers in physics, vol. 13, art. 130002 (2021) / l. barberis in an experimental assay, where the time evolution of the number of cells in a tumorsphere is measured, it is possible to determine a proliferation rate r consistent with the population doubling time (pdt) of the total population. 
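as an aside (and not the authors' procedure), the conversion between r and the pdt is a one-liner under a purely exponential growth assumption, pdt = ln 2 / r; a log-linear fit of synthetic counts illustrates it:

```python
import numpy as np

# synthetic, noisy tumorsphere counts N(t) = N0*exp(r*t); all numbers are made
# up for illustration. a log-linear least-squares fit returns r, and the
# population doubling time follows as PDT = ln(2)/r (exponential growth only).
rng = np.random.default_rng(1)
t = np.arange(0, 8)                                  # days (hypothetical sampling)
N = 1.0 * np.exp(0.65 * t) * rng.lognormal(0.0, 0.05, t.size)

r, log_N0 = np.polyfit(t, np.log(N), 1)              # slope = r, intercept = ln(N0)
print(f"r ≈ {r:.2f} 1/day, PDT ≈ {np.log(2) / r:.2f} days")
```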
unfortunately, this quantity cannot discriminate between the growth rates of cscs and their differentiated counterparts. furthermore, measuring the pdt of the cscs is experimentally very complex [8]. understanding r is crucial for mathematical modeling in a system where the offspring might belong to a different population than their parents. indeed, csc duplication could have three possible outcomes: stem cell replication (self-renewal), asymmetric differentiation and symmetric differentiation. to model such a feature mathematically, we can assume that the three outcomes will occur with probabilities ps, pa and pd, respectively. from the point of view of the populations, the last two possibilities correspond to the birth of a new dcc, the parent cell either keeping (in the asymmetric duplication) or losing (in the symmetric case) its stemness. for example, writing the corresponding population dynamics differential equations, beńıtez et al. showed that a csc will give birth to another csc at a rate rps which is estimated by fitting their mathematical model to the experimental data [9,10]. since the probability of self-replication ps seems to be small in homeostasis and in most common culture media, then rps will also be small, leading to quiescence, a main feature of cscs. nevertheless, cscs can be experimentally forced to abandon their quiescent state by using specific growth factors that inhibit differentiation [11, 12] or by restricting oxygen concentration, as discussed below. in these situations, ps is close to one and tumorspheres will contain a high fraction of cscs. also, pa is usually small, although its value can increase under abnormal situations such as injury or disease. metastasis, the invasion process by which cancer cells leave a tumor to form a new colony at another location, is an intriguing feature of cancer disease. because cscs are the seeds of tumors, it is accepted that metastatic tumors must come from a pre-existent primary tumor and might be located near its surface in order to detach, migrate and finally invade another site in the organism. the distribution of cscs in a primary tumor is key to the understanding of metastasis, cell proliferation and drug resistance. curiously, according to the three-concentric-layer model proposed by persano et al., cscs seem to be located in the inner core of glioblastomas. on the other hand, cells on the periphery of the tumor show a more differentiated phenotype that is highly sensitive to temozolomide, a drug for cancer treatment [13]. interestingly, these authors demonstrate that cscs are more proliferative under hypoxic conditions. furthermore, li et al. report that hypoxia plays an important role in the de-differentiation of cells [14]. these results indicate that a hypoxic environment will increase the numbers of cscs. the aim of this work is to simulate colony growth by means of an agent-based model (abm) that mimics basic features of csc proliferation, with emphasis on its geometrical properties. multiple realizations allow us to estimate the fraction of cscs situated on the periphery of the colony, showing that it is quite small for large, long-lived colonies. we also use some elements of percolation theory to help interpret and quantify the simulation results. we then report a transition to percolation which depends on the self-renewal probability, ps, of the csc population. 
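returning to the rate bookkeeping above, a minimal deterministic sketch of the ps/pa/pd outcomes (not the full model of benítez et al. [9, 10]; dcc proliferation, death and quiescence are ignored, and all numbers are made up) can be written as a pair of odes:

```python
import numpy as np
from scipy.integrate import solve_ivp

# sketch only: S = CSC count, D = DCC count, r = division rate of CSCs.
# self-renewal (p_s) adds one CSC, symmetric differentiation (p_d) removes one
# CSC and adds two DCCs, asymmetric differentiation (p_a) adds one DCC and
# keeps the parent CSC.
r, ps, pa = 1.0, 0.2, 0.0
pd = 1.0 - ps - pa

def rhs(t, y):
    S, D = y
    dS = r * (ps - pd) * S
    dD = r * (pa + 2.0 * pd) * S
    return [dS, dD]

sol = solve_ivp(rhs, (0.0, 15.0), [1.0, 0.0], t_eval=np.linspace(0.0, 15.0, 16))
print(sol.y[:, -1])   # (S, D) after 15 "days": for low ps this mean-field CSC pool
                      # decays, whereas in the spatial model below the CSCs instead
                      # become quiescent.
```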
finally, we conclude that that our results lead to a simple explanation for csc niche size, and support the hypothesis that the metastatic process must start at the very beginning of tumor development. ii simulation of two-dimensional colonies experimental monoclonal colonies start with a csc in a suitable culture medium. this cell and its daughters duplicate at a more or less constant rate, forming a colony with an approximately circular shape. the reader interested in examples may refer to [15–17] for further information. in this work we use the same principle, describing the cells using circle-shaped mathematical objects and refining rules for their duplication and movement. in later studies we will requirecells not to be rigid spheres. each cell in the current simulations has a central circular impenetrable region, the core, and around the core there is an area, the corona, where overlapping is allowed. in this way, cells are able to overlap as far as the border of their coronas and reach the border of the core of other, neighboring cells. the figures presented in this work has a 95 % overlap 130002-2 papers in physics, vol. 13, art. 130002 (2021) / l. barberis (a) (b) figure 1: two realizations of the simulation for ps = 0.5 after 15 days. (a) there are no cscs at the periphery; all of them, colored in pink, became quiescent. (b) the active cscs at the border are colored in red. note that some branches failed to percolate. dccs are depicted in cyan (quiescent) and blue (active). the seed is represented in yellow. on the cell radius. however, we will show that the results presented here do not depend on this parameter. also, each cell belongs to a class that defines its behavior: active cscs, active dccs, quiescent cscs and quiescent dccs. as a simplification of the model, we assume that the probability of asymmetric duplication of cscs is zero, pa = 0. in this way ps is the only control parameter that allows direct capture of the model’s main features. we start the simulation by seeding an active csc and asking it to duplicate. furthermore, at each time step we ask all active cells to attempt to duplicate, according to the following rules. first a new cell is created in a random position at the side of an active cell. if there are no other cells in this location it remains there, otherwise a new location beside the active cell is randomly determined. after 500 attempts, if the new cell has still not found an empty spot to occupy, it is destroyed and the active cell changes to its respective quiescent class. conversely, after successful duplication, if the active cell is a csc it self-renews with a probability ps, which means that the new cell becomes an active csc, otherwise it changes to the active dcc class. if the new, active cells belong to different classes, there is a probability of 1/2 that the active csc will exchange positions with the new dcc. naturally, an active dcc can only create another active dcc. once all the active cells have been asked to duplicate, independently of the success of the attempt (a) ps = 0.2 (b) ps = 0.95 figure 2: two realizations of the simulation at low (a) and high (b) self-replication rates after 15 days. in (a) the cscs are quickly surrounded by dccs, becoming quiescent (pink). in (b) the high number of cscs allows percolation to most of the colony perimeter. the seed is colored in yellow. we say that a one-day-long time step has been performed. 
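a compact, self-contained sketch of these duplication rules is given below. the geometry is simplified (cells are unit disks with a fixed minimum separation instead of the explicit core/corona construction, and no border detection is attempted); only ps, pa = 0, the 500 placement attempts, the switch to quiescence after a failed attempt and the one-half exchange probability come from the text, everything else is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
PS, EXCHANGE_P, MAX_TRIES, MIN_SEP, DAYS = 0.5, 0.5, 500, 0.95, 15

# one record per cell: position, stem flag, active flag; start from one active CSC seed.
pos, stem, active = [np.zeros(2)], [True], [True]

def free_spot(i):
    """try up to MAX_TRIES random positions touching cell i; return the first
    one that keeps at least MIN_SEP from every existing cell, or None."""
    P = np.asarray(pos)
    for _ in range(MAX_TRIES):
        ang = rng.uniform(0.0, 2.0 * np.pi)
        cand = pos[i] + np.array([np.cos(ang), np.sin(ang)])   # one diameter away
        if np.min(np.linalg.norm(P - cand, axis=1)) >= MIN_SEP:
            return cand
    return None

for day in range(DAYS):
    for i in rng.permutation(len(pos)):        # active cells duplicate in random order
        if not active[i]:
            continue
        spot = free_spot(i)
        if spot is None:                       # no room: the mother becomes quiescent
            active[i] = False
            continue
        child_is_csc = stem[i] and rng.random() < PS   # DCCs only produce DCCs (pa = 0)
        pos.append(spot); stem.append(child_is_csc); active.append(True)
        if stem[i] and not child_is_csc and rng.random() < EXCHANGE_P:
            pos[i], pos[-1] = pos[-1], pos[i]  # the CSC swaps places with its new DCC

stem, active = np.array(stem), np.array(active)
print(f"{stem.size} cells, {stem.sum()} CSCs ({(stem & active).sum()} active), "
      f"{(~stem).sum()} DCCs")
```

running this toy version several times with different seeds already reproduces the two qualitative outcomes described below for ps = 0.5: either all cscs end up quiescent in the core, or a csc path reaches the active rim.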
this implies, without loss of generality, that r ≈ 1 day−1, a reasonable growth rate that will aid our intuition. further details on the structure of the simulation process are provided in the appendix, including a flow chart in fig. 7. we have also provided videos in the supplementary information. in each video a representative value of ps was set and ten realizations were recorded, illustrating how they were carried out.

typical outcomes of the described process are shown in fig. 1 and video v1. in these examples we set ps = 0.5 and started with a csc seed depicted in yellow. after running the simulation for a period equivalent to two weeks (15 time steps), a relatively long period for a biological experiment, we obtained two possible outcomes: (a) a few quiescent cscs were trapped in the center of the colony or (b) there were active cscs at the border of the colony. indeed, we defined the border of the colony as the rim formed by the set of active cells. to better track the subpopulations present in the colony, the active cells are colored in red for csc and in blue for dcc. also, the quiescent cells are colored in pink for csc and cyan for dcc. remember that the yellow dot is not always at the center of the colony because the seed may exchange its position with a new dcc.

in panel (b) we recognize a path, paved by cscs, that joins the center of the colony with its border. also, some frustrated branches appear in this path of cscs, which die in the quiescent core, indicating great variability.

figure 3: time evolutions of quantities averaged over a thousand realizations for values of ps ∈ [0.2..1.0]. (a) number of cscs at the border. (b) the total number of cscs reaches a constant value that defines the niche. (c) the probability of a csc being at the border. (d) system size as a function of time.

this path resembles cluster percolation in porous media, lattices or networks, which led us to attempt to describe the probability of finding a csc at the border of the colony by means of percolation theory. to sharpen our intuition as to what percolation means in this system, in fig. 2 and videos v2 and v3 we present two more examples with different self-replication probabilities, ps. in panel (a) and video v2, ps = 0.2 is so small that the cscs are quickly surrounded by dccs, and become quiescent. this is the most common situation in a regular culture medium, where the csc count is low and constant over time. on the other hand, in panel (b) and video v3, ps = 0.95 leads to a large csc population that invades almost the entire system. the addition of stem cell maintenance factors such as egf or bfgf to the culture medium is an example of this case. these limiting situations were previously studied both experimentally [11, 18] and mathematically [9, 10], focusing in the first case on technical aspects of the assay, and in the second case on recovery of the csc fraction. curiously, in the examples shown in figs. 1 and 2, the rim formed by active cells is similar to the one reported in the classical multicellular spheroid work of freyer and sutherland [19].
in freyer's work, the inner portion of the spheroids was formed by dead cells, attributed to a low concentration of oxygen/nutrients supplied by means of diffusion. this rim was also formed by active cells and was studied mathematically by several authors using diffusion equations [20–22]. it is interesting that in the present monoclonal case the geometry seems to be sufficient to develop a rim, just two cells thick, independently of any diffusion process.

to study the percolation properties of the system statistically, we carried out extensive simulations of colony growth. each data point reported is the result of averaging over 1000 runs. the time evolution of colony growth was performed for 37 values of ps ∈ [0.2, 1.0]. the results of these measurements are summarized in fig. 3. panel (a) shows the number of cscs at the border, which increases to a maximum given by the time when they are overwhelmed by the dcc population. beyond this point, more and more cscs become quiescent until no more active cscs remain. this phenomenon disappears for ps barely lower than 1, when almost all cells on the border are cscs. in panel (b) the time evolution of the total csc population is depicted, showing that after a transient the csc population stops growing and remains constant. this fixed number of cscs is usually called the csc niche and was mathematically studied in [9]. as expected, the probability of a csc being at the border rises as ps is increased, as shown in panel (c). the relationship between simulation time and system size is shown in panel (d); due to its geometrical nature it is independent of ps, the self-renewal probability.

iii percolation theory

percolation theory was used to estimate the probability of finding a csc at the border of the colony, looking for a purely geometrical feature of its growth. in particular, we assumed that the cells in the colony could be mapped onto the nodes of a triangular network with the seed at its center. each cell is then connected with its nearest neighbors, whose number, in two dimensions, is known not to exceed six. as shown in fig. 1(b), the cscs (quiescent and active) form paths that expand radially from the center to the edge of the colony. by inspection, we note that these paths are formed by the connected nodes that have had a csc at any moment during the culture. the path could have some "holes" that arise because of the possibility of exchanging a parent csc with its daughter dcc. the probability ps thus regulates the number of cscs that occupy the nodes of the network. hence, there must be a threshold value pc of ps, below which it is impossible to follow a path of nodes starting at the seed and ending in a csc at the border of the tumor. this value, pc, is called the critical value or percolation threshold and defines a percolation phase transition [23]. thus, when the border is reached by cscs, we say that such a path percolates. formally, a percolating path is mathematically defined as a set of connected nodes that expands infinitely over an infinite network for a large enough value of ps [24].

in our model, each new node of the percolation path is added either by the self-renewal of a csc, with probability ps, or by an exchange between a csc and a dcc, with probability (1 − ps)/2, after a csc gives birth to a dcc. it is also important to note that several neighboring nodes could already be occupied. thus, it is useful to define the average empty neighbor number z as follows: imagine that we perform a random walk starting at the seed's node along an infinite path where it is forbidden to step back. at each step there are z branches in the network, but there is also a probability ps + (1 − ps)/2 that a node belongs to the path, so on average only (ps + 1)z/2 will be accessible. to continue walking along the path, there must be at least one empty node to walk along, leading to the condition (ps + 1)z/2 ≥ 1. therefore, the critical probability for transition to percolation is

pc = 2/z − 1, (1)

which depends on the available neighbor number (note that it is not universal, because it depends on the details of the network). it is noteworthy that in eq. (1), for z = 1 we obtain pc = 1, the 1d critical occupation for a chain of nodes. for 1 < z < 2 the critical occupation quickly falls to zero. this means that for z ≥ 2 the system will always percolate. on the other hand, if the csc never exchanges its position with a dcc, we get pc = 1/z, which is again the 1d critical occupation and will asymptotically fall to zero as z increases. otherwise, if the csc always moves, ps diverges as expected for a cell that always remains on the outer rim of the growing colony. there is no evidence that a csc will stay either in the core or on the border of the colony. simulations carried out on both limits of the exchange probability agree with these theoretical limits, but do not provide further information.

figure 4: (a) cell overlapping does not change the maximum neighbor number of six; thus, the underlying lattice is topologically triangular (yellow lines). (b) the average neighbor number versus layer number quickly drops to one. at t = 50 days, n ≈ 50, n ≈ 4000 and z = 1.02, which means ps = 0.92.

for this reason, we implemented the one-half exchange probability in our
in our model, each new node of the percolation path is added either by the selfrenewal of a csc, with probability ps, or by an exchange between a csc and a dcc, with probability 1 2 ×(1−ps), after a csc gives birth to a dcc. it is also important to note that several neighboring nodes could already be occupied. thus, it is useful to define the average empty neighbor number z as follows: imagine that we perform a random walk starting at the seed’s node along an infinite path where it is forbidden to step back. at each step there are z branches in the network, but there is also a probability ps + 1−ps 2 that a node belongs to the path, so on average only 1 2 (ps + 1)z will be accessible. to continue walking along the path, there must be at least one empty node to walk along, leading to the condition 1 2 (ps + 1)z ≥ 1. therefore, the critical probability for transition to percolation is pc = 2 z − 1, (1) which depends on the available neighbor number. 1 it is noteworthy that in eq. (1), for z = 1 we obtain pc = 1, the 1d critical occupation for a chain of nodes. for 1 < z < 2 the critical occupation quickly falls to zero. this means that for z ≥ 2 the system will always percolate. on the other hand, if the csc never exchanges its position with a dcc, we get pc = 1 z , which is again the 1d critical occupation and will asymptotically fall to zero as z increases. otherwise, if the csc always moves, ps diverges as expected for a cell that always remains on the outer rim of the growing colony. there is no evidence that a csc will stay either in the core or on the border of the colony. simulations carried out on both limits of the exchange probability agree with these theoretical limits, but do not provide further information. for this reason, we implemented the one-half exchange probability in our 1remember it is not universal because it depends on the details of the network. 130002-5 papers in physics, vol. 13, art. 130002 (2021) / l. barberis (a) (b) 0 10 20 30 40 50 n 1 1.2 1.4 1.6 1.8 2 z figure 4: (a) cell overlapping does not change the maximum neighbor number of six; thus, the underlying lattice is topologically triangular (yellow lines). (b) the average neighbor number versus layer number quickly drops to one. at t = 50 days, n ' 50, n ' 4000 and z = 1.02, which means ps = 0.92. simulations, using the “a priori equal probability” principle. by construction, there are no closed loops in the network, thus deduction of the critical probability was similar to that for a bethe lattice or a tree graph. however, unlike regular lattices/graphs, in our model the neighbor number of a node is not the same for all nodes due the randomness introduced by space searching and the probability of exchange. for this reason, we defined an average neighbor number. in fig. 4(a) each cell is depicted by two concentric circles; the dark one represents the rigid core and the lighter represents the corona where overlapping is allowed. note that the maximum neighbor number is six, regardless of the extent of the corona. in simulations a new cell must be in contact with other cells, and overlapping will modify the density of the colony. as mentioned, this parameter will be useful for further study of diffusion effects that are irrelevant to the present work. 
also, the random process of searching space for proliferation will slightly modify the geometry of the underlying network, but not its topology, which will be the same as that of the triangular lattice except for a few defects caused by a neighbor number lower than six. thus density, which is relevant when studying the diffusion of nutrients, oxygen or proteins for signaling, does not play any role in the current percolation problem. to roughly estimate z we built up a tree graph figure 5: growth of the colony on a triangular lattice, starting from the center (black dot). each layer is depicted in a different color and contains three nodes with z = 3 and the remainder with z = 1. over a triangular lattice substrate. a possible connection pattern of nodes in such a network is depicted in fig. 5, which was built layer by layer with each layer represented in a different color. layer n = 0 is the central black dot that represents the seed. in this particular example, in each layer we depict three nodes with z = 3 (darker dots), leaving the remaining nodes with z = 1 (lighter dots). it is clear that for any layer there are 6 ×n nodes for n > 0; then to connect two layers without loops we need on average z = n+1 n . because 1 < z < 2 quickly decays with n, c.f. fig. 4(b), we do expect that the critical probability for percolation, given by eq. (1), will shift to pc = n n+1 → 1 as n → ∞. the size of the network will be n = 6 ∑n i=1 i + 1 leading to the limit pc −−−−→ n→∞ 1. thus, as the colony increases in size, the percolation threshold shifts towards 1. the main consequence is that there is no chance to perform finite size scaling to detect the critical transition point through simulations, as in the case of 1d chains. as mentioned before, in figs. 1 and 2, the proliferative cells are distributed in the two outermost layers. this feature of the colony occurs because the cells in the simulation are requested to proliferate in random order. this is also the cause of 130002-6 papers in physics, vol. 13, art. 130002 (2021) / l. barberis 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 p s 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 p t [days] n [cells] 20 500 40 2389 60 5797 80 10669 100 17090 bethe 0 1 2 10 4 0.2 0.4 0.6 0.8 1 p c n simulation theory figure 6: probability of percolation as a function of ps for different colony sizes. dots correspond to simulations and lines are theoretical fittings to estimate the percolation threshold (see text). the gray line is the theoretical bethe result. inset: percolation threshold pc at different sizes measured by simulations (blue). our simple theoretical approach overestimates this threshold (red). the observed circular shape of the colony. because the vertices of the hexagonal array shown in fig. 5 have a lower probability of being chosen for proliferation, as system size increase s cells on their edges will proliferate first, as at the beginning of each time step they have more room to do so. thus, the estimated number of cells in each layer is underestimated, as is their neighbor number. as a consequence, we expect that the actual value of pc as a function of size or time will be much lower than that predicted by eq. (1) and our estimation of z. a precise estimation of z will require statistical treatment that is beyond the scope of this article. the result implies that proper (mathematical) percolation transition will occur at ps = 1, as in 1d percolation. 
this also implies that the only way for a csc to be at the border, for an arbitrarily large colony, is when it is forced not to differentiate. nevertheless, monoclonal colonies cannot be maintained and grown forever: a 50-day experiment is very long, and very uncommon in the literature. we thus focus on the probability of a csc being at the border, $p_\infty$, whose behavior as a function of $p_s$ is depicted in fig. 6 for several times/sizes. the relationship between time and size is bijective, so we report both values in the legend.

figure 6: probability of percolation as a function of $p_s$ for different colony sizes (legend: t [days] / N [cells] = 20/500, 40/2389, 60/5797, 80/10669, 100/17090). dots correspond to simulations and lines are theoretical fittings used to estimate the percolation threshold (see text). the gray line is the theoretical bethe result. inset: percolation threshold $p_c$ at different sizes measured by simulations (blue); our simple theoretical approach overestimates this threshold (red).

in this graph the dots are the result of the simulations presented in the previous section, for which the frequency of cscs at the border of the colony was averaged over a thousand runs. to obtain a suitable fit of these data as continuous functions of $p_s$, we used erf functions, following yonezawa's classical work [23]. the inflection point of the erf curves coincides with the percolation threshold, giving a good estimate of $p_c$. in addition, the theoretical result for percolation on a bethe lattice of order 3 is depicted in gray for comparative purposes. note that $p_\infty$ does not jump as in bethe percolation; the transition is continuous, as known for several regular lattices, including the triangular one. gonzález et al. [25] studied several forms of percolation on triangular lattices, reporting their critical probabilities using finite-size scaling analysis. they found universal features, with percolation thresholds between 0.5 and 0.8 depending on the problem. in contrast, but as expected, as our system size increases the fitted curves become steeper and steeper, and their inflection points predict a shift of $p_s$ towards 1, as depicted by the blue dots in the inset of fig. 6. comparison with our theoretical result shows the expected overestimation of the percolation threshold (the red line), as explained before. we therefore expect to find a csc at the border of the colony in half the runs when we simulate a three-week experiment.

iv conclusions

metastasis is an intriguing feature of cancer invasion. most research in this field is guided by the biological point of view, even though there are many mathematical models that attempt to describe it [26–29]. in the examples of growth given in fig. 1(a) and fig. 2(a), it becomes clear that under normal culture conditions cscs remain in the inner core of spheroids. this is also deduced from the low probability $p_\infty$ of finding a csc at the border of the colony, as shown in fig. 6. if this could be observed in the laboratory, the experiments that reveal a csc preference for hypoxic environments could easily be explained by this fact. it has been shown experimentally that hypoxia maintains the undifferentiated state of primary glioma cells, slowing down the growth of glioma cells in a relatively quiescent stage, increasing the colony-forming efficiency and migration of glioma cells, and elevating the expression of stem cell markers, whereas the expression of markers for stem cell differentiation is reduced after hypoxia treatment [13, 14]. it is also known that the inner core of spheroids and tumors lacks oxygen and nutrients, which are first consumed by the outer layers of the tumor [30–33]. in this context, cscs constitute a phenotype that has evolved to survive under hypoxic conditions in order to drive tumor growth.
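to illustrate the erf-based fitting procedure used for fig. 6, here is a minimal sketch with scipy; the sigmoid parametrization is one common choice and the data are synthetic placeholders, not the simulation results of the paper:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

def erf_sigmoid(ps, pc, w):
    """erf-shaped sigmoid whose inflection point pc is read off as the estimate
    of the percolation threshold; w controls the width of the transition."""
    return 0.5 * (1.0 + erf((ps - pc) / (np.sqrt(2.0) * w)))

# synthetic stand-in for one of the simulated P(ps) curves of fig. 6
rng = np.random.default_rng(0)
ps = np.linspace(0.2, 1.0, 17)
P_sim = erf_sigmoid(ps, 0.85, 0.05) + rng.normal(0.0, 0.02, ps.size)

(pc_fit, w_fit), _ = curve_fit(erf_sigmoid, ps, P_sim, p0=[0.8, 0.1])
print(f"fitted percolation threshold p_c = {pc_fit:.3f} (transition width {w_fit:.3f})")
```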
on the other hand, metastasis requires cscs to overcome three major barriers: the probability of an active csc being present at the border of the colony, the probability of it detaching from the tumor, and the probability of it finding a suitable place to proliferate. the results of the present work establish that the first of these probabilities is very small, even for relatively small tumors, supporting the rare occurrence of metastasis. for large self-replication rates, possibly generated by environmental conditions, many active cscs would become candidates for detachment. at present, we do not know of any report of a high number of cscs in tumors in vivo, or in transplanted xenografts under normal physiological conditions, that experimentally supports a large self-replication rate as part of the metastasis mechanism. an intriguing possibility is that metastasis begins in the early stages of tumor progression [34–36]. this is deduced from fig. 3(a), where the number of active cscs shows a peak for low $p_s$ and short times. if this is the actual situation, a csc must detach from the primary tumor in its first week of life, when the probability of being at the border is close to 1, even for low values of self-replication. note that at this time the tumor consists of no more than a dozen cells. this fact leads us to hypothesize that a primary tumor is indeed the metastasis of some small young tumors. according to this hypothesis, the primary tumor is in fact the first colony able to fix and grow, while metastatic tumors come from later cscs that take more time to attach successfully to a new location. the detached cells must survive while competing for nutrients and space with the primary spheroids, which at this stage are ruthless adversaries [37]. our two-dimensional model could be extended to three dimensions; this would be more realistic but also computationally more expensive. preliminary results are qualitatively similar to those reported here, but so far we lack the statistics to perform a quantitative comparison. another feature, obtained by changing the geometrical constraints in our code, is the simulation of non-solid tumors such as haematologic neoplasms; preliminary results show that active cscs are present there in a larger proportion than in solid tumor spheroids. as a consequence, there is a higher probability of cscs detaching and developing metastases in neoplasms. in summary, percolation provides a way of developing a geometrical theory to support or complement signaling pathways, quorum sensing and other tools frequently used to study metastasis. further experimental research will elucidate whether cscs are the survivors with the greatest fitness, as suggested by our present results.

acknowledgements

this work was supported by secyt-unc (project 05/b457) and conicet (pip 11220150100644), argentina. the author is grateful to dr. c. a. condat and dr. l. vellón for fruitful discussions. also, he thanks dr. fabio giavazzi for suggesting a better approach for eq. (1).

appendix: simulation algorithm

the simulations were carried out in an object-oriented paradigm. each cell belongs to one of the following classes: active csc, active dcc, quiescent csc or quiescent dcc. first we created an active csc object at the center of a square space large enough to contain the final colony. then, for each time step, we requested, in random order, that all objects belonging to an active class perform the duplication procedure.
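for readers who prefer code to flow charts, the duplication step described in the remainder of this appendix (and in fig. 7) can be condensed into the following illustrative sketch; the class layout, the `has_overlap` collision helper and the flat list used for the colony are our own simplifications, not the author's implementation:

```python
import math
import random

# parameters taken from the appendix: daughter displacement, retry limit and the
# position-exchange rule between a csc and its newly created dcc
STEP = 0.95          # displacement of the new cell, in units of the cell diameter
MAX_TRIES = 500      # failed placements before the parent becomes quiescent
P_EXCHANGE = 0.5     # probability that a csc swaps positions with its dcc daughter

class Cell:
    def __init__(self, x, y, is_csc, active=True):
        self.x, self.y, self.is_csc, self.active = x, y, is_csc, active

def duplicate(parent, colony, p_s, has_overlap):
    """one duplication attempt for an active cell. `has_overlap(x, y, colony)` is
    a user-supplied collision test standing in for the core-superposition check
    described in the appendix."""
    for _ in range(MAX_TRIES):
        angle = random.uniform(0.0, 2.0 * math.pi)      # 0 <= r_alpha < 2*pi
        x = parent.x + STEP * math.cos(angle)
        y = parent.y + STEP * math.sin(angle)
        if has_overlap(x, y, colony):
            continue                                    # move back and redraw
        child_is_csc = parent.is_csc and random.random() < p_s
        child = Cell(x, y, child_is_csc)
        if parent.is_csc and not child_is_csc and random.random() < P_EXCHANGE:
            # the csc takes the empty spot; the new dcc stays where the parent was
            parent.x, parent.y, child.x, child.y = x, y, parent.x, parent.y
        colony.append(child)
        return child
    parent.active = False      # no empty spot after 500 tries: the new cell is
    return None                # never created and the parent becomes quiescent

if __name__ == "__main__":
    colony = [Cell(0.0, 0.0, is_csc=True)]
    # cores of diameter 0.9: the parent, at distance 0.95, never blocks its daughter
    overlap = lambda x, y, cells: any((x - c.x)**2 + (y - c.y)**2 < 0.9**2 for c in cells)
    for _ in range(3):                                  # three growth steps
        for cell in [c for c in colony if c.active]:
            duplicate(cell, colony, p_s=0.7, has_overlap=overlap)
    print(f"colony size after 3 steps: {len(colony)} cells")
```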
in the left-hand column of fig. 7 we depict the flow chart of the duplication procedure, whose steps are illustrated in the right-hand column.

figure 7: left: flow chart for the proliferation process of each cell. right: illustration of the stages and outcomes of the duplication process.

the duplication process starts when an active cell, the parent, creates a copy of itself, the new cell, in the same position. then a random angle $0 \leq r_\alpha < 2\pi$ is drawn from a uniform distribution and the new cell advances a certain distance in the direction defined by this angle, such that its border does not overlap with its parent's core. this distance was set to 95% of the cell diameter. if in its new position the border of the new cell overlaps with the core of another cell, it moves back to its initial position. another random direction is then defined, repeating the movement sequence. the new cell moves forward and back, randomly changing its direction, until there is no overlapping; we then say that an empty spot has been found. after 500 failed attempts to find an empty spot, the new cell is destroyed and the parent cell changes to its respective quiescent class. conversely, if the new cell finds an empty spot and its parent is a dcc, it sets its own class as active dcc. on the other hand, if the parent is a csc, a new random number $0 \leq r_d < 1$ is drawn and compared with $p_s$. if $r_d < p_s$ the new cell sets its class to active csc; otherwise the new cell class is set to active dcc. in the latter case a new random number $0 \leq r_m < 1$ is drawn, and if $r_m < 1/2$ the cells exchange their positions, leaving the csc in the new spot and the dcc in the position of its parent. note that quiescent cells are never requested to do anything; they do not move or duplicate in these simulations. their only role is to restrict the movement of active cells. as a consequence, the code runs very rapidly, even for large colonies.

[1] e batlle, h clevers, cancer stem cells revisited, nat. med. 23, 1124 (2017). [2] c a m la porta, s zapperi, j p sethna, senescent cells in growing tumors: population dynamics and cancer stem cells, plos comput. biol. 8, e1002316 (2012). [3] n a lobo, y shimono, d qian, m f clarke, the biology of cancer stem cells, annu. rev. cell dev. biol. 23, 675 (2007). [4] j stingl, c caldas, molecular heterogeneity of breast carcinomas and the cancer stem cell hypothesis, nat. rev. cancer 7, 791 (2007). [5] y shimono, m zabala, r w cho, n lobo, p dalerba, d qian, m diehn, h liu, s p panula, e chiao, f m dirbas, g somlo, r a reijo pera, k lao, m f clarke, downregulation of mirna-200c links breast cancer stem cells with normal stem cells, cell 138, 592 (2009).
[6] d hanahan, r a weinberg, the hallmarks of cancer, cell 100, 57 (2000). [7] p jagust, b de luxán-delgado, b parejo-alonso, p sancho, metabolism-based therapeutic strategies targeting cancer stem cells, front. pharmacol. 10, 203 (2019). [8] a waisman, f sevlever, m elías costa, m s cosentino, s g miriuka, a c ventura, a s guberman, cell cycle dynamics of mouse embryonic stem cells in the ground state and during transition to formative pluripotency, sci. rep. 9, 8051 (2019). [9] l benítez, l barberis, c a condat, modeling tumorspheres reveals cancer stem cell niche building and plasticity, physica a 533, 121906 (2019). [10] l benítez, l barberis, c a condat, understanding the influence of substrate when growing tumorspheres, bmc cancer (in press) (2021). [11] j wang, x liu, z jiang, l li, z cui, y gao, d kong, x liu, a novel method to limit breast cancer stem cells in states of quiescence, proliferation or differentiation: use of gel stress in combination with stem cell growth factors, oncol. lett. 12, 1355 (2016). [12] y c chen, p n ingram, s fouladdel, s p mcdermott, e azizi, m s wicha, e yoon, high-throughput single-cell derived sphere formation for cancer stem-like cell identification and analysis, sci. rep. 6, 27301 (2016). [13] l persano, e rampazzo, a della puppa, f pistollato, g basso, the three-layer concentric model of glioblastoma: cancer stem cells, microenvironmental regulation, and therapeutic implications, thescientificworldjournal 11, 1829 (2011). [14] p li, c zhou, l xu, h xiao, hypoxia enhances stemness of cancer stem cells in glioblastoma: an in vitro study, int. j. med. sci. 10, 399 (2013). [15] c zhang, y tian, f song, c fu, b han, y wang, salinomycin inhibits the growth of colorectal carcinoma by targeting tumor stem cells, oncol. rep. 34, 2469 (2015). [16] s shankar, d nall, s-n tang, d meeker, j passarini, j sharmar, k srivastava, resveratrol inhibits pancreatic cancer stem cell characteristics in human and krasg12d transgenic mice by inhibiting pluripotency maintaining factors and epithelial-mesenchymal transition, plos one 6, e16530 (2011). [17] a schneider, d spitkovsky, p riess, m molcanyi, n kamisetti, m maegele, j hescheler, u schaefer, "the good into the pot, the bad into the crop!" – a new technology to free stem cells from feeder cells, plos one 3, e3788 (2008). [18] a chen, l wang, s liu, y wang, y liu, m wang, attraction and compaction of migratory breast cancer cells by bone matrix proteins through tumor-osteocyte interactions, sci.
rep. 8, 5420 (2018). [19] j p freyer, r m sutherland, regulation of growth saturation and development of necrosis in emt6/ro multicellular spheroids by the glucose and oxygen supply, cancer res. 46, 3504 (1986). [20] c a condat, s a menchón, ontogenetic growth of multicellular tumor spheroids, physica a 371, 76 (2006). [21] s a menchón, c a condat, cancer growth: predictions of a realistic model, phys. rev. e 78, 022901 (2008). [22] l barberis, c a condat, describing interactive growth using vector universalities, ecol. model. 227, 56 (2012). [23] f yonezawa, s sakamoto, m hori, percolation in two-dimensional lattices. i. a technique for the estimation of thresholds, phys. rev. b 40, 636 (1989). [24] k christensen, n r moloney, complexity and criticality, imperial college press, london, uk (2005). [25] m i gonzález, p centres, w lebrecht, a j ramirez-pastor, f nieto, site-bond percolation on triangular lattices: monte carlo simulation and analytical approach, physica a 392, 6330 (2013). [26] k iwata, k kawasaki, n shigesada, a dynamical model for the growth and size distribution of multiple metastatic tumors, j. theor. biol. 203, 177 (2000). [27] s a menchón, c a condat, modeling tumor cell shedding, eur. biophys. j. 38, 479 (2009). [28] j b mcgillen, e a gaffney, n k martin, p k maini, a general reaction-diffusion model of acidity in cancer invasion, j. of math. biol. 68, 1199 (2014). [29] a rhodes, t hillen, a mathematical model for the immune-mediated theory of metastasis, j. theor. biol. 482, 109999 (2019). [30] a i liapis, g g lipscomb, o k crosser, e tsiroyianni-liapis, a model of oxygen diffusion in absorbing tissue, math. modell. 3, 83 (1982). [31] p j robinson, s i rapoport, model for drug uptake by brain tumors: effects of osmotic treatment and of diffusion in brain, j. cerebr. blood f. met. 10, 153 (1990). [32] y jiang, j pjesivac-grbovic, c cantrell, j p freyer, a multiscale model for avascular tumor growth, biophys. j. 89, 3884 (2005). [33] j a bull, f mech, t quaiser, s l waters, h m byrne, mathematical modelling reveals cellular dynamics within tumour spheroids, plos comput. biol. 16, e1007961 (2020). [34] j w gray, evidence emerges for early metastasis and parallel evolution of primary and metastatic tumors, cancer cell 4, 4 (2003). [35] h hosseini, m m obradović, m hoffmann, k l harper, m s sosa, m werner-klein, l k nanduri, c werno, c ehrl, m maneck, n patwary, g haunschild, m gužvić, c reimelt, m grauvogl, n eichner, f weber, a d hartkopf, f a taran, s y brucker, t fehm, b rack, s buchholz, r spang, g meister, j a aguirre-ghiso, c a klein, early dissemination seeds metastasis in breast cancer, nature 540, 552 (2016). [36] n linde, m casanova-acebes, m s sosa, a mortha, a rahman, e farias, k harper, e tardio, i reyes torres, j jones, j condeelis, m merad, j a aguirre-ghiso, macrophages orchestrate breast cancer early dissemination and metastasis, nature commun. 9, 1 (2018). [37] l barberis, m a pasquale, c a condat, joint fitting reveals hidden interactions in tumor growth, j. theor. biol. 365, 420 (2015). 
papers in physics, vol. 14, art. 140005 (2022) received: 7 july 2021, accepted: 18 february 2022 edited by: a. goñi licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.140005 www.papersinphysics.org issn 1852-4249

thermophysical behavior of mercury-lead liquid alloy

n. panthi1,2∗, i. b. bhandari1, i. koirala1†
∗ narayan.755711@cdp.tu.edu.np, † iswar.koirala@cdp.tu.edu.np
1 central department of physics, tribhuvan university, kirtipur, kathmandu, nepal. 2 department of physics, patan multiple campus, tribhuvan university, nepal.

thermophysical properties of the compound forming binary liquid mercury-lead alloy at a temperature of 600 k are reported as a function of concentration by considering the hgpb2 complex, using different modelling equations. thermodynamic properties such as the gibbs free energy, enthalpy of mixing and chemical activity of each component, and microscopic properties such as the concentration fluctuation in the long-wavelength limit and the warren-cowley short-range order parameter of the alloy, are studied by quasi-chemical approximation. this research paper places additional emphasis on the interaction energy parameters between the atoms of the alloy. the theoretical and experimental data are compared to determine the model's validity. a compound formation model, a statistical mechanical technique, and an improved derivation of the butler equation have all been used to investigate surface tension. the alloy's viscosity is investigated using the kozlov-romanov-petrov equation, the kaptay equation, and the budai-benko-kaptay model. the study depicts a weak interaction in the alloy, and the theoretical thermodynamic data derived at 600 k are in good agreement with the experimental results. the surface tension differs slightly in the compound formation model from the statistical mechanical approach and the butler equation at greater bulk concentrations of lead. the estimated viscosities in the three models are substantially identical.

i introduction

the knowledge of thermophysical characteristics of alloys is regarded as a necessary foundation for developing novel materials. the creation of an alloy is linked to changes in the structure of a system as well as bonding between the constituent atoms.
the subject is more intricately understood by studying the interaction and structural rearrangement of the constituent atoms during alloy formation. the electrochemical effect, atom size, and constituent element concentration all influence the alloy's mixing properties, causing atoms of particular elements to align in either a self-coordinated or a strongly ordered pattern [1–4]. the alloying properties of liquid alloys vary as a function of composition, temperature, and pressure, all of which are important for the materials' strength, stability, and electrical resistivity. as a result, metallurgists and physicists have been interested in understanding the mixing behavior of metals that produce alloys. however, due to experimental difficulties as well as time limitations, the study of various alloys' characteristics is still incomplete. different theoreticians have produced numerous concentration-dependent theoretical models to comprehend the alloying behavior of compound forming binary alloys, in order to address such challenges, facilitate study and speed up the investigation process [5–7]. because of their direct impact on human health, mercury and lead are among the most studied metals. our study focuses on one of the lead alloys, hg-pb, to theoretically determine various properties at 600 k, assuming an hgxpby (x = 1, y = 2) complex in the melt, by using compound formation models [6]. lead, being very soft and ductile, is often used commercially in lead alloys [8]. zabdyr [9] explored the phase diagram, crystal structure and lattice parameters by varying the atomic weight percentage of hg, but a detailed thermophysical investigation is still lacking. the properties under investigation include the gibbs free energy of mixing, enthalpy of mixing, chemical activity, concentration fluctuation in the long-wavelength limit and warren-cowley short-range order parameter of the alloy. similarly, the concentration-dependent surface tension and viscosity of binary liquid alloys are investigated, these being among the quantities most desired in metallurgical research for specifying the surface and transport properties of liquid mixtures; as such, scientists have been attempting to investigate these aspects by proposing several models [10–16]. furthermore, surface segregation, which primarily refers to the concentration disparity between the alloy's surface and bulk, is one of the most essential elements to be investigated in metallurgical research. the difference in surface energy between the alloy's constituent elements is the fundamental source of this disparity, the element with the lower surface energy tending to segregate to the surface [17]. theoretical study indicates that the atom with the larger size tends to segregate to the surface of a liquid alloy [18]. the present work also aims to study the surface tension of the alloy with a compound formation model [13]. due to a lack of experimental data, the computed result is compared with two other models: a statistical mechanical approach [12] and an improved derivation of the butler equation [16]. for the study of viscosity, this work employs three models: the kozlov-romanov-petrov equation [11], the kaptay equation and the budai-benko-kaptay model [10].
ii theoretical formulation

i thermodynamic functions

let a binary alloy contain $N_A$ and $N_B$ atoms of species a and b, respectively. the model assumes the existence of complexes $A_xB_y$ such that
$$x\,A + y\,B = A_xB_y, \qquad (1)$$
where $x$ and $y$ are small integers. with this assumption, the grand partition function in terms of the configurational energy [6] is solved, and the excess free energy of mixing is obtained as given in eq. (2), from which various properties are obtained:
$$G_M^{xs} = RT\int_0^c \ln\gamma\, dc, \qquad (2)$$
where $\gamma$ is the activity coefficient ratio of atom a to b, $c$ is the concentration of atom a and $R$ is the universal gas constant. after a simple mathematical calculation, the solution of eq. (2) is given as
$$G_M^{xs} = N\left[\theta\,\omega + \theta_{AB}\,\Delta\omega_{AB} + \theta_{AA}\,\Delta\omega_{AA} + \theta_{BB}\,\Delta\omega_{BB}\right], \qquad (3)$$
where $\theta = c(1-c)$ and the $\theta_{jk}$ are simple polynomials in $c$ that depend on the values of the integers $x$ and $y$, $\omega$ is the interchange energy, and the $\Delta\omega_{jk}$ are the interaction energy parameters. for a = hg, b = pb, $x = 1$, $y = 2$, the values of the $\theta_{jk}$ are found to be [6,19]
$$\theta_{AA}(c) = 0, \qquad (4)$$
$$\theta_{AB}(c) = \tfrac{1}{6}c + c^2 - \tfrac{5}{3}c^3 + \tfrac{1}{2}c^4, \qquad (5)$$
$$\theta_{BB}(c) = -\tfrac{1}{4}c + \tfrac{1}{2}c^2 - \tfrac{1}{4}c^4. \qquad (6)$$
the gibbs free energy of mixing for complex formation is given by
$$G_M = G_M^{xs} + G_M^{ideal} = G_M^{xs} + RT\left[c\ln c + (1-c)\ln(1-c)\right] = RT\left[\theta\frac{\omega}{k_BT} + \theta_{AB}\frac{\Delta\omega_{AB}}{k_BT} + \theta_{AA}\frac{\Delta\omega_{AA}}{k_BT} + \theta_{BB}\frac{\Delta\omega_{BB}}{k_BT} + c\ln c + (1-c)\ln(1-c)\right]. \qquad (7)$$
here $\theta_{AA}$ is taken as zero because, according to the model used, the value of $x$ is 1. in this case the probability of an a-a pair being part of the complex is zero, such that the coefficient of $\Delta\omega_{AA}/k_BT$ in eq. (7) also tends to zero. if there are no complexes in the alloy, then the $\Delta\omega_{jk}$ are zero. in such a case, the above equation takes the form
$$G_M = RT\left[\theta\frac{\omega}{k_BT} + c\ln c + (1-c)\ln(1-c)\right]. \qquad (8)$$
the enthalpy of mixing is calculated with the standard thermodynamic relation:
$$\frac{H_M}{RT} = \frac{G_M}{RT} - \left[\frac{dG_M}{R\,dT}\right]_{c,N,P} = \theta\left[\frac{\omega}{k_BT} - \frac{1}{k_B}\frac{d\omega}{dT}\right] + \theta_{AB}\left[\frac{\Delta\omega_{AB}}{k_BT} - \frac{1}{k_B}\frac{d\Delta\omega_{AB}}{dT}\right] + \theta_{BB}\left[\frac{\Delta\omega_{BB}}{k_BT} - \frac{1}{k_B}\frac{d\Delta\omega_{BB}}{dT}\right]. \qquad (9)$$
the activity of each constituent element of the alloy follows from the standard thermodynamic relation
$$RT\ln a_j = G_M + (1-c_j)\left[\frac{\partial G_M}{\partial c_j}\right]_{T,P,N}, \quad j = A, B. \qquad (10)$$
now, by solving eqs. (7) and (10), the theoretical value of the activity of each constituent component is given as follows:
$$\ln a_A = \frac{G_M}{RT} + (1-c)\left[\frac{(1-2c)\,\omega + \theta'_{AB}\Delta\omega_{AB} + \theta'_{BB}\Delta\omega_{BB}}{k_BT} + \ln\frac{c}{1-c}\right], \qquad (11)$$
$$\ln a_B = \frac{G_M}{RT} - c\left[\frac{(1-2c)\,\omega + \theta'_{AB}\Delta\omega_{AB} + \theta'_{BB}\Delta\omega_{BB}}{k_BT} + \ln\frac{c}{1-c}\right], \qquad (12)$$
where $\theta'_{AB}$, $\theta'_{AA}$ and $\theta'_{BB}$ are the derivatives of $\theta_{AB}$, $\theta_{AA}$ and $\theta_{BB}$ with respect to concentration.

ii microscopic functions

the concentration fluctuation in the long-wavelength limit, $S_{cc}(0)$, for the alloy is given by the relation [20]
$$S_{cc}(0) = RT\left[\frac{\partial^2 G_M}{\partial c^2}\right]^{-1}_{T,P,N}. \qquad (13)$$
the value of $S_{cc}(0)$ can also be obtained from the experimentally observed activities with the help of eq. (14); the values of $S_{cc}(0)$ obtained from this equation are therefore called experimental values:
$$S_{cc}(0) = a_A(1-c)\left[\frac{\partial a_A}{\partial c_A}\right]^{-1}_{T,P,N} = a_B\, c\left[\frac{\partial a_B}{\partial c_B}\right]^{-1}_{T,P,N}, \qquad (14)$$
where $a_A$ and $a_B$ are the observed activities of elements a and b, respectively. for simplicity, we can write $c$ and $1-c$ in place of $c_A$ and $c_B$, respectively. solving eqs. (7) and (14), the value of $S_{cc}(0)$ is found as
$$S_{cc}(0) = c(1-c)\left[1 + c(1-c)\left(-\frac{2\omega}{k_BT} + \theta''_{AB}\frac{\Delta\omega_{AB}}{k_BT} + \theta''_{BB}\frac{\Delta\omega_{BB}}{k_BT}\right)\right]^{-1}, \qquad (15)$$
where the $\theta''_{jk}$ are the second concentration derivatives of the $\theta_{jk}$.
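as a small numerical check of eq. (7), the sketch below evaluates $G_M/RT$ with the interaction parameters quoted later in the results section ($\omega/k_BT = 2.139$, $\Delta\omega_{AB}/k_BT = -2.264$, $\Delta\omega_{BB}/k_BT = 0.392$); the helper names are ours, and the script simply reproduces a minimum of about $-0.53RT$ near a concentration of 0.6, consistent with the value reported below:

```python
import numpy as np

# interaction parameters quoted in the results section, in units of k_B*T
W, W_AB, W_BB = 2.139, -2.264, 0.392

def theta_AB(c):
    return c / 6 + c**2 - 5 * c**3 / 3 + c**4 / 2      # eq. (5)

def theta_BB(c):
    return -c / 4 + c**2 / 2 - c**4 / 4                 # eq. (6)

def gibbs_mixing(c):
    """G_M/RT from eq. (7), with theta_AA = 0 as required by x = 1."""
    ideal = c * np.log(c) + (1 - c) * np.log(1 - c)
    return c * (1 - c) * W + theta_AB(c) * W_AB + theta_BB(c) * W_BB + ideal

if __name__ == "__main__":
    c = np.linspace(0.05, 0.95, 19)        # concentration variable of eq. (7)
    gm = gibbs_mixing(c)
    i = np.argmin(gm)
    print(f"minimum G_M/RT = {gm[i]:.3f} at c = {c[i]:.2f}")
```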
the warren-cowley short-range order parameter ($\alpha_1$) is related to the concentration fluctuation in the long-wavelength limit [21,22] as
$$\alpha_1 = \frac{s-1}{s(z-1)+1}, \qquad (16)$$
where $z$ is the coordination number and
$$s = \frac{S_{cc}(0)}{S_{cc}^{id}(0)}. \qquad (17)$$

iii transport property: viscosity

at the microscopic level, the mixing nature of a molten alloy may be examined in terms of viscosity, which provides the basis for some of the most fundamental theories concerning atomic transport qualities. it is regarded as one of the most important thermophysical quantities in metallurgical research, which primarily deals with industrial processes and a variety of natural occurrences. it is influenced by factors such as the liquid's composition, cohesion energy, and molar volume [23, 24]. the composition dependence of the viscosity at 600 k is computed to examine the atomic transport features of the hg-pb alloy. due to the lack of experimental data, the viscosity is compared using three different models: the kozlov-romanov-petrov equation, the kaptay equation, and the bbk (budai-benko-kaptay) model.

a kozlov-romanov-petrov equation

in liquids, viscous flow depends on cohesive interaction; this interaction results from geometric and electronic shell effects [25]. the krp equation has been developed to incorporate the cohesion interaction in terms of an enthalpic effect in order to describe the viscous flow in a liquid alloy. at temperature $T$, the equation is given as
$$\ln\eta = c\ln\eta_A + (1-c)\ln\eta_B - \frac{H_M}{3RT}, \qquad (18)$$
where $\eta$ is the viscosity of the alloy and $\eta_A$, $\eta_B$ are the viscosities of the individual elements a and b, respectively. for the metals, the variation of viscosity with temperature is given as [26]
$$\eta_j = \eta_0\exp\left(\frac{\epsilon}{RT}\right), \qquad (19)$$
where $\eta_0$ and $\epsilon$ are constants of each metal with units of viscosity and energy per mole, respectively.

b kaptay equation

kaptay developed an equation by considering the theoretical relationship between the cohesive energy and the activation energy of viscous flow. at temperature $T$, the equation is
$$\eta = \frac{hN_{av}}{cV_A + (1-c)V_B + V^E}\exp\left(\frac{cG_A + (1-c)G_B - \varphi H_M}{RT}\right), \qquad (20)$$
where $h$ is planck's constant, $N_{av}$ is avogadro's number, $V_j$ ($j = A, B$) is the molar volume of the pure metal, $V^E$ is the excess molar volume upon alloy formation, $G_j$ is the gibbs energy of activation of viscous flow in the pure metals, and $\varphi$ is a constant whose value is $(0.155 \pm 0.015)$ [27]. the gibbs energy of activation of a pure metal is calculated with the following equation:
$$G_j = RT\ln\left(\frac{\eta_j V_j}{hN_{av}}\right). \qquad (21)$$

c bbk (budai-benko-kaptay) model

the bbk model is used for the viscosity of multicomponent alloys. at temperature $T$, it is given as
$$\eta = L\,T^{1/2}\left[cM_A + (1-c)M_B\right]^{1/2}\left[cV_A + (1-c)V_B + V^E\right]^{-2/3}\exp\left[\left(cT_{m,A} + (1-c)T_{m,B} - \frac{H_M}{\chi R}\right)\frac{I}{T}\right], \qquad (22)$$
where $L$ and $I$ are constants whose values are $(1.80\pm0.39)\times10^{-8}\,(\mathrm{J/(K\,mol^{1/3})})^{1/2}$ and $(2.34\pm0.02)$, respectively, and $\chi$ is a semi-empirical parameter with a value of 25.4. similarly, $M_j$ and $T_{m,j}$ are, respectively, the molar mass and the effective melting temperature of the constituent elements of the alloy.
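as a usage illustration of the viscosity models above, the following sketch implements eqs. (18) and (19) for the krp route; the numerical values in the example call are placeholders, not the tabulated hg and pb data of ref. [26]:

```python
import math

R = 8.314  # universal gas constant, J/(mol K)

def eta_pure(eta0, eps, T):
    """eq. (19): Arrhenius-type viscosity of a pure metal; eta0 and eps must be
    taken from the tables of ref. [26] (not reproduced here)."""
    return eta0 * math.exp(eps / (R * T))

def eta_krp(c, eta_A, eta_B, H_M, T):
    """eq. (18), the krp mixing rule: ln(eta) is the concentration-weighted
    average of the pure-metal values minus the enthalpic correction H_M/(3RT)."""
    return math.exp(c * math.log(eta_A) + (1 - c) * math.log(eta_B) - H_M / (3 * R * T))

# purely illustrative call: made-up pure-metal viscosities (Pa s) and H_M (J/mol)
print(f"eta = {eta_krp(c=0.5, eta_A=1.0e-3, eta_B=2.0e-3, H_M=-500.0, T=600.0):.2e} Pa s")
```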
iv surface property

in metallurgy and industry, the surface properties (surface tension and surface concentration) of a liquid alloy or liquid metal are considered to be prime factors for the processing, as well as the production, of new materials, due to their relation with both surface and interface in the liquid metal process [28,29]. the interfacial motion caused by the surface tension of a liquid plays a major role in many industrial phenomena, hence the importance given to the surface and interfacial behaviors of liquid metals in the metallurgical processes of solidification and the control of welding and casting [30].

a compound formation model

the model assumes that there is a compound forming tendency in the binary liquid alloy similar to the compound forming tendency in the solid state, in the form of short-ranged volume elements, due to the formation of the intermetallic compound $A_xB_y$ in the melt. the equation in this model is developed by using the grand partition function, similarly to the quasi-chemical approximation. the equation at temperature $T$ is given below:
$$\sigma = \sigma_A + \frac{k_BT}{\rho}\ln\frac{c^s}{c} + \frac{\omega}{\rho}\left[p(f^s - f) - qf\right] + \frac{\Delta\omega_{AB}}{\rho}\left[p(f^s_{AB} - f_{AB}) - qf_{AB}\right] + \frac{\Delta\omega_{BB}}{\rho}\left[p(f^s_{BB} - f_{BB}) - qf_{BB}\right] \qquad (23)$$
$$= \sigma_B + \frac{k_BT}{\rho}\ln\frac{1-c^s}{1-c} + \frac{\omega}{\rho}\left[p(\phi^s - \phi) - q\phi\right] + \frac{\Delta\omega_{AB}}{\rho}\left[p(\phi^s_{AB} - \phi_{AB}) - q\phi_{AB}\right] + \frac{\Delta\omega_{BB}}{\rho}\left[p(\phi^s_{BB} - \phi_{BB}) - q\phi_{BB}\right], \qquad (24)$$
where $\phi$, $f$, $\phi_{jk}$ and $f_{jk}$ are bulk concentration functions. similarly, $\phi^s$, $f^s$, $\phi^s_{jk}$ and $f^s_{jk}$ are surface concentration functions, and $\rho$ is the mean area of the surface per atom. for $x = 1$ and $y = 2$, the bulk concentration functions are [13,31]:
$$\phi = c^2, \qquad (25)$$
$$\phi_{AB} = \tfrac{1}{6} + 2(1-c) - 6(1-c)^2 + \tfrac{16}{3}(1-c)^3 - \tfrac{3}{2}(1-c)^4, \qquad (26)$$
$$\phi_{BB} = -\tfrac{1}{4} + (1-c) - \tfrac{1}{2}(1-c)^2 + (1-c)^3 - \tfrac{3}{4}(1-c)^4, \qquad (27)$$
$$f = (1-c)^2, \qquad (28)$$
$$f_{AB} = (1-c)^2 + \tfrac{10}{3}(1-c)^3 - \tfrac{3}{2}(1-c)^4, \qquad (29)$$
$$f_{BB} = -(1-c)^2 + \tfrac{3}{4}(1-c)^4. \qquad (30)$$
the functions $\phi^s$, $\phi^s_{jk}$, $f^s$ and $f^s_{jk}$ can be obtained from eqs. (26) to (30) by replacing the bulk concentration $c$ with the surface concentration $c^s$, while $p$ and $q$ are surface coordination fractions that indicate the fraction of the number of nearest neighbors of an atom within its own layer and in the adjoining layers, respectively, and are related as $p + 2q = 1$. in a simple cubic crystal, $p = 2/3$ and $q = 1/6$; in a bcc crystal, $p = 3/5$ and $q = 1/5$; and in a close-packed crystal, $p = 1/2$ and $q = 1/4$. the mean atomic surface area is given by the following relation [13]:
$$\rho = \sum_j c_j\rho_j. \qquad (31)$$
the atomic surface area of each component is given as
$$\rho_j = 1.012\left(\frac{V_j}{N_{av}}\right)^{2/3}. \qquad (32)$$

b statistical mechanical approach

this method is mainly based on the concept of a layered structure near the interface. the model connects the surface tension to thermodynamic properties through the activity coefficient ($\gamma_j$) and the interchange energy between the components of an alloy. the equation at temperature $T$ is given below:
$$\sigma = \sigma_j + \frac{k_BT}{\rho}\ln\frac{c_j^s}{c_j\gamma_j} + \left[p(1-c_j^s)^2 - q(1-c_j)^2\right]\frac{\omega}{\rho}. \qquad (33)$$

c improved derivation of the butler equation

according to this model, there exists a monoatomic layer, called the surface monolayer, at the surface of a liquid as a separate phase, and it is in thermodynamic equilibrium with the bulk phase. the surface tension $\sigma$ of the binary alloy at temperature $T$ is given by the improved butler equation as
$$\sigma = \frac{S_j^0}{S_j}\sigma_j^0 + \frac{RT}{S_j}\ln\frac{c_j^s}{c_j^b} + \frac{G_j^{s,xs} - G_j^{b,xs}}{S_j}, \qquad (34)$$
where $\sigma_j^0$, $S_j^0$ and $S_j$ are the surface tension of the pure liquid metal, the molar surface area of the pure liquid metal, and the partial molar surface area of the jth component, respectively. $G_j^{s,xs}$ and $G_j^{b,xs}$ are the partial excess free energies of mixing in the surface and bulk of the constituent elements of the alloy, respectively.
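in practice, eqs. (23)-(24) and (33) are solved numerically for the surface concentration; the sketch below does this for the simpler statistical mechanical form, eq. (33), by requiring that the expressions written for the two components give the same surface tension. all numerical inputs are illustrative placeholders (not the hg-pb values used in the paper) and the helper names are ours:

```python
import numpy as np
from scipy.optimize import brentq

kB = 1.380649e-23  # boltzmann constant, J/K

def sigma_branch(sig0, cs, cb, gamma, omega, rho, p, q, T):
    """one branch of eq. (33), written for a single component (SI units)."""
    return (sig0 + kB * T / rho * np.log(cs / (cb * gamma))
            + (p * (1.0 - cs)**2 - q * (1.0 - cb)**2) * omega / rho)

def surface_concentration(sig_A, sig_B, cb, gam_A, gam_B, omega, rho, p, q, T):
    """solve eq. (33) written for components A and B simultaneously: the surface
    concentration of A is the value for which both branches give the same sigma."""
    def mismatch(cs):
        return (sigma_branch(sig_A, cs, cb, gam_A, omega, rho, p, q, T)
                - sigma_branch(sig_B, 1.0 - cs, 1.0 - cb, gam_B, omega, rho, p, q, T))
    return brentq(mismatch, 1e-6, 1.0 - 1e-6)

# illustrative placeholder inputs only (not the hg-pb values of the paper):
# surface tensions in N/m, omega in J, rho in m^2 per atom, close-packed p, q
cs_A = surface_concentration(sig_A=0.41, sig_B=0.45, cb=0.5, gam_A=1.0, gam_B=1.0,
                             omega=0.7 * kB * 600.0, rho=8.5e-20, p=0.5, q=0.25, T=600.0)
print(f"surface concentration of the low-tension component: {cs_A:.3f}")
```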
the molar surface area of a pure component is given as
$$S_j^0 = \delta\left(\frac{M_j^0}{\lambda_j^0}\right)^{2/3} N_{av}^{1/3}, \qquad (35)$$
where $\delta$, $M_j^0$, $\lambda_j^0$ and $N_{av}$ are the geometrical constant, the molar mass, the density of each constituent element at its melting temperature, and avogadro's number, respectively. the value of the geometrical constant is expressed as
$$\delta = \left(\frac{3f_v}{4}\right)^{2/3}\frac{\pi}{f_s}, \qquad (36)$$
where $f_v$ is the volume packing fraction and $f_s$ is the surface packing fraction. for a liquid metal, the values of $f_v$ and $f_s$ are 0.66 and 0.906, respectively [33].

iii results and discussion

i thermodynamic and microscopic properties

generally, the properties of binary liquid alloys depend on temperature, composition, and pressure.

figure 1: gibbs free energy of mixing versus bulk concentration of pb (theoretical and experimental).

our study of the binary alloy hg-pb is carried out at fixed atmospheric pressure and a fixed temperature of 600 k, as a function of the composition of the alloy. during the study we assumed a complex with $x = 1$ and $y = 2$, and computed different thermodynamic and structural properties for compound forming molten alloys. the results thus obtained are outlined in the sections below.

a thermodynamic properties

for the analysis of the thermodynamic properties we consider eqs. (7), (9), (11), and (12), as mentioned above. for the gibbs free energy of mixing, the interaction energy parameters are determined by the method of successive approximation for several concentrations, following the stoichiometry of the hgpb2 alloy, with the help of experimental values in the concentration range 0.1 to 0.9 [34]. the approximated values of the energy parameters are as follows:
$$\frac{\omega}{k_BT} = 2.139, \qquad \frac{\Delta\omega_{AB}}{k_BT} = -2.264, \qquad \frac{\Delta\omega_{BB}}{k_BT} = 0.392.$$
to calculate the interaction energy parameters, no statistical methods such as mean square deviation were used to decide the best-fit values; hence the parameters thus obtained are considered reasonable for the analysis and have been used throughout the study of the different mixing properties.

figure 2: enthalpy of mixing versus bulk concentration of pb (theoretical and experimental).

the computed values of $G_M/RT$ are in good agreement with the experimental values, as shown in fig. 1. the theoretically computed free energy of mixing has a minimum of $-0.533RT$ at 0.6 concentration of pb. the calculation of the free energy of mixing indicates that the hg-pb alloy in the molten state is weakly interacting. similarly, being asymmetric about 0.5 concentration, it is classified as an irregular alloy. the temperature derivatives of the interaction energy parameters, which are used for the theoretical calculation of the enthalpy of mixing, are obtained by the method of successive approximation. the best-fit approximated values of these parameters are:
$$\frac{1}{k_B}\frac{d\omega}{dT} = 0.767, \qquad \frac{1}{k_B}\frac{d\Delta\omega_{AB}}{dT} = -0.3128, \qquad \frac{1}{k_B}\frac{d\Delta\omega_{BB}}{dT} = 0.429.$$
the plot of the enthalpy of mixing versus the concentration of lead is shown in fig. 2. it is positive below 0.6 concentration of lead, while above this concentration it is negative, and both the computed and experimental values of the enthalpy of mixing are in agreement, with small discrepancies.
the deviation of the alloy from ideal behavior can be examined through the chemical activity, a measure of the effective concentration in the mixture, as its magnitude depends on the interaction of the constituent binary components of the alloy. equations (11) and (12) are used for the calculation of the chemical activity of the elements of the hg-pb alloy.

figure 3: chemical activity of hg and pb versus bulk concentration of pb (theoretical and experimental).

figure 3 plots the experimental and theoretically computed values of the chemical activity of the components of the alloy, showing good agreement between the experimental and theoretical activities at a temperature of 600 k at all concentrations of pb.

b microscopic properties

it is difficult to perform diffraction experiments on materials at high temperatures. thus, to make the study of the local arrangement of atoms in the binary alloy more effective, the concentration fluctuations in the long-wavelength limit, $S_{cc}(0)$, are considered some of the most important microscopic functions [20, 35].

figure 4: concentration fluctuation in the long-wavelength limit versus bulk concentration of pb (theoretical, ideal and experimental).

figure 5: warren-cowley short-range order parameter versus bulk concentration of pb.

for any given concentration, if $S_{cc}(0) < S_{cc}^{id}(0)$ the alloy is expected to be of a complex forming nature, and if $S_{cc}(0) > S_{cc}^{id}(0)$ the expected nature of the alloy is segregating. the graph of the experimental, theoretical and ideal values of $S_{cc}(0)$ versus the concentration of pb is shown in fig. 4. in the figure, both the experimental and theoretical values of $S_{cc}(0)$ lie above the ideal value for lead concentrations below 0.6, indicating that the alloy has a segregating nature below this concentration of lead, while above this concentration it exhibits an ordering nature. the warren-cowley short-range order parameter ($\alpha_1$) is considered one of the most powerful parameters for information regarding the arrangement of atoms in liquid alloys. it provides quantitative information about the degree of local arrangement of atoms in the alloy. its value lies between +1 and −1. a positive value of $\alpha_1$ is considered an indication of a segregating nature, which is complete for $\alpha_1 = 1$, whereas a negative value indicates an ordering nature, which is complete for $\alpha_1 = -1$. similarly, the value $\alpha_1 = 0$ indicates a random arrangement of atoms in the liquid mixture. the value of $\alpha_1$ computed as a function of the concentration of pb using eq. (16), with coordination number $z = 10$, is shown in fig. 5. it is observed that $\alpha_1$ is positive up to a 0.6 concentration of lead, with its highest value at a concentration of 0.2, indicating a strong segregating tendency of the alloy. above a 0.6 concentration of lead, the value of $\alpha_1$ keeps decreasing, showing the ordering tendency of the alloy.

figure 6: viscosity of the hg-pb alloy versus bulk concentration of pb, computed with the krp, kaptay and bbk models.

ii viscosity

for the theoretical calculation of the viscosity of the hg-pb alloy at 600 k, the viscosities of each component (pb and hg) are required for the krp and kaptay models.
these values are obtained from eq. (19) after substituting the values of $\eta_0$ and $\epsilon$ of the metals as given in reference [26]. the value of the enthalpy for different concentrations is taken from eq. (9), and the gibbs energy of activation of each pure metal is obtained from eq. (21). due to the lack of an experimental value for $V^E$, it is taken as zero. in fact, the value of $V^E$ is non-zero for a non-ideal alloy, but the contribution of this term to the viscosity is very small [15]. the results obtained from the three models are compared in fig. 6. in all the models, the viscosity of the liquid alloy increases with increasing concentration of lead. the figure shows a small deviation of the viscosity computed by the bbk model compared to the others. since the theoretically computed results cannot be compared with experimental results, it is difficult to draw conclusions from the models about the concentration dependence of the viscosity of the hg-pb liquid alloy at 600 k.

iii surface segregation and surface tension

to calculate the surface tension of the hg-pb alloy, the densities and surface tensions of the individual metals required by all the models at 600 k are calculated using the relations given in reference [26]. for the compound formation model, the same interaction parameters $\omega$ and $\Delta\omega_{jk}$ used for the thermodynamic properties are employed.

figure 7: surface concentration of lead versus bulk pb concentration (butler, statistical and compound formation models).

now, inserting these values and the values of the other quantities of both metals into eqs. (23) and (24) and solving them simultaneously, we first obtain the surface concentrations of both metals; then, using each surface concentration of the corresponding metal, the surface tension is obtained. a similar method is applied to the other models. for the statistical mechanical approach, the interchange energy $\omega = 0.699$, obtained from eq. (8), is used. for the improved derivation of the butler model, the bulk and partial excess free energies of mixing of lead and mercury in the liquid state at 600 k are taken from reference [34]. the geometrical constant and the ratio of the surface excess energy to the bulk excess energy ($G_i^{s,xs}/G_i^{b,xs}$) are taken as 1.061 and 0.818, respectively [33]. kaptay suggested that, in the case of a negligible or unknown excess molar volume of the mixture, the partial molar volume can be replaced by the molar volume of the same component. in such a situation, the partial surface area $S_i$ is replaced by the surface area $S_i^0$ of the same pure component [16,36]. the computed values of the surface concentrations and surface tensions from all three models are compared in figs. 7 and 8, respectively. figure 7 shows an increasing surface concentration of pb with increasing bulk concentration of lead in all models. at 600 k the surface tension of mercury is less than the surface tension of lead, which suggests a surface segregating tendency of hg.

figure 8: surface tension versus bulk pb concentration (butler, statistical and compound formation models).

thus, at higher bulk concentrations of pb, the two different atoms of the alloy are involved in the formation of chemical complexes or intermetallic compounds, assumed to be hgpb2, but at lower bulk concentrations of pb the surface of the alloy is enriched with hg atoms. in fig.
8, the surface tension of alloy hg-pb increases gradually with the increase in bulk concentration of pb. the variation of surface tension in the compound formation model at higher bulk concentration of pb than in the other two models is believed to be the cause of consideration of set of the interaction energy parameters because, as we already mentioned, there is the possibility of compound formation at higher bulk concentrations of pb. the compound formation model is expected to give better results than the other two models due to the presence of interaction parameters. however, due to the lack of experimental results, computed results cannot be compared. iv conclusions the present study is a theoretical analysis for the understanding of thermodynamic, structural, transport and surface behavior of the binary liquid alloy hg-pb at 600 k under the assumption of the existence of the hgpb2 complex in the liquid mixture by compound formation model. the study explains the asymmetric behavior of the thermodynamic properties as a function of concentration as well as of a weakly interacting alloy. the theoretical study shows that the alloy has the nature of segregating at a lower concentration of pb, but it shows an ordering nature at higher concentration of pb at 600 k. similarly, the viscosity and surface tension increases with the increase in the concentration of lead. 140005-9 papers in physics, vol. 14, art. 140005 (2022) / n. panthi et al. acknowledgements we are grateful to university grants commission of nepal for research grant. [1] n panthi, i b bhandari, i s jha, i koirala, thermophysical behavior of sodium lead alloy at different temperature, adv. stud. theor. phys. 15, 153 (2021). [2] y plevachuk, v sklyarchuk, g pottlacher, a yakymovych, o tkach, thermophysical properties of some liquid binary mg-based alloys, j. min. metall. b 53, 279 (2017). [3] i koirala, chemical ordering of ag-au alloys in the molten state, j. i. s. t. 22, 191 (2018). [4] d adhikari, b p singh, i s jha, phase separation in na–k liquid alloy, phase transit. 85, 675 (2012). [5] p j flory, thermodynamics of high polymer solutions, j. chem. phys. 10, 51 (1942). [6] a b bhatia, r n singh, thermodynamic properties of compound forming molten alloys in a weak interaction approximation, phys. chem. liq. 11, 343 (1982). [7] a b bhatia, r n singh, a quasi-lattice theory for compound forming molten alloys, phys. chem. liq. 13, 177 (1984). [8] a a nayeb-hashemi, j b clark, the mg-pb (magnesium-lead) system, bull. alloy phas. diagr. 6, 56 (1985). [9] l a zabdyr, c guminski, the hg-pb (mercury-lead) system, j. phase equilib. 14, 734 (1993). [10] j a v butler, the thermodynamics of the surfaces of solutions, proc. r. soc. lond. a 135, 348 (1932). [11] l y kozlov, l m romanov, n n petrov, prediction of multicomponent metallic melt viscosity, izv. vuz. chern metallurg 3, 7 (1983). [12] l c prasad, r n singh, a quasi-lattice model for the thermodynamic properties of au-zn liquid alloys, phys. chem. liq. 22, 1 (1990). [13] r novakovic, e ricci, d giuranno, f gnecco, surface properties of bi–pb liquid alloys, surf. sci. 515, 377 (2002). [14] r novakovic, m l muolo, a passerone, bulk and surface properties of liquid x–zr (x= ag, cu) compound forming alloys, surf. sci. 549, 281 (2004). [15] i budai, m z benkő, g kaptay, comparison of different theoretical models to experimental data on viscosity of binary liquid alloys, mater. sci. forum 537, 489 (2007). 
[16] g kaptay, improved derivation of the butler equations for surface tension of solutions, langmuir 35, 10987 (2019). [17] k a eldressi, h a eltawahni, m moradi, e r twiname, r e mistler, energy effects in bulk metals, in: reference module in materials science and materials engineering, elsevier (2019). [18] l c prasad, r n singh, g p singh, the role of size effects on surface properties, phys. chem. liq. 27, 179 (1994). [19] r n singh, short-range order and concentration fluctuations in binary molten alloys, can. j. phys. 65, 309 (1987). [20] a b bhatia, d e thornton, structural aspects of the electrical resistivity of binary alloys, phys. rev. b 2, 3004 (1970). [21] m cowley, short- and long-range order parameters in disordered solid solutions, phys. rev. 120, 1648 (1960). [22] b e warren, x-ray diffraction, addison-wesley pub. co., reading (1969). [23] d r poirier, g h geiger, transport phenomena in materials processing, tms publications, warrendale pa (1994). [24] t iida, r i guthrie, the thermophysical properties of metallic liquids: fundamentals, oxford university press, usa (2015). [25] a k starace, c m neal, b cao, m f jarrold, a aguado, j m lópez, correlation between the latent heats and cohesive energies of metal clusters, j. chem. phys. 129, 144702 (2008). [26] e a brandes, g b brook, smithells metals reference book, butterworth-heinemann, oxford (1992). [27] g kaptay, a unified equation for the viscosity of pure liquid metals, int. j. mater. res. 96, 24 (2005). [28] t iida, physical properties of liquid metals [iv] surface tension and electronic transport properties of liquid metals, weld. int. 8, 766 (1994). [29] i koirala, b p singh, i s jha, study of segregating nature in liquid al-ga alloys, scientific world 12, 14 (2014). [30] j u brackbill, d b kothe, c zemach, a continuum method for modeling surface tension, j. comput. phys. 100, 335 (1992).
[31] n jha, a k mishra, thermodynamic and surface properties of liquid mg–zn alloys, j. alloys compd. 329, 224 (2001). [32] p laty, j c joud, p desre, surface tensions of binary liquid alloys with strong chemical interactions, surf. sci. 60, 109 (1976). [33] g kaptay, a unified model for the cohesive enthalpy, critical temperature, surface tension and volume thermal expansion coefficient of liquid metals of bcc, fcc and hcp crystals, mater. sci. eng. a 495, 19 (2008). [34] r hultgren, p d desai, d t hawkins, m gleiser, k k kelley, selected values of the thermodynamic properties of binary alloys, asm, metals park, ohio (1973). [35] r novakovic, j brillo, thermodynamics, thermophysical and structural properties of liquid fe–cr alloys, j. mol. liq. 200, 153 (2014). [36] g kaptay, a coherent set of model equations for various surface and interface energies in systems with liquid and solid metals and alloys, adv. colloid interface sci. 283, 102212 (2020).

papers in physics, vol. 14, art. 140003 (2022) received: 3 january 2022, accepted: 2 february 2022 edited by: k. daniels, l. a. pugnaloni, j. zhao licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.140003 www.papersinphysics.org issn 1852-4249

challenges and opportunities in measuring time-resolved force chain evolution in 3d granular materials

ryan c. hurley1,2∗, chongpu zhai3†
∗ rhurley6@jhu.edu, † zhaichongpu@xjtu.edu.cn
1 mechanical engineering, johns hopkins university, baltimore, md 21218, usa. 2 hopkins extreme materials institute, johns hopkins university, baltimore, md 21218, usa. 3 laboratory for strength and vibration of mechanical structures, school of aerospace, xi'an jiaotong university, xi'an, 710049 china.

granular materials are found throughout nature and industry: in landslides, avalanches, and river beds, and also in pharmaceutics, food, and mineral processing. many behaviors of these materials, including the ways in which they pack, deform, flow, and transmit energy, can be fully understood only in the context of inter-particle forces. however, we lack techniques for measuring 3d inter-particle force evolution at subsecond timescales due to technological limitations. measurements of 3d force chain evolution at subsecond timescales would help validate and extend theories and models that explicitly or implicitly consider force chain dynamics in their predictions. here, we discuss open challenges associated with force chain evolution on these timescales, challenges limiting such measurements, and possible routes for overcoming these challenges in the coming decade.

i introduction

granular materials play prominent roles in landslides, avalanches, earthquakes, and river-bed mass transport, as well as in the pharmaceutical, food, and mineral processing industries [1–3].
the statistics, fluctuations, and organization of force chains – inter-particle forces with magnitudes greater than the average in a cohesion-less material – have been linked to: material stresses [4]; electrical and mechanical energy transport [5–8]; mechanical failure via particle fracture and inter-particle slip [9, 10]; mechanical failure of confining vessels due to stress concentrations [11]; stick-slip on granular fault gouge [12]; hot spot formation during compaction of energetic powders [13]; the dynamics of intruders penetrating granular beds [14]. however, measuring time-resolved 3d force chain evolution at timescales relevant to these and other applications remains a challenge due to technological limitations and the difficulty of such experiments. here, we therefore summarize several open challenges related to dynamic force chain evolution at subsecond timescales, describe the challenges associated with such measurements, and propose possible routes for overcoming these challenges in the coming decade.

i a brief history of force measurements

researchers have pursued techniques to measure inter-particle forces in granular media since at least the 1970s [19]. both 2d and 3d methods have been developed. 2d methods using photoelastic and rubber discs have been most commonly used [20, 21], as described in depth in several recent review articles [22, 23]. such methods must employ 2d grains that are more compliant and possess slower wave speeds than grains in many natural and industrial applications of granular materials. while the compliant nature of photoelastic grains may alter the interactions of grains relative to their natural counterparts, the slower wave speeds have a benefit: the ratio of compaction wave speed to loading velocity in photoelastic granular matter may be similar to that in dynamically-loaded natural sands even when the loading velocity in the photoelastic case is significantly slower. this change in the timescales of information flow allows comparable dynamic processes, such as penetration, to be imaged at significantly reduced frame rates in 2d photoelastic granular matter as compared to 3d natural granular matter [14]. despite the differences in compliance, wave speeds, and dimensionality, 2d photoelasticity studies have revealed the role of force chain statistics and fluctuations on mechanical properties [20], stick-slip [24], dynamic penetration and compaction [13, 14, 25], and electrical and mechanical wave transmission [6, 8]. a benefit of studies employing 2d photoelastic and rubber discs is that measurements can be made over very short timescales using high-speed imaging. this has enabled studies of force chain dynamics at subsecond (down to millisecond) timescales relevant to processes such as stick-slip, dynamic compaction, and wave transmission [7, 18, 24–26]. unfortunately, many dynamic processes in granular materials occur in 3d rather than 2d, motivating the need to quantify dynamically-evolving forces in 3d.
advances in 3d full-field imaging, including in confocal microscopy, refractive index-matched scanning (rims), and x-ray computed tomography and diffraction, have enabled the first force measurements in 3d over the past two decades [27–30]. these advances have recently enabled relating force chain evolution to quasi-static granular deformation, failure, and wave transmission [8, 9, 29]. a prevailing challenge impeding 3d measurements of dynamic force chain evolution is the limited timescales accessible by full-field imaging techniques. for example, in 2012, rims was possible with 10 ms exposure time per image, enabling full-field imaging of a 3d granular medium in about one second [27]. however, data transfer rates and maximum write speeds of hard drives were the limiting factors hindering faster rims measurements. even having overcome such technological limitations, it may be impossible to design a rims experiment to quantify subsecond force chain evolution in many cases of interest due to the interaction of index-matched fluids with granular dynamics on such timescales. as another example, 3d x-ray tomography measurements have been made in 20 seconds for glass beads and in less than one second for other materials such as batteries [31, 32]. although tomographic imaging rates are expected to improve with new laboratory [33] and synchrotron [34] capabilities, most tomographic imaging requires sample rotation, which induces centrifugal forces. such centrifugal forces are not present in many applications of interest for 3d granular materials and therefore limit the extent to which 3d force chain dynamics can be meaningfully studied using x-ray tomography. these challenges associated with 3d measurement techniques have limited the experimental study of dynamic force chain evolution in a variety of applications in which they are thought to be important. in the next section, we describe several such applications that would benefit from time-resolved force chain measurements in 3d. we then describe key developments that may provide opportunities for overcoming the current timescale limitations of 3d inter-particle force measurements in the coming years.
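the rotation-rate limitation mentioned above can be made concrete with a simple estimate: material at radius r in a sample spinning at f revolutions per second experiences a centripetal acceleration (2πf)²r. the sketch below compares that acceleration with gravity for a few illustrative scan rates; the radius and rates are assumed values chosen only to show the scaling, not numbers taken from the studies cited above.

```python
import numpy as np

g = 9.81  # m/s^2

def centripetal_acceleration(rev_per_s, radius_m):
    """Centripetal acceleration omega^2 * r at radius r for a sample
    spinning at rev_per_s revolutions per second."""
    omega = 2.0 * np.pi * rev_per_s
    return omega**2 * radius_m

# illustrative sample radius and rotation (scan) rates, assumed values
radius = 0.025  # 2.5 cm
for f in (0.05, 1.0, 5.0):
    a = centripetal_acceleration(f, radius)
    print(f"{f:5.2f} rev/s -> a/g = {a / g:.3e}")
```

even at about one scan per second the periphery of a centimetre-scale sample already feels a noticeable fraction of g, and the ratio grows quadratically with scan rate, which is why rotation-based tomography of dynamic processes is practically restricted to roughly 1 hz and slower.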
ii open challenges related to force chain evolution

the following list summarizes open challenges related to time-resolved 3d force chain evolution in granular materials, and how measurements of such evolution would benefit our understanding and predictive capacity. such measurements have not yet been made primarily because of the technological limitations and the difficulty of the associated experiments, as described in the prior section. possible routes to overcoming these limitations and difficulties are described in sec. iii. the list of open challenges is partially summarized in fig. 1.

figure 1: open challenges in granular mechanics involving force chain dynamics include (a) chain buckling in fault gouge, (b) chain dynamics in granular flows, (c) stress concentrations during dynamic compaction, (d) force chain evolution during creep at the onset of catastrophic flow visualized through diffusing wave spectroscopy (dws), and (e) force chain formation in discontinuous shear thickening. images adapted from [15–17] (open access) and [18] with permission.

1. stick-slip and intermittent flow (fig. 1(a)) – force chain buckling has been considered a possible mechanism of stick-slip and acoustic emissions in sheared granular fault gouge for decades [12, 35–38]. stress-drop events associated with stick-slip and force chain buckling likely occur over millisecond timescales [39]. experimental measurements of force chain buckling in granular fault gouge have been made only in model 2d photoelastic systems [24], and only with coarser time resolution (of the order of seconds) before and after stress-drop events. experiments employing 3d force chain measurements at short timescales, ideally with subsecond and approaching millisecond resolution, would test the notion that force chain buckling is responsible for stress drops and acoustic emissions in granular fault gouge. these measurements could be used to validate and extend models of fault gouge mechanics relevant to short timescales.

2. hypotheses underlying models of granular flow (fig. 1(b)) – mathematical models of granular flow, such as non-local fluidity and shear transformation zone theories, feature non-local laplacian terms and other quantities that capture the role of force fluctuations around the core of a rearrangement event [40–42]. such force fluctuations likely occur over the millisecond timescales associated with stress drops [43].
although the kinematics associated with such fluctuations have been measured over longer timescales in 2d [44], such measurements have not been made in 3d. experiments quantifying 3d force fluctuations around rearrangement events at millisecond to second timescales would provide some of the first in-situ data with which to calibrate, validate, and extend models that explicitly or implicitly incorporate such fluctuations (e.g., [45, 46]); a schematic illustration of how such a non-local term acts is sketched after this list.

3. stress concentrations and hot spots during dynamic compaction (fig. 1(c)) – rapid compaction of granular media plays an important role in defense, planetary science, and manufacturing processes [47]. in energetic materials, force chains that develop during rapid compaction are also thought to nucleate hot-spots that eventually lead to material ignition [48]. many compaction and ignition events of scientific and technological importance take place over millisecond or shorter timescales. prior rapid-compaction and impact studies investigating force chains have primarily employed 2d photoelastic and polymer discs [13, 25, 49], or numerical simulations. experiments quantifying force chain evolution during 3d granular compaction events would provide new insight into the joint evolution of porosity, stresses, and forces experienced by these materials, quantities currently available only through numerical modeling despite their critical importance for predicting the outcome of impact and material processing events.

4. creep and the onset of catastrophic flow (fig. 1(d)) – creep leading to catastrophic flow occurs in nature in a variety of granular media, including ice and snow prior to avalanches, soils prior to landslides, and river beds prior to submarine landslides [50]. glassy dynamics and force chain fluctuations have been implicated in the progression of creep leading to the onset of flow [50, 51]; however, direct measurement of force chain evolution throughout the creep process has not been made in 3d. two-dimensional measurements of creep have been made across a broad range of timescales and length scales using diffusing wave spectroscopy (dws) (pioneered for granular matter by crassous and colleagues [44, 52, 53]), but force measurements during the creep process have been made neither in 2d nor 3d. depending on the geometry and stresses imposed on a granular material, creep may occur over a range of timescales, from hours to seconds. experimental measurements of 3d force chain evolution across this range of timescales would validate conceptual models of creep and provide valuable data supporting mechanistic model development.

5. force chain formation in discontinuous shear thickening (fig. 1(e)) – discontinuous shear thickening (dst), the ubiquitous increase in viscosity with shear of flowing dense suspensions, is thought to be related to force chain development on millisecond timescales [54, 55]. discontinuous shear thickening has been studied both experimentally and computationally, but no in-situ measurements of force chain development have been made in either 2d or 3d in such studies. such direct measurements would provide vital information needed to calibrate and validate models of dst and related jamming phenomena in suspensions.
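as referenced in item 2 above, non-local flow models couple a local response to a laplacian (diffusive) term. purely as a schematic illustration of how such a term spreads the influence of a localized rearrangement over a cooperativity length ξ, and not as an implementation of the specific constitutive laws of refs. [40–42, 45, 46], the sketch below relaxes a 1d field g(x) toward a prescribed local value g_loc(x) under the steady-state balance g − ξ² d²g/dx² = g_loc. all names and parameter values are arbitrary.

```python
import numpy as np

def relax_nonlocal(g_loc, xi, dx, n_steps=20000):
    """Pseudo-time relaxation of g - xi^2 * d2g/dx2 = g_loc on a periodic
    1d grid (adequate here since the perturbation sits far from the edges)."""
    g = g_loc.copy()
    dt = 0.2 * dx**2 / max(xi**2, dx**2)  # small explicit step for stability
    for _ in range(n_steps):
        lap = (np.roll(g, -1) - 2.0 * g + np.roll(g, 1)) / dx**2
        g += dt * (g_loc - g + xi**2 * lap)
    return g

# a localized "rearrangement": elevated local fluidity in a narrow band
x = np.linspace(0.0, 1.0, 201)
dx = x[1] - x[0]
g_loc = np.where(np.abs(x - 0.5) < 0.02, 1.0, 0.1)
g = relax_nonlocal(g_loc, xi=0.05, dx=dx)
print("width of region with g > 0.2:", float(np.sum(g > 0.2) * dx))
```

the resulting profile decays away from the perturbed band over a distance of order ξ, which is the qualitative behavior the non-local terms in these models are meant to encode.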
iii future opportunities for time-resolved force chain measurement

the following list summarizes key advances that may enable measuring time-resolved force chain evolution in 3d. it is not meant to be exhaustive but rather to reflect the current perspective of the authors. the list primarily contains technological developments that may enable full-field measurements on millisecond to second timescales. this list is also summarized in fig. 2.

figure 2: possible approaches to dynamic force inference. (a) rims. (b) xrct. (c) x-ray tomography and diffraction. (d) an illustration of how forces are inferred using rims and xrct data for compliant particles. (e) time-resolved 2d x-ray imaging with 3d reconstruction, adapted from [30] with permission.

1. high-speed rims with image-based force inference (fig. 2(a) and (d)) – rims allows full-field imaging of granular materials submerged in index-matched fluids and, as described in sec. i, permits imaging rates around 1 hz [27]. by combining rims with quantitative analysis of particle deformation (e.g., in hydrogels [56]), inter-particle forces can be inferred in 3d. further advances in camera hardware and data storage capabilities may enable subsecond full-field imaging with rims. imaging rates currently accessible by rims make it amenable to studies of stick-slip, non-local constitutive laws used to predict slow granular flows, and creep. future subsecond imaging rates will improve resolution of individual force chain buckling events during such phenomena. pushing full-field rims imaging far below one-second acquisition times is unrealistic, however, due to the interaction of the index-matched fluid dynamics with the "true" dynamics of a granular material. nevertheless, we envision that carefully designed experiments employing rims may provide new and important insight into force chain fluctuations on second timescales in compliant materials like hydrogels. a toy sketch of such deformation-based force inference is given below.
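as a toy illustration of the deformation-based force inference mentioned in item 1 (and used again in item 2 below), a measured contact flattening δ between two soft elastic spheres can be mapped to a normal force with a hertzian contact law, f = (4/3) e* √r_eff δ^{3/2}. this is only a sketch under idealized assumptions (elastic, frictionless, small deformation); the actual inference pipelines of refs. [56, 57] solve a more complete inverse problem. the material and geometric values below are invented for illustration.

```python
import numpy as np

def hertz_normal_force(delta, R1, R2, E1, E2, nu1, nu2):
    """Hertzian normal force for two elastic spheres pressed together.

    delta : contact flattening (overlap) extracted from imaging, in m
    R1,R2 : particle radii in m;  E, nu : Young's moduli (Pa), Poisson ratios
    """
    R_eff = 1.0 / (1.0 / R1 + 1.0 / R2)
    E_star = 1.0 / ((1.0 - nu1**2) / E1 + (1.0 - nu2**2) / E2)
    return (4.0 / 3.0) * E_star * np.sqrt(R_eff) * np.asarray(delta) ** 1.5

# illustrative hydrogel-like particles (assumed properties)
deltas = np.array([5e-6, 20e-6, 50e-6])   # measured overlaps, m
F = hertz_normal_force(deltas, R1=1e-3, R2=1e-3,
                       E1=50e3, E2=50e3, nu1=0.45, nu2=0.45)
print(F)  # forces in newtons
```

in practice the measured quantity is the imaged particle shape, from which the overlap (or the full deformation field) must first be extracted before any such contact law can be applied.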
2. time-resolved 3d x-ray imaging for compliant materials (fig. 2(b) and (d)) – advances in micro-focus computed tomography hardware and software in the past decade provide a route to infer contact forces between compliant grains (e.g., [57]) in synchrotron [58] or laboratory [33] settings by using quantitative analysis of particle deformation (e.g., in rubber [57]). imaging rates can approach 1 hz but, as described in sec. i, are now limited by the maximum rotation rates at which centrifugal forces become significant. a gantry-based x-ray device eliminates the need for sample rotation and therefore eliminates centrifugal forces [33]; however, such a device also has x-ray flux limitations that practically limit full-field imaging to about 1 hz. imaging rates of about 1 hz make x-ray tomography amenable to studies of stick-slip, non-local constitutive laws used to predict slow granular flows, and creep. as is the case for rims, x-ray tomography alone only allows inter-particle force measurements for compliant materials, and the timescale capabilities again limit access to the dynamics of force chain buckling events. nevertheless, we envision that carefully designed experiments employing x-ray tomography – either rotation-based or gantry-based – may provide new and important insight into stick-slip, non-local constitutive laws, and creep in compliant materials.

3. time-resolved 3d x-ray imaging and diffraction for stiff materials (fig. 2(c)) – rapid advances in 3d x-ray diffraction have, in the past five years, provided a means to study forces in stiff sand-like materials [29], shedding light on the role of forces in material failure, ultrasound transmission, and rearrangements [8, 9]. the technical approach to studying forces with x-ray imaging and diffraction involves using tomography to quantify packing structure (particle shapes, sizes, and contacts) and diffraction to quantify particle stress tensors. measurements have historically been made in as little as about 10 minutes. upgrades in x-ray flux and hardware at synchrotron facilities are likely to bring acquisition rates up to about 1 hz in the coming decades. pushing below one-second acquisition times is impractical due to the need for sample rotation, which would impart significant centrifugal forces to the sample. combined imaging and diffraction will therefore allow studies of stick-slip, non-local constitutive laws used to predict slow granular flows, and creep in stiff materials, but will not provide access to the subsecond dynamics of force chain buckling.

4. time-resolved 2d imaging with 3d reconstruction (fig. 2(e)) – time-resolved 2d imaging, using laboratory-based x-ray systems or synchrotron-based x-ray phase contrast imaging, has emerged as a powerful tool for studying dynamic compaction and flow of granular materials [30, 59–61]. these imaging techniques provide only 2d images, but allow temporal image spacing as low as 153 ns at synchrotron facilities and 30 hz in laboratory settings [30, 62]. while such imaging is not 3d, novel algorithms have emerged in the last decade that enable reconstruction of 3d microstructures from 2d images, including flash x-ray tomography, projection-based digital volume correlation, and fourier-space reconstruction methods [30, 61, 63–65]. we envision that these techniques, coupled with appropriate deformation-based force inference algorithms, can be used to study force chain dynamics at time scales relevant to all of the open challenges described in sec. ii. the challenge of using this approach for studying force chain dynamics is to develop robust algorithms for 3d microstructure reconstruction and force inference from 2d time-resolved images.

iv discussion and conclusion

time-resolved 3d force chain measurements at millisecond to second timescales have been challenging due to hardware and technical limitations. in this paper, we articulated five topics in which dynamic force chain evolution plays an important role across these timescales: stick-slip and intermittent flow; hypotheses underlying granular flow models; stress concentrations and hot spots during dynamic compaction; creep prior to catastrophic flow; force chain formation in discontinuous shear thickening. we also articulated four opportunities for making force chain measurements across a range of time scales, some or all of which should be possible within the next decade. these included: high-speed rims; time-resolved 3d x-ray imaging in compliant materials; time-resolved 3d x-ray imaging and diffraction for stiff materials; time-resolved 2d imaging with 3d reconstruction.
the challenge to interested researchers in the granular materials community is to carefully develop experiments and robust algorithms that take advantage of technological developments for such force measurements in the coming decade. acknowledgements rch acknowledges support from the u.s. national science foundation career award no. cbet-1942096. [1] h m jaeger, s r nagel, r p behringer, granular solids, liquids, and gases, rev. mod. phys. 68, 1259 (1996). [2] r m nedderman, et al., statics and kinematics of granular materials, cambridge university press, cambridge (1992). [3] m oda, k iwashita, mechanics of granular materials: an introduction, crc press, london (2020). [4] f radjai, d e wolf, m jean, j-j moreau, bimodal character of stress transmission in granular packings, phys. rev. lett. 80, 61 (1998). [5] c zhai, y gan, d hanaor, g proust, stressdependent electrical transport and its universal scaling in granular materials, extreme mech. lett. 22, 83 (2018). [6] o birkholz, y gan, m kamlah, modeling the effective conductivity of the solid and the pore phase in granular materials using resistor networks, powder technol. 351, 54 (2019). [7] e t owens, k e daniels, sound propagation and force chains in granular materials, europhys. lett. 94, 54005 (2011). [8] c zhai, e b herbold, r c hurley, the influence of packing structure and interparticle forces on ultrasound transmission in granular media, proc. natl. acad. sci. u.s.a. 117, 16234 (2020). [9] r hurley, j lind, d pagan, m akin, e herbold, in situ grain fracture mechanics during uniaxial compaction of granular solids, j. mech. phys. solids 112, 273 (2018). [10] c zhai, e herbold, s hall, r hurley, particle rotations and energy dissipation during mechanical compression of granular materials, j. mech. phys. solids 129, 19 (2019). [11] l vanel, p claudin, j-p bouchaud, m cates, e clément, j wittmer, stresses in silos: comparison between theoretical models and new experiments, phys. rev. lett. 84, 1439 (2000). [12] b ferdowsi, m griffa, r guyer, p johnson, c marone, j carmeliet, microslips as precursors of large slip events in the stick-slip dynamics of sheared granular layers: a discrete 140003-6 https://doi.org/10.1103/revmodphys.68.1259 https://doi.org/10.1103/revmodphys.68.1259 https://doi.org/10.1201/9781003077817 https://doi.org/10.1201/9781003077817 https://doi.org/10.1103/physrevlett.80.61 https://doi.org/10.1016/j.eml.2018.05.005 https://doi.org/10.1016/j.eml.2018.05.005 https://doi.org/10.1016/j.powtec.2019.04.005 https://doi.org/10.1209/0295-5075/94/54005 https://doi.org/10.1209/0295-5075/94/54005 https://doi.org/10.1073/pnas.2004356117 https://doi.org/10.1073/pnas.2004356117 https://doi.org/10.1016/j.jmps.2017.12.007 https://doi.org/10.1016/j.jmps.2017.12.007 https://doi.org/10.1016/j.jmps.2019.04.018 https://doi.org/10.1016/j.jmps.2019.04.018 https://doi.org/10.1103/physrevlett.84.1439 papers in physics, vol. 14, art. 140003 (2022) / r. c. hurley et al. element model analysis, geophys. res. lett. 40, 4194 (2013). [13] s bardenhagen, j brackbill, dynamic stress bridging in granular material, j. appl. phys. 83, 5732 (1998). [14] a h clark, a j petersen, r p behringer, collisional model for granular impact dynamics, phys. rev. e 89, 012201 (2014). [15] t hatano, c narteau, p shebalin, common dependence on stress for the statistics of granular avalanches and earthquakes, sci. rep. 5, 1 (2015). [16] t t vo, s nezamabadi, p mutabaruka, jy delenne, f radjai, additive rheology of complex granular flows, nat. commun. 11, 1 (2020). 
[17] n s deshpande, d j furbish, p e arratia, d j jerolmack, the perpetual fragility of creeping hillslopes, nat. commun. 12, 1 (2021). [18] r hurley, k lim, g ravichandran, j andrade, dynamic inter-particle force inference in granular materials: method and application, exp. mech. 56, 217 (2016). [19] a drescher, g d j de jong, photoelastic verification of a mechanical model for the flow of a granular material, j. mech. phys. solids 20, 337 (1972). [20] t s majmudar, r p behringer, contact force measurements and stress-induced anisotropy in granular materials, nature 435, 1079 (2005). [21] r hurley, e marteau, g ravichandran, j e andrade, extracting inter-particle forces in opaque granular materials: beyond photoelasticity, j. mech. phys. solids 63, 154 (2014). [22] k e daniels, j e kollmer, j g puckett, photoelastic force measurements in granular materials, rev. sci. instrum. 88, 051808 (2017). [23] a a zadeh, j barés, t a brzinski, k e daniels, et al., enlightening force chains: a review of photoelasticimetry in granular matter, granul. matter 21, 1 (2019). [24] n w hayman, l ducloué, k l foco, k e daniels, granular controls on periodicity of stick-slip events: kinematics and force-chains in an experimental fault, pure appl. geophys. 168, 2239 (2011). [25] a shukla, c damania, experimental investigation of wave velocity and dynamic contact stresses in an assembly of disks, exp. mech. 27, 268 (1987). [26] a thomas, n vriend, photoelastic study of dense granular free-surface flows, phys. rev. e 100, 012902 (2019). [27] j a dijksman, f rietz, k a lőrincz, m van hecke, w losert, invited article: refractive index matched scanning of dense granular materials, rev. sci. instrum. 83, 011301 (2012). [28] j brujić, s f edwards, d v grinev, i hopkinson, d brujić, h a makse, 3d bulk measurements of the force distribution in a compressed emulsion system, faraday discuss. 123, 207 (2003). [29] r hurley, s hall, j andrade, j wright, quantifying interparticle forces and heterogeneity in 3d granular materials, phys. rev. lett. 117, 098005 (2016). [30] a gupta, r crum, c zhai, k ramesh, r hurley, quantifying particle-scale 3d granular dynamics during rapid compaction from timeresolved in situ 2d x-ray images, j. appl. phys. 129, 225902 (2021). [31] m scheel, r seemann, m brinkmann, m di michiel, a sheppard, b breidenbach, s herminghaus, morphological clues to wet granular pile stability, nat. mater. 7, 189 (2008). [32] d p finegan, m scheel, j b robinson, et al., in-operando high-speed tomography of lithiumion batteries during thermal runaway, nat. commun. 6, 1 (2015). [33] l hunter, j dewanckele, evolution of microct: moving from 3d to 4d, microscopy today 29, 28 (2021). 
140003-7 https://doi.org/10.1002/grl.50813 https://doi.org/10.1002/grl.50813 https://doi.org/10.1063/1.367429 https://doi.org/10.1063/1.367429 https://doi.org/10.1103/physreve.89.012201 https://doi.org/10.1038/srep12280 https://doi.org/10.1038/srep12280 https://doi.org/10.1038/s41467-020-15263-3 https://doi.org/10.1038/s41467-020-15263-3 https://doi.org/10.1038/s41467-021-23979-z https://doi.org/10.1007/s11340-015-0063-8 https://doi.org/10.1007/s11340-015-0063-8 https://doi.org/10.1016/0022-5096(72)90029-4 https://doi.org/10.1016/0022-5096(72)90029-4 https://doi.org/10.1038/nature03805 https://doi.org/10.1038/nature03805 https://doi.org/10.1016/j.jmps.2013.09.013 https://doi.org/10.1063/1.4983049 https://doi.org/10.1007/s10035-019-0942-2 https://doi.org/10.1007/s00024-011-0269-3 https://doi.org/10.1007/s00024-011-0269-3 https://doi.org/10.1007/bf02318093 https://doi.org/10.1007/bf02318093 https://doi.org/10.1103/physreve.100.012902 https://doi.org/10.1103/physreve.100.012902 https://doi.org/10.1063/1.3674173 https://doi.org/10.1039/b204414e https://doi.org/10.1039/b204414e https://doi.org/10.1103/physrevlett.117.098005 https://doi.org/10.1103/physrevlett.117.098005 https://doi.org/10.1063/5.0051642 https://doi.org/10.1063/5.0051642 https://doi.org/10.1038/nmat2117 https://doi.org/10.1038/nmat2117 https://doi.org/10.1038/ncomms7924 https://doi.org/10.1038/ncomms7924 https://doi.org/10.1017/s1551929521000651 https://doi.org/10.1017/s1551929521000651 papers in physics, vol. 14, art. 140003 (2022) / r. c. hurley et al. [34] j deng, c preissner, j a klug, s mashrafi, et al., the velociprobe: an ultrafast hard x-ray nanoprobe for high-resolution ptychographic imaging, rev. sci. instrum. 90, 083701 (2019). [35] r l biegel, c g sammis, j h dieterich, the frictional properties of a simulated gouge having a fractal particle distribution, j. struct. geol. 11, 827 (1989). [36] c g sammis, s j steacy, the micromechanics of friction in a granular layer, pure appl. geophys. 142, 777 (1994). [37] a tordesillas, j zhang, r behringer, buckling force chains in dense granular assemblies: physical and numerical experiments, geomech. geoengin. 4, 3 (2009). [38] a rechenmacher, s abedi, o chupin, evolution of force chains in shear bands in sands, géotechnique 60, 343 (2010). [39] k gao, r guyer, e rougier, c x ren, p a johnson, from stress chains to acoustic emission, phys. rev. lett. 123, 048003 (2019). [40] m l falk, m toiya, w losert, shear transformation zone analysis of shear reversal during granular flow, arxiv preprint arxiv:0802.1752 (2008). [41] l bocquet, a colin, a ajdari, kinetic theory of plastic flow in soft glassy materials, phys. rev. lett. 103, 036001 (2009). [42] k kamrin, g koval, nonlocal constitutive relation for steady granular flow, phys. rev. lett. 108, 178301 (2012). [43] d v denisov, k a lőrincz, w j wright, t c hufnagel, a nawano, x gu, j t uhl, k a dahmen, p schall, universal slip dynamics in metallic glasses and granular matter–linking frictional weakening with inertial effects, sci. rep. 7, 1 (2017). [44] a le bouil, a amon, s mcnamara, j crassous, emergence of cooperativity in plasticity of soft glassy materials, phys. rev. lett. 112, 246001 (2014). [45] d l henann, k kamrin, a predictive, sizedependent continuum model for dense granular flows, p. natl. acad. sci. usa 110, 6730 (2013). [46] k kamrin, d l henann, nonlocal modeling of granular flows down inclines, soft matter 11, 179 (2015). 
[47] k wünnemann, g collins, h melosh, a strain-based porosity model for use in hydrocode simulations of impacts and implications for transient crater growth in porous targets, icarus 180, 514 (2006). [48] m j cherukara, t c germann, e m kober, a strachan, shock loading of granular ni/al composites. part 1: mechanics of loading, j. phys. chem. c 118, 26377 (2014). [49] h kocharyan, n karanjgaokar, influence of interactions between multiple point defects on wave scattering in granular media, granul. matter 24, 1 (2022). [50] d j jerolmack, k e daniels, viewing earth’s surface as a soft-matter landscape, nat. rev. phys. 1, 716 (2019). [51] e andò, j dijkstra, e roubin, c dano, e boller, a peek into the origin of creep in sand, granul. matter 21, 1 (2019). [52] m erpelding, a amon, j crassous, diffusive wave spectroscopy applied to the spatially resolved deformation of a solid, phys. rev. e 78, 046104 (2008). [53] a amon, a bruand, j crassous, e clément, et al., hot spots in an athermal system, phys. rev. lett. 108, 135502 (2012). [54] c-p hsu, s n ramakrishna, m zanini, n d spencer, l isa, roughness-dependent tribology effects on discontinuous shear thickening, p. natl. acad. sci. usa 115, 5117 (2018). [55] r seto, r mari, j f morris, m m denn, discontinuous shear thickening of frictional hard-sphere suspensions, phys. rev. lett. 111, 218301 (2013). 140003-8 https://doi.org/10.1063/1.5103173 https://doi.org/10.1063/1.5103173 https://doi.org/10.1016/0191-8141(89)90101-6 https://doi.org/10.1016/0191-8141(89)90101-6 https://doi.org/10.1007/bf00876064 https://doi.org/10.1007/bf00876064 https://doi.org/10.1080/17486020902767347 https://doi.org/10.1080/17486020902767347 https://doi.org/10.1680/geot.2010.60.5.343 https://doi.org/10.1103/physrevlett.123.048003 https://arxiv.org/pdf/0802.1752.pdf https://arxiv.org/pdf/0802.1752.pdf https://doi.org/10.1103/physrevlett.103.036001 https://doi.org/10.1103/physrevlett.103.036001 https://doi.org/10.1103/physrevlett.108.178301 https://doi.org/10.1103/physrevlett.108.178301 https://doi.org/10.1038/srep43376 https://doi.org/10.1038/srep43376 https://doi.org/10.1103/physrevlett.112.246001 https://doi.org/10.1103/physrevlett.112.246001 https://doi.org/10.1073/pnas.1219153110 https://doi.org/10.1073/pnas.1219153110 https://doi.org/10.1039/c4sm01838a https://doi.org/10.1039/c4sm01838a https://doi.org/10.1016/j.icarus.2005.10.013 https://doi.org/10.1021/jp507795w https://doi.org/10.1021/jp507795w https://doi.org/10.1007/s10035-021-01177-4 https://doi.org/10.1007/s10035-021-01177-4 https://doi.org/10.1038/s42254-019-0111-x https://doi.org/10.1038/s42254-019-0111-x https://doi.org/10.1007/s10035-018-0863-5 https://doi.org/10.1103/physreve.78.046104 https://doi.org/10.1103/physreve.78.046104 https://doi.org/10.1103/physrevlett.108.135502 https://doi.org/10.1103/physrevlett.108.135502 https://doi.org/10.1073/pnas.1801066115 https://doi.org/10.1073/pnas.1801066115 https://doi.org/10.1103/physrevlett.111.218301 https://doi.org/10.1103/physrevlett.111.218301 papers in physics, vol. 14, art. 140003 (2022) / r. c. hurley et al. [56] n brodu, j a dijksman, r p behringer, spanning the scales of granular materials through microscopic force imaging, nat. commun. 6, 1 (2015). [57] m saadatfar, a p sheppard, t j senden, a j kabla, mapping forces in a 3d elastic assembly of grains, j. mech. phys. solids 60, 55 (2012). [58] x xiao, f fusseis, f de carlo, x-ray fast tomography and its applications in dynamical phenomena studies in geosciences at advanced photon source, proc. 
spie 8506, 107 (2012). [59] n d parab, b claus, m c hudspeth, j t black, a mondal, j sun, k fezzaa, x xiao, s luo, w chen, experimental assessment of fracture of individual sand particles at different loading rates, int. j. impact eng. 68, 8 (2014). [60] b jensen, d montgomery, a iverson, c carlson, b clements, m short, d fredenburg, xray phase contrast imaging of granular systems, in: shock phenomena in granular and porous materials, eds. t vogler, d fredenburg, pag. 195, springer (2019). [61] j baker, f guillard, b marks, i einav, xray rheography uncovers planar granular flows despite non-planar walls, nat. commun. 9, 1 (2018). [62] m rutherford, j derrick, d chapman, g collins, d eakins, insights into local shockwave behavior and thermodynamics in granular materials from tomography-initialized mesoscale simulations, j. appl. phys. 125, 015902 (2019). [63] m khalili, s brisard, m bornert, p aimedieu, j-m pereira, j-n roux, discrete digital projections correlation: a reconstruction-free method to quantify local kinematics in granular media by x-ray tomography, exp. mech. 57, 819 (2017). [64] e andò, b marks, s roux, single-projection reconstruction technique for positioning monodisperse spheres in 3d with a divergent x-ray beam, meas. sci. technol. 32, 095405 (2021). [65] f guillard, b marks, i einav, dynamic x-ray radiography reveals particle size and shape orientation fields during granular flow, sci. rep. 7, 1 (2017). 140003-9 https://doi.org/10.1038/ncomms7361 https://doi.org/10.1038/ncomms7361 https://doi.org/10.1016/j.jmps.2011.10.001 https://doi.org/10.1016/j.jmps.2011.10.001 https://doi.org/10.1117/12.936331 https://doi.org/10.1016/j.ijimpeng.2014.01.003 https://doi.org/10.1016/j.ijimpeng.2014.01.003 https://doi.org/10.1007/978-3-030-23002-9_7 https://doi.org/10.1007/978-3-030-23002-9_7 https://doi.org/10.1007/978-3-030-23002-9_7 https://doi.org/10.1038/s41467-018-07628-6 https://doi.org/10.1038/s41467-018-07628-6 https://doi.org/10.1063/1.5048591 https://doi.org/10.1063/1.5048591 https://doi.org/10.1007/s11340-017-0263-5 https://doi.org/10.1007/s11340-017-0263-5 https://doi.org/10.1088/1361-6501/abfbfe https://doi.org/10.1088/1361-6501/abfbfe https://doi.org/10.1038/s41598-017-08573-y https://doi.org/10.1038/s41598-017-08573-y introduction a brief history of force measurements open challenges related to force chain evolution future opportunities for time-resolved force chain measurement discussion and conclusion papers in physics, vol. 14, art. 140004 (2022) received: 20 may 2021, accepted: 7 october 2021 edited by: a. b. márquez licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.140004 www.papersinphysics.org issn 1852-4249 experimental study on the similarity of gas discharge in low-pressure argon gaps prijil mathew1∗, sajith t. mathews1, paul issac1, p. j. kurian1 through experiments and theoretical analysis, we investigated the similarity of gas discharge in low-pressure argon gaps between two plane-parallel electrodes. we found that the breakdown voltages depended not only on gap length and the product of gas pressure and gap length, but also on the aspect ratio of the gap, i.e. ub = f(pd, d/r). when we considered similar discharge gaps, the radius r, gap length d and gas pressure p fulfilled the conditions of p1r1 = p2r2 and p1d1 = p2d2. in this situation the reduced field e/p was also constant. the voltage-current characteristic curves of similar gaps were approximately the same, which is a novel experimental result. 
comparison of the discharge physical parameters of the scaled-down gap and prototype gap shows that the proportional relations can be derived from the similarity law. our experimental results provide some instructions on extrapolating between two similar gaps and their discharge properties. application of the similarity law is straightforward when we scale the discharges up or down if they are too small or too large.

∗prijilmk@gmail.com 1 department of physics, st. berchmans college campus, mahatma gandhi university, kottayam, 686101 kerala, india.

i introduction

paschen's famous law states that the breakdown voltage of a gas gap does not depend individually on the gap length d and gas pressure p, but depends on their product pd; i.e. ub = f(pd) [1–9]. according to townsend, paschen's law is a unique case, with a uniform electric field, of a more general similarity theorem which can be used for breakdowns in non-uniform fields if they are dependent on ionisation by electron collision with neutral particles [2, 3, 10, 11]. von engel has discussed and summarised the similarity theorem [6, 12–14]. he specified that under certain conditions similar discharges can be produced in gaps that have the same geometrical shape but different linear dimensions. it is also notable that the similar discharges have all the physical properties, such as density of the charged particles and current density, in the correct proportions, and also display similar voltage-current characteristics [2, 10, 15–19]. it is also possible to use the known properties of the discharge in one gap to derive the characteristics of the discharges in another geometrically identical gap due to the similarity of gas discharge; this is useful in cases where experimental studies may not be practical or even possible [2, 10, 16, 20]. one pre-condition for such an experiment is to verify that there is a similar discharge in the specified geometrically similar gaps. progress with similar discharges has recently been made in contexts ranging from microdischarges to the huge glow discharge of the international thermonuclear experimental reactor (iter) and picosecond pulse discharges [2, 5, 10, 21, 22]. in this paper we have used experiments and theoretical analysis to investigate the similarity of gas discharge in low-pressure argon gaps between two plane-parallel electrodes. the results show that the breakdown voltages of these gaps depend not only on the product of gas pressure (p) and gap length (d) but also on the aspect ratio of the gap; i.e. ub = f(pd, d/r). theoretically, it has also been shown that ub = f(pd, d/r), for the non-uniform electric field between plane-parallel electrodes, is a special case of the similarity theorem of gas discharge [2, 10, 16]. the experiments show that similar glow discharges exist only in two gaps with a limited scaled-down factor k. similarly, the theoretical analysis shows that processes such as stepwise ionisation and inelastic collisions of the second kind violate the similarity of the discharge as k increases. the voltage-current (v−i) characteristics of the glow discharge region studied in similar conditions also confirm the similarity in the gas breakdown.

ii conditions for similar discharges

the first necessary condition for similar discharges in two geometrically similar gaps is that the product of gas pressure (p) and gap length (d) for these two gaps should be the same [2, 10, 19], i.e.
p1d1 = p2d2, which ensures that the total number of collisions for one electron to cross the gap is the same [23]. the second condition is that the reduced field in these two gaps should be the same [2, 19, 24–27]; i.e. e1/p1 = e2/p2 for uniform electric fields, or e1(p1x1)/p1 = e2(p2x2)/p2 for non-uniform fields at the corresponding points where p1x1 = p2x2, thus ensuring that the average energy of the electrons is the same [16, 23, 28]. one additional condition is required for similar discharges in two geometrically similar gaps: the discharges in these two gaps should be dominated by the physical processes allowable for a similar discharge, known as allowed processes [2, 29]. many physical processes occur in a gas discharge, such as stepwise ionisation, ionisation by single collision, diffusion, photoionisation, penning ionisation, recombination and electron attachment [2, 10, 30]. in appendix 1 of his book, von engel has shown how to test whether a process is forbidden or allowable for a similar discharge. an allowed process is any process in which the rate of change of particle density fulfils the condition stated in eq. (1) [2, 29, 31]; otherwise it is a forbidden process, which is not forbidden for gas discharge as such, but is not allowed for a similar discharge [2, 10, 29]. it is not possible to distinguish whether a discharge is dominated by forbidden or allowed processes, and this is not controllable.

\left( \frac{dn}{dt} \right)_{\mathrm{gap\,1}} = \frac{1}{k^{3}} \left( \frac{dn}{dt} \right)_{\mathrm{gap\,2}}, \qquad (1)

where k is the scaled-down factor, i.e. the ratio of the linear dimension of gap 1 to that of gap 2 [2, 10, 32], and n is the particle density.

iii experimental setup

figure 1: experimental setup.

the setup consisted of a cylindrical vacuum chamber made of stainless steel, about 30 cm in diameter and 80 cm in length. an aluminium stand was placed at a height of about 100 cm from the ground, on which the chamber was mounted horizontally. a digital pirani gauge (model ivdg – 1000) was also attached to the aluminium stand, showing the pressure inside the vacuum chamber in millibars. a rotary pump was connected to the cylindrical chamber to evacuate it, and a gas inlet was used to fill the chamber with gas. a glass discharge tube was placed inside the vacuum chamber. the electrodes (anode and cathode) were placed inside the discharge tube using a wilson feedthrough arrangement from the end of the glass tube. this arrangement enabled us to change the separation during the experiment. the electrodes were made of stainless steel of about 1 mm thickness and a diameter of a few centimeters (5 cm, 8 cm and 10 cm, see table 1). thin circular mica sheets of about 7 cm diameter were placed around the electrodes to prevent field lines extending beyond the electrodes. we used a dc voltage supply that varied over a range of 0 to 1000 v, with a maximum output current of 1 a. to measure and limit the discharge current we connected a variable resistor in series.

figure 2: paschen's curves (breakdown voltage in v versus pd in torr cm) for constant electrode radius (r = 4 cm) and varying inter-electrode distance (d = 30, 40, 50 cm; d/r = 7.5, 10, 12.5).

table 1: parameter values chosen for similarity verification.
gap     d [cm]   r [cm]   p [mbar]   d/r    a
gap 1   50.0     5.0      0.20       10     1.00
gap 2   40.0     4.0      0.25       10     1.25
gap 3   25.0     2.5      0.40       10     2.00
gap 4   25.0     5.0      0.20       5      1.00
gap 5   20.0     4.0      0.25       5      1.25
gap 6   12.5     2.5      0.40       5      2.00
gap 7   5.0      5.0      0.20       1      1.00
gap 8   4.0      4.0      0.25       1      1.25
gap 9   2.5      2.5      0.40       1      2.00

figure 3: paschen's curves (breakdown voltage in v versus pd in torr cm) for the same d/r value (d/r = 10; d = 25, 40 and 50 cm with r = 2.5, 4 and 5 cm).

iv similarity in gas breakdown

in 1928 townsend revealed that the breakdown voltage ub for a longer gap was higher than that for a shorter gap, even with an equal value of pd [5, 11, 28, 33–36], i.e. ub = f(p, d) ≠ f(pd). consequently, paschen's curves for gaps with different d values do not superimpose onto each other. in this paper, this phenomenon was investigated by measuring the breakdown voltages of low-pressure argon gaps between two plane-parallel electrodes. a schematic representation of the experimental setup is shown in fig. 1. a dc voltage was applied across the electrodes. figure 2 shows typical results with different d/r ratios, where r is the radius of the electrodes. from fig. 2 we see that as d/r increases the paschen curves move to the right and upwards. in fig. 3 we observe that the curves with an equal value of d/r superimpose onto each other. from the experiments we conclude that the breakdown voltage of these gaps depends on two factors: the product of gap length and gas pressure, and the aspect ratio of the gap, i.e. ub = f(pd, d/r). the electric field that exists in the gap between two parallel electrodes is determined by d/r, and in the case of a non-uniform electric field the breakdown voltage would be a function of not only pd but also d/r. the value of e/p is the same for the same value of d/r, and when the d/r value is different the field distributions are also different [16]. in fact, the distribution of the electric field is a function of d/r. a mathematical expression is obtained by a polynomial fit of the profile of the electric field [35, 37, 38]:

\frac{e}{e_{\mathrm{av}}} = f\!\left(\frac{x}{d}, \frac{d}{r}\right) \quad \text{or} \quad \frac{e}{p} = \frac{u}{pd}\, f\!\left(\frac{px}{pd}, \frac{d}{r}\right). \qquad (2)

the breakdown criterion, i.e. the self-sustaining condition for the townsend discharge, can be expressed as

\gamma \left[ \exp\!\left( \int_0^{d} \alpha(x)\, dx \right) - 1 \right] = 1 \quad \text{or} \quad \int_0^{d} \alpha(x)\, dx = \ln\!\left( 1 + \frac{1}{\gamma} \right), \qquad (3)

where γ is the coefficient of secondary electron emission from the cathode by ion bombardment and α is the electron impact ionisation coefficient, which is a function of the reduced field e/p, i.e.

\alpha = a\, p \exp\!\left( \frac{-b}{e/p} \right), \qquad (4)

where a and b are constants. by substituting (2) into (4), and then into (3), we obtain

a \int_0^{pd} \exp\!\left[ \frac{-b}{\frac{u_b}{pd}\, f\!\left(\frac{px}{pd}, \frac{d}{r}\right)} \right] d(px) = \ln\!\left( 1 + \frac{1}{\gamma} \right).

now, dividing both sides by a and substituting px = y,

\int_0^{pd} \exp\!\left[ \frac{-b\, pd}{u_b\, f\!\left(\frac{y}{pd}, \frac{d}{r}\right)} \right] dy = \frac{\ln\!\left( 1 + \frac{1}{\gamma} \right)}{a};

writing the left-hand side as F(u_b, pd, d/r), this becomes

F\!\left( u_b, pd, \frac{d}{r} \right) = \frac{\ln\!\left( 1 + \frac{1}{\gamma} \right)}{a}. \qquad (5)

figure 4: v−i characteristic curves (electrode voltage in v versus discharge current in ma) for different gaps (gaps 2, 5 and 8).

equation (5) shows, theoretically, that the breakdown voltage is a function of pd and d/r. the same result is also observed in our experiments. from eq. (5) we see that, for any two gas gaps, if p1d1 = p2d2 and d1/r1 = d2/r2, the breakdown voltage for these two gaps will be the same, i.e. ub1 = ub2.
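for the uniform-field limit (d/r → 0, f ≡ 1) the self-sustain condition of eq. (3) together with the townsend form of eq. (4) can be solved in closed form, giving the classical paschen expression ub = b·pd / ln[a·pd / ln(1 + 1/γ)]. the short sketch below evaluates that expression and checks the similarity statement ub1 = ub2 when p1d1 = p2d2 (e.g. gaps 2 and 3 of table 1 have equal pd). the coefficients a, b and γ used here are generic placeholder values for illustration only; they are not fitted to the argon data of this paper.

```python
import numpy as np

def paschen_breakdown_voltage(pd, A, B, gamma):
    """Uniform-field breakdown voltage from alpha = A*p*exp(-B/(E/p))
    and the self-sustain condition: the integral of alpha over the gap
    equals ln(1 + 1/gamma).  pd in torr*cm, A in 1/(torr*cm),
    B in V/(torr*cm); returns volts (nan below the sustainable minimum)."""
    pd = np.asarray(pd, dtype=float)
    denom = np.log(A * pd / np.log(1.0 + 1.0 / gamma))
    return np.where(denom > 0.0, B * pd / denom, np.nan)

# placeholder coefficients, order-of-magnitude only (not fitted to argon data)
A, B, gamma = 12.0, 180.0, 0.01
for pd in (1.0, 5.0, 20.0):
    ub = float(paschen_breakdown_voltage(pd, A, B, gamma))
    print(f"pd = {pd:5.1f} torr cm  ->  ub = {ub:6.1f} V")

# similarity check: gaps with equal pd (and equal d/r) give the same ub
ub_a = paschen_breakdown_voltage(0.25 * 40.0, A, B, gamma)  # pd like gap 2
ub_b = paschen_breakdown_voltage(0.40 * 25.0, A, B, gamma)  # pd like gap 3
print(bool(np.isclose(ub_a, ub_b)))
```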
substituting these three equalities (p1d1 = p2d2, d1/r1 = d2/r2, ub1 = ub2) into eq. (2), we see that the reduced field e/p in these two gaps at the corresponding points p1x1 = p2x2 will be equal, i.e. e1(p1x1)/p1 = e2(p2x2)/p2. here, ub = f(pd, d/r) is also a special case of the similarity theorem, with non-uniform electric fields between plane-parallel electrodes [39, 40], and extends paschen's law to this special case. it should be noted that ub = f(pd, d/r) also applies to the uniform electric field, where d/r → 0, and then reduces to paschen's law.

figure 5: v−i characteristic curves (electrode voltage in v versus discharge current in ma) for similar gaps (gaps 1, 2 and 3).

v voltage-current characteristic curves of the similar gaps

the voltage-current (v−i) characteristics of dc glow discharge plasma can be obtained either by gradually increasing the external voltage or by lowering the external resistance [3, 4, 28, 37, 41–47]. a high external resistance can be introduced to limit the amount of discharge current produced [47]. the operating region of the glow discharge can be identified by studying the voltage-current characteristics. the nonlinear nature of glow discharge plasma can be analysed by studying the v−i characteristic [48, 49]. moreover, the v−i characteristic is the primary step that enables us to find out whether two discharge gaps are similar or not. the distance between the electrodes (d), the electrode radius (r), the gas pressure (p) and the external resistance (r) were kept fixed, and the applied voltage (va) was varied in equal steps over a wide voltage range. a high resistance was introduced into the circuit to limit the amount of current produced. for each applied voltage, the corresponding voltage across the resistor (vr) was measured. the discharge current (i = vr/r) and electrode voltage (v = va − vr) were calculated at each step and the forward characteristics obtained. after reaching the maximum voltage, the voltage was reduced in equal steps as before, the discharge current and electrode voltage were calculated, and the reverse characteristics obtained. placing the discharge current (i) on the x-axis and the electrode voltage (v) on the y-axis gives the typical v−i characteristics, as shown in fig. 4 and fig. 5. any two gaps arranged to fulfil the relationships p1d1 = p2d2, p1r1 = p2r2 and e1/p1 = e2/p2 are said to be similar [16]. for such gaps, in addition to the proportionality of the physical quantities mentioned above, the voltage-current characteristics are approximately the same. experimentally, the validity of this similarity law for the v−i characteristics of a large discharge tube is verified here for three discharge gaps satisfying the above similarity relations. the external resistance chosen in all three cases is 10 kΩ. in a physical system, the occurrence of hysteresis refers to the parametric dependence of a state on its history. hysteresis is a clear sign of nonlinearity in the system [43, 47]. the jump phenomenon and hysteresis in the discharge current are well known phenomena in gas discharge, and are due to the variation in discharge voltage [43].

figure 6: v−i characteristic and hysteresis (forward and reverse branches) for different gaps (gaps 2, 5 and 8).
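the data reduction described above (discharge current i = vr/r, electrode voltage v = va − vr) is straightforward to script; the sketch below applies it to an invented forward sweep with a 10 kΩ ballast resistor so that a jump in current shows up in the reduced data. the numbers are placeholders, not measured values from this experiment.

```python
import numpy as np

def reduce_vi(applied_voltage, resistor_voltage, r_ohm):
    """Convert applied voltage and ballast-resistor voltage into
    discharge current (A) and electrode voltage (V)."""
    va = np.asarray(applied_voltage, dtype=float)
    vr = np.asarray(resistor_voltage, dtype=float)
    return vr / r_ohm, va - vr

# invented forward sweep (V); the step between 500 and 600 V mimics the jump
va = np.array([400.0, 500.0, 600.0, 700.0, 800.0, 900.0])
vr = np.array([0.0, 2.0, 30.0, 60.0, 90.0, 120.0])
current_a, v_electrode = reduce_vi(va, vr, r_ohm=10_000.0)
print("discharge current (ma):", current_a * 1e3)
print("electrode voltage (v):", v_electrode)
```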
a gradual increase in discharge voltage causes a sudden increase in discharge current; this is called the jump phenomenon. beyond the jump, the current increases gradually with increasing voltage, confirming operation of the glow discharge plasma in the abnormal region [4–6]. after reaching an applied voltage of 900 v, the voltage is decreased in steps of 10 v. the characteristic curve does not retrace the forward path: the current discharged in the reverse direction is smaller, the current lags behind the voltage, and hysteresis is observed; the jump phenomenon can also be observed in the reverse direction. the v−i characteristic and hysteresis for gaps having different d/r are shown in fig. 6. figures 7, 8 and 9 show the v−i characteristic and hysteresis for similar gaps and confirm the similarity experimentally. when p1d1 = p2d2 = p3d3, the total number of collisions undergone by one electron to cross the gap is the same for the three gaps. the electric field in a gap between two plane-parallel electrodes can be obtained from the ratio d/r; the distribution of the electric field is a function of the d/r ratio. with the d/r ratio held constant, the e/p ratio for the three gaps is found to be almost the same. the e/p ratio signifies the energy gained by an electron between two consecutive collisions [23]. by fixing the parameters e/p and pd, the electron multiplication rate of the gaps becomes fixed [23]. the rate of electron multiplication determines the rate of ionization, which in turn determines the discharge current [23]. for the same applied voltage across the three gaps, the discharge currents produced are therefore nearly equal, since the ionization rate due to electron multiplication is the same. the three curves overlap, and the occurrence of forbidden processes in these gaps can be ruled out.

figure 7: v−i characteristic and hysteresis (forward and reverse branches) for similar gaps with d/r = 10 (gaps 1, 2 and 3).

figure 8: v−i characteristic and hysteresis (forward and reverse branches) for similar gaps with d/r = 5 (gaps 4, 5 and 6).

figure 9: v−i characteristic and hysteresis (forward and reverse branches) for similar gaps with d/r = 1 (gaps 7, 8 and 9).
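the statement that the curves for similar gaps overlap can be quantified by interpolating each measured curve onto a common current grid and computing, for example, a root-mean-square voltage difference. the following minimal sketch does exactly that for two invented forward branches; it is only one way of putting a number on "approximately the same", not a procedure taken from this paper.

```python
import numpy as np

def rms_voltage_difference(i1, v1, i2, v2, n=50):
    """RMS difference between two V-I curves, compared over the
    overlapping part of their current ranges."""
    i1, v1 = np.asarray(i1, float), np.asarray(v1, float)
    i2, v2 = np.asarray(i2, float), np.asarray(v2, float)
    lo, hi = max(i1.min(), i2.min()), min(i1.max(), i2.max())
    grid = np.linspace(lo, hi, n)
    diff = np.interp(grid, i1, v1) - np.interp(grid, i2, v2)
    return float(np.sqrt(np.mean(diff ** 2)))

# invented forward branches (current in mA, voltage in V) for two "similar" gaps
i_a, v_a = [1, 3, 6, 10, 14], [610, 640, 680, 720, 760]
i_b, v_b = [1, 4, 8, 12, 15], [615, 650, 695, 730, 765]
print("rms difference:", rms_voltage_difference(i_a, v_a, i_b, v_b), "v")
```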
vi conclusions

as a special case, ub = f(pd, d/r) gives the breakdown voltage of low-pressure gaps between two plane-parallel electrodes with a non-uniform electric field, and is connected with the similarity theorem of gas discharge. similar glow discharges were observed only in argon gaps with a limited scaled-down factor k; forbidden processes, such as inelastic collisions of the second kind and stepwise ionisation, tend to violate the similarity of the discharge as k increases [2, 10]. from the experiments we observe a clear cathode fall layer, a positive column between the electrodes, and a negative glow zone. these findings indicate that the discharge is a typical glow discharge [6, 37]. the comparison of discharge physical parameters between the scaled-down gap and the prototype gap enables us to find the proportional relations derived from the similarity law. approximately the same voltage-current characteristic curves are also obtained for the two similar gaps. studies of dc glow discharge at low pressure have been carried out for more than 100 years, and the mechanism of the discharge is well studied. dc glow discharge has many applications, but some issues remain unsolved. the glow discharge cleaning of the international thermonuclear experimental reactor (iter) is considered one of these unsolved problems. in the case of iter, a huge tokamak device, the fusion reaction takes place inside a toroidal chamber. the fusion reaction must be stopped after a period of operation, and once stopped the inner wall of the toroid needs to be cleaned with dc glow discharge plasma. this cleaning involves inserting small electrodes that function as the anodes for the glow discharge on the inner wall. the inner wall serves as the cathode for the glow discharge because it is electrically isolated from the small anodes. the question before the designers of iter is whether the dc glow discharge plasma produced by small anodes can uniformly cover the huge wall of the toroid. unfortunately, it is not possible to perform the full-scale experiment at present. in this paper, we tried to answer the iter designers' question using a scaled-down experiment, and investigated whether the glow discharge plasma produced by small anodes can uniformly cover the wall of the scaled-down chamber or not. the affirmative answer obtained from the scaled-down experiment can be extrapolated to iter.

acknowledgements

the experimental work was carried out in the plasma laboratory at st. berchmans college, mahatma gandhi university, kerala, india, set up under a project of the board of research in fusion science & technology (brfst), india. the authors express their gratitude to the institute for plasma research (ipr), gandhinagar, gujarat, india, for their support. [1] v a lisovsky, s d yakovin, scaling law for a low-pressure gas breakdown in a homogeneous dc electric field, jetp lett. 72, 34 (2000). [2] h luo, x wang, y fu, s yang, x zou, similarity of gas discharge in low-pressure argon gaps between two plane-parallel electrodes, high volt. 1, 86 (2016). [3] j r lucas, breakdown of gaseous insulation, in: high voltage engineering, pag. 1-21, katson books, sri lanka (2001). [4] b t chiad, t l al-zubaidi, m k khalaf, a i khudiar, characterization of low pressure plasma-dc glow discharges (ar, sf6 and sf6/he) for si etching, indian j. pure appl. phys. 48, 723 (2010). [5] j t gudmundsson, a hecimovic, foundations of dc plasma sources, plasma sources sci. tech. 26, 123001 (2017). [6] a a garamoon, a samir, f f elakshar, e f kotp, electrical characteristics of a dc glow discharge, plasma sources sci. tech. 12, 417 (2003). [7] a m loveless, a l garner, a universal theory for a gas breakdown from microscale to the classical paschen law, phys. plasmas 24, 113522 (2017). [8] p mathew, j george, t s mathews, p j kurian, experimental verification of modified paschen's law in dc glow discharge argon plasma, aip adv. 9, 025215 (2019). [9] m nurujjaman, a n s iyengar, realization of soc behaviour in a dc glow discharge plasma, phys. lett. a 360, 717 (2007).
papers in physics, vol. 14, art. 140007 (2022) received: 13 january 2022, accepted: 11 march 2022 edited by: k. daniels, l. a. pugnaloni, j. zhao reviewed by: r. stannarius, otto von guericke univ. magdeburg, germany licence: creative commons attribution 4.0 doi: http://doi.org/10.4279/pip.140007 www.papersinphysics.org issn 1852-4249

bespoke particle shapes in granular matter

d. cantor1∗, m. cárdenas-barrantes2,3†, l. orozco4‡

among granular matter, one type of particle has special properties. upon being assembled in disordered configurations, these particles interlock, hook, almost braid, and – surprisingly, considering their relatively low packing fractions – show exceptional shear strength. such is the case of non-convex particles. they have been used in the shapes of tetrapods, 'l', 'z', stars, and many others, to protect coasts or build self-standing structures requiring no binders or external supports. although these structures are often designed without a comprehensive mechanical characterization, they have already demonstrated great potential as highly resistant construction materials.
nevertheless, it is natural to attempt to find the most appropriate non-convex shapes for any given application. can a particle shape be tuned to obtain a desired mechanical behavior? although this question cannot be answered yet, current technological, simulation, and experimental developments strongly suggest that it can be resolved in the next decade. a clear understanding of the relationships between particle shapes, mechanical response, and packing properties will be key to providing insights into the behavior of these materials. such work should stand on 1) robust and general shape descriptors that encode the complexity of non-convex shapes (i.e., the number of arms, the symmetries and asymmetries of the bodies, the presence of holes, etc.), 2) the analysis of the response of assemblies under different loading conditions, and 3) the disposition and reliability of non-convex shapes to ensure durability. the manufacturing process and an efficient use of resources are additional elements that could further help to optimize particle shape. in the quest to design bespoke non-convex particles, this paper consolidates the challenges that remain unresolved. it also outlines some routes to explore based on the latest developments in technology and research.

∗ david.cantor@polymtl.ca † manuel-antonio.cardenas-barrantes@umontpellier.fr ‡ l.orozco@uliege.be
1 department of civil, geological and mining engineering, polytechnique montreal, montreal, qc, canada. 2 lmgc, cnrs, university of montpellier, france. 3 laboratoire de micromécanique et intégrité des structures (mist), um, cnrs, irsn, france. 4 grasp, cesam research unit, institute of physics b5a, university of liège, belgium.

i. introduction

have you observed tetrapod-like elements protecting coasts? or, perhaps, an assembly of rigid stars collectively building a structure? figure 1 shows complex arrangements of interlocked non-convex particles forming relatively loose structures with remarkable strength and dissipative properties. in many places around the world, concrete non-convex blocks are used for coastal protection due to their easy prefabrication and disposition methods [1–4] (see fig. 1(a)). these blocks create stable, loose configurations that allow water to enter the cavities and thus dissipate the kinetic energy of the waves. figure 1(b) shows a self-standing structure easily reaching a few meters high, requiring no external support or binding materials. these innovative applications expose the great potential non-convex particles possess as construction materials in which interlocking, braiding, or stacking elements can dramatically improve mechanical properties [5–11]. in addition, non-convex shapes such as 'z' [11] can also continuously gain shear strength, as strains do not necessarily trigger failure but rather promote particle rearrangement and further interlocking [12–14]. since the particles used for many of these applications are replicated from a single shape – or a few shapes – the fabrication processes and disposition of the pieces can be automated for construction in environments of difficult access or that require remote-controlled machinery/robots [15, 16]. it is evident that non-convex particles could vary infinitely in shape. in practice, there are branched, twisted, curved, asymmetric, and even convex shapes with hollowed faces creating truss structures [17–19]. it is natural, then, to inquire into the best shape for a given application.
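to fix ideas about how such shape families can be encoded numerically (a point taken up under challenge one below), here is a minimal sketch. it is our own illustration and not a method from the cited references: it generates star-like non-convex outlines from a radius function with a single harmonic and computes two elementary descriptors, a convexity ratio (outline area over convex-hull area) and the leading fourier amplitudes of the radius signature. all function names and parameter values are ours.

```python
import numpy as np
from scipy.spatial import ConvexHull

def star_outline(n_arms=5, arm_depth=0.6, n_pts=720):
    """radius signature r(theta) = 1 + arm_depth*cos(n_arms*theta): a simple non-convex 'star'."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_pts, endpoint=False)
    r = 1.0 + arm_depth * np.cos(n_arms * theta)
    return r, np.column_stack((r * np.cos(theta), r * np.sin(theta)))

def polygon_area(xy):
    """shoelace formula for the area enclosed by the outline."""
    x, y = xy[:, 0], xy[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def descriptors(r, xy, n_harmonics=8):
    convexity = polygon_area(xy) / ConvexHull(xy).volume  # ConvexHull.volume is the 2D hull area
    coeffs = np.fft.rfft(r) / len(r)                      # fourier signature of the radius function
    return convexity, np.abs(coeffs[1:n_harmonics + 1])

r, xy = star_outline(n_arms=5, arm_depth=0.6)
conv, four = descriptors(r, xy)
print(f"convexity ratio = {conv:.3f}")        # 1.0 for convex shapes, < 1 for stars
print("leading fourier amplitudes:", np.round(four, 3))
```

even this toy example shows the trade-off discussed below: the convexity ratio is compact but blind to the number of arms, while the fourier signature captures the arm count at the cost of a longer list of parameters.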
although intuition may suggest that assemblies of 'z' or 'l' shaped particles can develop higher strengths, there are no systematic studies allowing one to determine which non-convex shape performs better than another. so far, studies have used different particle shapes and tested various characterization approaches that produce no comparable results. while particle shape is central to designing tailored/bespoke granular materials, other aspects such as deposition or assembly methods, particle strength, and constructive processes should also be considered to optimize the particle geometry for a given application. in the following, we present the challenges that, in our opinion, can be addressed in the near future to make bespoke granular matter the next generation of construction materials.

ii. challenges

i. how is particle geometry described?

many shape descriptors found in the literature are focused on describing convex shapes (i.e., shapes without cavities) by means of their sphericity, aspect ratio, angularity, flatness, elongation, roundness, and irregularity [22–28]. these parameters often fail to describe the complexity of non-convex shapes or do not allow one to have a straightforward picture of the particle. as an alternative, fourier descriptors have been used to represent non-convex shapes [29, 30]. however, this method may need a large set of parameters when dealing with asymmetric or highly irregular geometries. in general, there is no consensus on a parameter or a set of parameters to describe the complexity of non-convex bodies. the first challenge is the development of simple and robust geometrical descriptors, allowing one to compare different families of particle shapes. this descriptor or set of descriptors should be able to encode as much information as possible regarding the characteristics of non-convex shapes, such as the number of arms, symmetry or asymmetry of the body, the presence of holes, and recurrent patterns, among others. furthermore, any descriptor should be sufficiently clear such that its definition maps directly onto basic or elemental parameters. once the descriptors are set, efforts should be focused on systematically linking them to the mechanical behavior (compaction, shear strength, rheology of quasi-static and inertial flows) and packing properties of particle assemblies. for this, experiments and simulations are valuable tools that should be developed to work in synergy and mutually enrich each other's observations.

ii. physical experiments

in experiments, for example, the technology of 3d printing has allowed one to precisely control the shape of particles to be later assembled in structures [31]. although some studies have performed qualitative characterization of the resistance of non-convex particle assemblies [5, 9, 10, 32], only a few have quantitatively assessed their mechanical response and packing properties [11, 33–36]. the current technological capabilities suggest that hundreds or even thousands of particles can be built and tested in standard devices (triaxial cells, shear cells, rheometers, etc.). however, to the best of our knowledge, this has not been done. besides, it is unclear how to scale up the obtained properties from the laboratory scale to large-size applications such as coastal barriers. the second challenge is related to the fabrication
of non-convex shapes and the mechanical characterization of assemblies either in standard equipment or in non-conventional devices (e.g., large-scale shear boxes [37, 38]).

figure 1: examples of structures built using non-convex particles: a) coastal barrier (public image by pok rie [20]) and b) aggregate wall (with permission of karola dierichs [21]).

up to now, the physical characterization we have mentioned refers to macroscopic mechanical or packing properties. however, it is well known that such responses are related to the mechanics at the scale of the particles and their contacts [39–41]. recently, several efforts have been put into characterizing the mechanics at such small scales. for instance, novel fast digital image analysis allows one to track particle positions under load [42–45]. among these approaches, it is worth mentioning the use of photo-elastic particles, which not only reveal the contact network but also allow one to estimate the force and stress intensities within the particles [46–49]. despite the advances in these experimental approaches, their use is seldom seen when it comes to non-convex particle assemblies. the third challenge corresponds to exploring the microstructure and force transmission mechanisms within assemblies of non-convex bodies. this requires the development of robust and efficient digital image analyses capable of dealing with complex entangled particles that are often ambiguous and difficult to tell apart.

iii. numerical simulations

an alternative means to probe the packing properties and load distribution within a granular sample is numerical simulation. to simulate discrete matter, some of the most popular approaches are the discrete-element method (dem) [50–52] and the material point method (mpm) [53–55]. in particular, dem strategies have proven to be quite advantageous since they provide a detailed description of the micromechanics of granular materials (e.g., particle connectivity, fabric anisotropy, contact force network, etc.) while being capable of dealing with collections of rigid [56, 57] and deformable bodies [58–60] of varied sizes and shapes [61, 62], under a large variety of boundary conditions. even though the simulation of non-convex granular assemblies is in rapid development today [17, 63–70], these works also expose some of the limitations of the current modeling approaches. first, the shape of the bodies is sometimes represented using clumped spheres [71], sphero-polyhedra [72, 73] or superquadrics [74]. although these strategies allow one to use well-known methods for convex bodies, the discretization of the shapes may add artificial textures on the surfaces, or they can be rapidly limited when pointy geometries or sharp edges need to be considered. other strategies that discretize bodies using multiple vertices and faces to represent the particles often require excessive storage space and expensive i/o operations that penalize the number of particles that can be considered. secondly, contact detection strategies require further optimization and are computationally prohibitive when dealing, for instance, with elongated or long-armed bodies. since contact detection is often based on the overlapping of spheres enveloping the bodies, such approaches can largely overestimate potential contacts in the case of non-convex particles, which can dramatically slow down the time-stepping evolution of a simulation.
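the cost of sphere-based broad-phase detection mentioned above can be illustrated with a small sketch (ours, not taken from any of the cited codes): each 'l'-shaped particle is represented as a clump of sub-spheres, candidate pairs are selected by the overlap of a single enveloping sphere per particle, and candidates are then checked sub-sphere by sub-sphere. for elongated non-convex bodies the enveloping spheres are much larger than the material they contain, so the broad phase typically reports far more candidate pairs than there are true contacts. all names and parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def l_shaped_clump(center, angle, arm=3, radius=0.5):
    """a non-convex 'L' built from overlapping sub-spheres (a common clump representation)."""
    local = [(float(i), 0.0) for i in range(arm)] + [(0.0, float(j)) for j in range(1, arm)]
    c, s = np.cos(angle), np.sin(angle)
    pts = np.array([(c * x - s * y, s * x + c * y) for x, y in local]) * (2 * radius)
    return pts + center, radius

def bounding_sphere(clump):
    """single enveloping sphere used in the broad phase."""
    centers, r = clump
    mid = centers.mean(axis=0)
    return mid, np.max(np.linalg.norm(centers - mid, axis=1)) + r

particles = [l_shaped_clump(rng.uniform(0, 20, 2), rng.uniform(0, np.pi)) for _ in range(60)]
bounds = [bounding_sphere(p) for p in particles]

broad, true_contacts = 0, 0
for i in range(len(particles)):
    for j in range(i + 1, len(particles)):
        (ci, Ri), (cj, Rj) = bounds[i], bounds[j]
        if np.linalg.norm(ci - cj) > Ri + Rj:
            continue                        # broad phase: enveloping spheres do not overlap
        broad += 1
        pi, ri = particles[i]
        pj, rj = particles[j]
        d = np.linalg.norm(pi[:, None, :] - pj[None, :, :], axis=2)
        if np.any(d < ri + rj):             # narrow phase: some sub-sphere pair actually touches
            true_contacts += 1

print(f"broad-phase candidate pairs: {broad}, true contacts: {true_contacts}")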
finally, it remains difficult to draw clear comparisons between studies given the broad spectrum of particle shapes considered and, once again, the lack of consensus on the parameters used to describe non-convex geometries. the fourth challenge is to develop new algorithms that provide an accurate representation of non-convex shapes while facilitating or optimizing contact detection. these developments should be conceived as scalable, highly parallel algorithms that could be used in cluster computing (e.g., hpc and gpu computing). equally important will be validating the well-known micromechanical analyses of granular materials for any particle shape by means of simulation campaigns over a broad range of parameters. the advances along the experimental and simulation axes will naturally lead to the validation of the numerical models, which can then be extended to more complex particle shapes and boundary conditions. extensive systematic simulation studies, in which one can clearly control the physical and numerical parameters, can thus provide further insights into the mechanical behavior of granular matter composed of non-convex bodies.

iii. tuning particle shape for applications

while theoretical, experimental, and numerical approaches are valuable tools to explore the mechanics of granular media, the applications or industrial stakes may determine the pertinent features that particles and assemblies should exhibit. as mentioned before, coastal protection structures have consisted of various shapes, including tetrapods, tripods, cubes, dolosse, etc. even though all of them dissipate energy and contribute to erosion prevention, it is not clear which features led engineers to prefer one shape over others. the fifth challenge is to develop methodologies to tune particle shape for a specific application, based on a thorough geometrical and mechanical characterization of non-convex shapes. although the mechanical response of the granular assemblies can often be the key parameter for choosing a specific particle shape, other elements accounting for the durability and reliability of the structures should also be considered. for instance, additional analyses may include the breakage strength of the particles, the settlements over time due to particle rearrangement, the capabilities for particle manufacturing, and the initial conditions of the arrangements (e.g., deposition, preloading, etc.). more recently, the scarcity of materials due to supply chain unreliability and the need to reduce ecological footprints are issues that call for the reusability of the individual bodies, or for their being manufactured from local materials. this last challenge is indeed broader and may involve the science of new composite materials from renewable sources. nevertheless, the task may largely benefit from advances in the mechanics of granular media made of bodies of varied shapes.

iv. summary and perspectives

arrangements of non-convex particles often present outstanding properties regarding their shear strength, mainly due to their capacity to hook, interlock, and entangle. these materials also present low packing fractions, which favor efficient energy dissipation, as in the case of coastal barriers. despite the multiple advantages non-convex particle assemblies can display, their mechanical behavior is still poorly understood. indeed, there are no methodologies to determine which shapes provide more strength than others, and it is not clear how to optimize particle shape for a given application.
to correctly match non-convex particle shapes to applications, a series of challenges need to be overcome. they include 1) the development of robust yet simple geometrical descriptors, allowing one to compare different shapes; 2) the generation of particle assemblies for systematic studies of their mechanical properties with experiments and numerical simulations. the synergy between experiments and numerical modeling will be key to rapidly gathering insights into the mechanical behavior of these materials; 3) the micro-mechanical analyses needed to understand the particularities of these materials that lead to their exceptional macroscopic properties; and finally, 4) the identification of additional elements that limit the range of possibilities for particle shape or allow one to optimize the mechanical properties for a given application. state-of-the-art experiments and simulations suggest that these challenges can be successfully addressed in the near future. in terms of experiments, x-ray tomography, 3d printing, and digital image analyses are promising tools and strategies largely used for convex grains, awaiting generalization to any particle shape. regarding numerical modeling, discrete-element approaches seem to present a privileged framework for exploring the macro- and microscopic responses of assemblies of complex-shaped bodies. however, this task requires several improvements in the algorithms' efficiency and their scalability to highly parallelized environments. although artificial intelligence (ai) is only just making its debut in the field of granular materials [75, 76], it is going to be an essential tool for the optimization of particle shapes under a set of constraints. these ai tools will benefit from the experimental and numerical axes of research and can shed light on more fundamental aspects concerning a unified geometrical descriptor for non-convex grains and the physics of granular media. optimizing particle shapes in granular matter will also benefit from multidisciplinary contributions including mathematics, statistics, computer science, and chemistry, among others. although we focused our exposition on civil structures, this is a topic spanning different fields of material technology, engineering, architecture, bio-inspired materials, etc. undoubtedly, materials composed of non-convex particles may be the next generation of building materials for optimized structures based on the idea of tailored granular matter or bespoke particle shapes.

acknowledgements

the authors would like to acknowledge fruitful discussions with jonathan barés and emilien azéma.

[1] p danel, tetrapods, coast. eng. proc. 1, 28 (1953). [2] m muttray, b reedijk, design of concrete armour layers, hansa int. maritime j. 6, 111 (2009). [3] j r medina, j molines, m e gómez-martín, influence of armour porosity on the hydraulic stability of cube armour layers, ocean eng. 88, 289 (2014). [4] j molines, r centi, m di risio, j r medina, estimation of layer coefficients of cubipod homogeneous low-crested structures using physical and numerical model placement tests, coast. eng. 168, 103901 (2021). [5] o tessmann, topological interlocking assemblies, proc. 30th int. conf. ecaade 2, 201 (2012). [6] s v franklin, geometric cohesion in granular materials, phys. today 65, 70 (2012). [7] n gravish, s v franklin, d l hu, d i goldman, entangled granular media, phys. rev. lett. 108, 208001 (2012).
[8] k dierichs, a menges, aggregate architecture: simulation models for synthetic non-convex granulates, proc. 33rd annual conf. acadia, 301 (2013). [9] k dierichs, a menges, towards an aggregate architecture: designed granular systems as programmable matter in architecture, granul. matter 18, 1 (2016). [10] y zhao, k liu, m zheng, j barés, k dierichs, a menges, r behringer, packings of 3d stars: stability and structure, granul. matter 18, 1 (2016). [11] k a murphy, n reiser, d choksy, c e singer, h m jaeger, freestanding loadbearing structures with z-shaped particles, granul. matter 18, 1 (2016). [12] d dumont, m houze, p rambach, t salez, s patinet, p damman, emergent strain stiffening in interlocked granular chains, phys. rev. lett. 120, 088001 (2018). [13] a hafez, q liu, t finkbeiner, r a alouhali, t e moellendick, j c santamarina, the effect of particle shape on discharge and clogging, sci. rep. 11, 1 (2021). [14] y zhao, j barés, j e s socolar, yielding, rigidity, and tensile stress in sheared columns of hexapod granules, phys. rev. e 101, 062903 (2020). [15] k dierichs, o kyjánek, m loučka, a menges, construction robotics for designed granular materials: in situ construction with designed granular materials at full architectural scale using a cable-driven parallel robot, constr. robotics 3, 41 (2019). [16] e p g bruun, r pastrana, v paris, a beghini, a pizzigoni, s parascho, s adriaenssens, three cooperative robotic fabrication methods for the scaffold-free construction of a masonry arch, autom. constr. 129, 103803 (2021). [17] a g athanassiadis, m z miskin, p kaplan, n rodenberg, s h lee, j merritt, e brown, j amend, h lipson, h m jaeger, particle shape effects on the stress response of granular packings, soft matter 10, 48 (2014). [18] c avendaño, f a escobedo, packing, entropic patchiness, and self-assembly of non-convex colloidal particles: a simulation perspective, curr. opin. colloid in. 30, 62 (2017). [19] y wang, l li, d hofmann, j e andrade, c daraio, structured fabrics with tunable mechanical properties, nature 596, 238 (2021). [20] aerial view of breakwater, pok rie, marang, malaysia.
[21] icd aggregate wall 2017, institute for computational design and construction (icd), university of stuttgart (2017). [22] hakon wadell, volume, shape, and roundness of rock particles, j. geol. 40, 443 (1932). [23] w c krumbein, measurement and geological significance of shape and roundness of sedimentary particles, j. sediment. res. 11, 64 (1941). [24] g lees, a new method for determining the angularity of particles, sedimentology 3, 2 (1964). [25] s j blott, k pye, particle shape: a review and new methods of characterization and classification, sedimentology 55, 31 (2008). [26] c r i clayton, c o r abbireddy, r schiebel, a method of estimating the form of coarse particulates, géotechnique 59, 493 (2009). [27] g h bagheri, c bonadonna, i manzella, p vonlanthen, on the characterization of size and shape of irregular particles. powder technol. 270 a, 141 (2015). [28] m a maroof, a mahboubi, a noorzad, yaser safi, a new approach to particle shape classification of granular materials, transp. geotech. 22, 100296 (2020). [29] e t bowman, k soga, w drummond, particle shape characterisation using fourier descriptor analysis, géotechnique 51, 545 (2001). [30] g mollon, j zhao, 3d generation of realistic granular samples based on random fields theory and fourier shape descriptors, comput. method. appl. mech. eng. 279, 46 (2014). [31] d a h hanaor, y gan, m revay, d w airey, i einav, 3d printable geomaterials, gotechnique 66, 323 (2016). [32] h zheng, d wang, j barés, r behringer, jamming by compressing a system of granular crosses, epj web conf. 140, 06014 (2017). [33] d p huet, m jalaal, r van beek, d van der meer, a wachs, granular avalanches of entangled rigid particles, phys. rev. fluids 6, 104304 (2021). [34] j landauer, m kuhn, d s nasato, p foerst, h briesen, particle shape matters using 3d printed particles to investigate fundamental particle and packing properties, powder technol. 361, 711 (2020). [35] n weiner, y bhosale, m gazzola, h king, mechanics of randomly packed filaments the ”bird nest” as meta-material, j. appl. phys. 127, 050902 (2020). 
[36] r stannarius, j schulze, on regular and random two-dimensional packing of crosses, granul. matter 24, 25 (2022). [37] c ovalle, e frossard, c dano, w hu, s maiolino, p-y hicher, the effect of size on the strength of coarse rock aggregates and large rockfill samples through experimental data, acta mech. 225, 2199 (2014). [38] s linero-molina, l bradfield, s g fityus, j v simmons, a lizcano, design of a 720-mm square direct shear box and investigation of the impact of boundary conditions on large-scale measured strength, geotech. test. j. 43, 1463 (2020). [39] l rothenburg, r j bathurst, analytical study of induced anisotropy in idealized granular material, géotechnique 39, 601 (1989). [40] b andreotti, y forterre, o pouliquen, granular media: between fluid and solid, cambridge university press, new york (2013). [41] j c santamarina, g c cho, soil behaviour: the role of particle shape, proc. adv. geotech. eng.: the skempton conference, 604 (2004). [42] e andò, s a hall, g viggiani, j desrues, p bésuelle, grain-scale experimental investigation of localised deformation in sand: a discrete particle tracking approach, acta geotech. 7, 1 (2012). [43] c r k windows-yule, t weinhart, d j parker, a r thornton, effects of packing density on the segregative behaviors of granular systems, phys. rev. lett. 112, 098001 (2014). [44] e e ehrichs, h m jaeger, g s karczmar, j b knight, v y kuperman, s r nagel, granular convection observed by magnetic resonance imaging, science 267, 1632 (1995). [45] s s shirsath, j t padding, h j h clercx, j a m kuipers, cross-validation of 3d particle tracking velocimetry for the study of granular flows down rotating chutes, chem. eng. sci.
134, 312 (2015). [46] d muir wood, d leśniewska, stresses in granular materials, granul. matter 13, 395 (2011). [47] r hurley, e marteau, g ravichandran, j e andrade, extracting inter-particle forces in opaque granular materials: beyond photoelasticity, j. mech. phys. solids 63, 154 (2014). [48] k e daniels, j e kollmer, j g puckett, photoelastic force measurements in granular materials, rev. sci. instrum. 88, 051808 (2017). [49] a a zadeh, j barés, t a brzinski, k e daniels, et al., enlightening force chains: a review of photoelasticimetry in granular matter, granul. matter 21, 1 (2019). [50] p a cundall, o d l strack, a discrete numerical model for granular assemblies, géotechnique 29, 47 (1979). [51] m jean, j-j moreau, unilaterality and dry friction in the dynamics of rigid body collections, proc. 1st contact mech. int. symp., 31 (1992). [52] f dubois, v acary, m jean, the contact dynamics method: a nonsmooth story, c. r. mécanique 346, 247 (2018). [53] d sulsky, z chen, h l schreyer, a particle method for history-dependent materials, comput. method. appl. mech. eng. 118, 179 (1994). [54] s g bardenhagen, j u brackbill, d sulsky, the material-point method for granular materials, comput. method. appl. mech. eng. 187, 529 (2000). [55] k soga, e alonso, a yerro, k kumar, s bandara, trends in large-deformation analysis of landslide mass movements with particular emphasis on the material point method, géotechnique 66, 248 (2016). [56] l f orozco, j-y delenne, p sornay, f radjai, rheology and scaling behavior of cascading granular flows in rotating drums, j. rheol. 64, 915 (2020). [57] y huillca, m silva, c ovalle, j c quezada, s carrasco, g e villavicencio, modelling size effect on rock aggregates strength using a dem bonded-cell model, acta geotech. 16, 699 (2021). [58] t-l vu, j barés, s mora, s nezamabadi, numerical simulations of the compaction of assemblies of rubberlike particles: a quantitative comparison with experiments, phys. rev. e 99, 062903 (2019).
[59] d cantor, m cárdenas-barrantes, i preechawuttipong, m renouf, e azéma, compaction model for highly deformable particle assemblies, phys. rev. lett. 124, 208003 (2020). [60] m cárdenas-barrantes, d cantor, j barés, m renouf, emilien azéma, three-dimensional compaction of soft granular packings, soft matter 18, 312 (2022). [61] c voivret, f radjäı, j-y delenne, m s el youssoufi, multiscale force networks in highly polydisperse granular media, phys. rev. lett. 102, 178001 (2009). [62] d cantor, e azéma, i preechawuttipong, microstructural analysis of sheared polydisperse polyhedral grains, phys. rev. e 101, 062901 (2020). [63] a d rakotonirina, j-y delenne, f radjai, a wachs, grains3d, a flexible dem approach for particles of arbitrary convex shape part iii: extension to non-convex particles modelled as glued convex particles, comp. part. mech. 6, 55 (2019). [64] i malinouskaya, v v mourzenko, j-f thovert, p m adler, random packings of spiky particles: geometry and transport properties, phys. rev. e 80, 011304 (2009). [65] l meng, x yao, x zhang, two-dimensional densely ordered packings of non-convex bending and assembled rods, particuology 50, 35 (2020). [66] f ludewig, n vandewalle, strong interlocking of nonconvex particles in random packings, phys. rev. e 85, 051307 (2012). [67] e azéma, f radjäı, b saint-cyr, j-y delenne, p sornay, rheology of three-dimensional packings of aggregates: microstructure and effects of nonconvexity, phys. rev. e 87, 052205 (2013). [68] j-p latham, j mindel, j xiang, r guises, x garcia, c pain, g gorman, m piggott, a munjiza, coupled femdem/fluids for coastal engineers with special reference to armour stability and breakage, geomech geoengin. 4, 39 (2009). [69] c f schreck, n xu, c s o’hern, a comparison of jamming behavior in systems composed of dimerand ellipse-shaped particles, soft matter 6, 2960 (2010). [70] t a marschall, s teitel, athermal shearing of frictionless cross-shaped particles of varying aspect ratio, granul. matter 22, 1 (2020). [71] n a conzelmann, a penn, m n partl, f j clemens, l d poulikakos, c r müller, link between packing morphology and the distribution of contact forces and stresses in packings of highly nonconvex particles, phys. rev. e 102, 062902 (2020). [72] f alonso-marroqúın, spheropolygons: a new method to simulate conservative and dissipative interactions between 2d complex-shaped rigid bodies, europhys. lett. 83, 14001 (2008). [73] s zhao, j zhao, a poly-superellipsoid-based approach on particle morphology for dem modeling of granular media, int. j. numer. anal. meth. geomech. 43, 2147 (2019). [74] s wang, d marmysh, s ji, construction of irregular particles with superquadric equation in dem, theor. app. mech. lett. 10, 68 (2020). [75] z cheng, j wang, estimation of contact forces of granular materials under uniaxial compression based on a machine learning model, granul. matter 24, 17 (2022). [76] g ma, j mei, k gao, j zhao, w zhou, d wang, machine learning bridges microslips and slip avalanches of sheared granular gouges, earth planet. sc. lett. 579, 117366 (2022). 
papers in physics, vol. 15, art. 150003 (2023) received: 06 february 2023, accepted: 05 may 2023 edited by: s. a. cannas licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.150003 www.papersinphysics.org issn 1852-4249

study of hysteresis in the ferromagnetic random field 3-state clock model in two and three dimensional periodic lattices at zero temperature and in the presence of dilution and an absorbing state

elisheba syiem1, r. s. kharwanlang1∗

we numerically study hysteresis in the ferromagnetic random field 3-state clock model in two and three dimensional periodic lattices at zero temperature and in the zero frequency limit of the driving field. the on-site quenched disorders are continuous and are drawn from a uniform distribution. we numerically analyze the effects of disorder on the dynamics of the model and hence on the shape of the hysteresis loops. we also study the model in the presence of dilution and an absorbing state.

i introduction

hysteresis in quenched disordered systems has been the subject of numerous research studies over the past years [1, 2]. these studies are crucial because they offer a wealth of insights into the rich field of phenomena associated with nonlinear and random systems, and they also have numerous practical applications, for example in the magnetic recording industry [3–5]. systems with quenched disorder often possess a large number of metastable states that are separated from each other by energy barriers much larger than the thermal energy. the barriers have a distribution of heights that depends strongly on the details of the system. the metastable states correspond to local minima in the free-energy landscape of the system. the system remains trapped in a local minimum for a long time and is unable to attain thermal equilibrium over the practical time scale of interest.
however, the system can be driven from one metastable state to another by the applied external field. as the field is varied continuously, the response of the system consists of irregular jumps arising from the non-uniformity of barrier heights. this gives rise to barkhausen noise. the non-equilibrium random-field ising model at zero temperature was proposed by sethna et al. [6] to study hysteresis and phase transitions in systems with quenched disorder. the disorder in their model is characterized by on-site quenched random fields having a gaussian distribution with mean value zero and standard deviation σ. the model is able not only to reproduce hysteresis loops that resemble experimentally observed loops but also to provide an understanding of other aspects associated with them, for example barkhausen noise, return point memory, etc. the model also exhibits a non-equilibrium critical point σ = σc, h = hc at which the barkhausen jumps become scale invariant. for σ < σc, there is a first order jump in the magnetization on each half of the hysteresis loop. as σ is increased, the size of the jump decreases continuously to zero at σc. in addition to this model, other spin models have also been formulated to study hysteresis and phase transitions in systems with quenched randomness [7–11]. the present work studies the zero temperature hysteresis in the random field 3-state clock model in the limit of zero frequency of the driving field. in our model the random field has a fixed magnitude, so we vary the ferromagnetic interaction j. we found that the response of the system to an applied field in the high-j limit (low disorder) consists of first order jumps in the magnetization. as j is decreased, the size of the jump decreases gradually to zero at a particular value j = jc and h = hc. the point (jc, hc) marks the existence of a non-equilibrium critical point of the model. in the present work we focus on the analysis of the shape of the hysteresis loops rather than on estimating the value of the critical point jc. the shape of the hysteresis loop has practical importance, as it relates directly to the dissipation of energy in the system. we also found that at very high values of j, spins flipped directly from the first state to the third state, and no spins flipped to the second state as the field was varied continuously from h = −∞ to h = ∞. in the present work we also study the effect on the hysteresis loop when dilution and an absorbing state are incorporated into the system. an absorbing state is a state in which the spin degrees of freedom of the system remain frozen over the practical time scale of interest [12, 13]. in the present work, the second state is considered as an absorbing state. as the field is increased gradually from a sufficiently large negative value, spins flipped from the first state to the second state or to the third state. the spins that had flipped to the second state remained frozen and could not leave that state throughout the entire journey of the applied field.

∗ regenel@nehu.ac.in
1 department of basic sciences and social sciences, north eastern hill university, shillong-22, india.

ii model

the model is defined by the hamiltonian

\[ \mathcal{H} = -J \sum_{\langle i,j \rangle} \vec{s}_i \cdot \vec{s}_j \;-\; \sum_i \vec{h}_i \cdot \vec{s}_i \;-\; \vec{h} \cdot \sum_i \vec{s}_i \qquad (1) \]

where s⃗i is a 2-component unit spin vector located at the site i. s⃗i can point along any of the three directions defined by the angles θ = θa, θ = θb and θ = θc that the spin vector makes with the +x-axis.
we set 2π/3 < θa ≤ π, π/3 < θb ≤ 2π/3, and 0 ≤ θc ≤ π/3, as shown in fig. 1. at each site i there is a 2-component quenched random field unit vector h⃗i. the vector h⃗i is assumed to be continuous and is defined by an angle αi (0 < αi < π) that it makes with the +x-axis.

figure 1: the figure depicts the spin states defined by the angles θa, θb and θc.

the summation in the first term is over the nearest neighbors of the spin s⃗i. the uniform external field h⃗, with |h⃗| = h, is applied along the x-axis. j is the ferromagnetic interaction between spins (j > 0). the first and the last terms in eq. (1) favor parallel alignment of spins, while the second term introduces disorder in the system, tending to align each spin with the direction of its random field h⃗i. writing eq. (1) in terms of the effective local field f⃗i acting at each site i,

\[ \mathcal{H} = -\sum_i \vec{f}_i(t)\cdot\vec{s}_i(t), \qquad \vec{f}_i(t) = J\sum_j \vec{s}_j(t) + \vec{h}_i + \vec{h} \qquad (2) \]

at zero temperature, the energy of a spin, and hence that of the entire system, is minimum when each spin points along the direction of the local field f⃗i at its site. since |s⃗i| = 1, the state of each spin in the lattice at any applied field h is wholly ascertained by the direction f̂i of the local field f⃗i. let $s_j^x$ and $s_j^y$ denote the components of the spin s⃗j along the x-axis and the y-axis respectively, and similarly let $h_i^x$ and $h_i^y$ be the components of the vector h⃗i along the x-axis and the y-axis respectively. the x-component of f̂i is

\[ \cos\theta_i = \frac{J\sum_j s_j^x + h_i^x + h}{\left[\left(J\sum_j s_j^x + h_i^x + h\right)^2 + \left(J\sum_j s_j^y + h_i^y\right)^2\right]^{1/2}} \qquad (3) \]

we are interested in looking only at the ordering of the spins along the field direction. assuming no global ordering along the y-axis, we set $\sum_j s_j^y = 0$. writing eq. (3) in terms of the angles θ and α that the spin vector and the random field vector make with the +x-axis, we have

\[ \cos\theta_i = \frac{J\sum_j \cos\theta_j + h + \cos\alpha_i}{\left[\left(J\sum_j \cos\theta_j + h\right)^2 + 2\left(J\sum_j \cos\theta_j + h\right)\cos\alpha_i + 1\right]^{1/2}} \qquad (4) \]

as the field is continuously varied from h = −∞ to h = ∞, cos θi in eq. (4) can take any value from −1 to +1. since in our model the spin can take only three states, we set the states of the spin as follows:

\[ \cos\theta_i = \begin{cases} s_a = \cos\theta_a & \text{for } \cos\pi \le \cos\theta_i < \cos(2\pi/3) \\ s_b = \cos\theta_b & \text{for } \cos(2\pi/3) \le \cos\theta_i < \cos(\pi/3) \\ s_c = \cos\theta_c & \text{for } \cos(\pi/3) \le \cos\theta_i \le \cos 0 \end{cases} \qquad (5) \]

at any applied field value h, the state of a spin is represented by a single projection, sa = cos θa, sb = cos θb or sc = cos θc. furthermore, in a given state at h, sa, sb or sc is a representative of the various projections/directions that the spin vector makes with the field direction in that state at h. evidently, a spin that is stable in a given state at h can have a range of minimum energy values rather than a single value. for example, a spin that is stable in the third state sc at h has a minimum energy value ϵi in the range −cos θi ≤ ϵi ≤ −cos θi cos(π/3). the range of ϵi is independent of the dimension of the lattice, but the values of ϵi in the corresponding range are different for different dimensions of the lattice. for example, in a cubic lattice (3d) the range of ϵi is the same as that of a square lattice (2d), but the values of ϵi are different from those of a square lattice, as given by eq. (4). rewriting eq. (4),
\[ \gamma_i(h) = \frac{a + \beta_i}{\left(a^2 + 2a\beta_i + 1\right)^{1/2}} \qquad (6) \]

where $a = J\sum_j \gamma_j(h) + h$, $\gamma_i = \cos\theta_i$, $\beta_i = \cos\alpha_i$, and $-1 \le \gamma_i \le 1$, $-1 \le \beta_i \le 1$.

starting from a sufficiently large and negative applied field, when all spins are in the first state sa, we increase the field in small steps. at each step, the dynamics given by eq. (6) is applied recursively, keeping h constant, until each spin in the lattice is oriented along the direction of the local field at its site and in line with eq. (5). this results in a configuration of spins in the lattice consisting of spins that are stable in the first state sa, the second state sb or the third state sc. it is a local minimum of the energy of the system within the approximation explained above, and it represents a stable state of the system at zero temperature. at non-zero temperature it would become a metastable state if the barriers due to thermal fluctuations are smaller than those of the quenched random fields. if, after a spin is relaxed, the energy at a neighboring spin increases, then the neighbor is relaxed in the next step. holding h constant, we allow this process to continue till all spins are relaxed along the directions of their respective local fields. the fraction of unstable spins that are relaxed during this process determines the size of the avalanche. keeping the applied field constant during the avalanche justifies the assumption that the frequency of the applied field is infinitely slow and that the spins relax infinitely quickly compared with the time of variation of the field. writing eq. (6) in terms of the random fields βi,

\[ \beta_i^{\pm}(\gamma_i, a) = -a\left(1 - \gamma_i^2\right) \pm \gamma_i\left(1 - a^2 + a^2\gamma_i^2\right)^{1/2} \qquad (7) \]

with $-(1-\gamma_i^2)^{-1/2} \le a \le (1-\gamma_i^2)^{-1/2}$. it is easy to see that

\[ \beta_i^{+}(\gamma_i, a) = \beta_i^{-}(-\gamma_i, a), \qquad \beta_i^{+}(\gamma_i, a) = -\beta_i^{-}(\gamma_i, -a) \qquad (8) \]

the geometrical picture of β+ and β− is shown in fig. 2. for example, consider a = 0; in this case, eq. (6) gives θi = αi. this may be expected because a spin can then lower its energy only if it is aligned along the direction of the random field. fig. 2 shows the projections β+ and β− along the field direction. we choose to work with β+i. for any given value of a, it is seen from eq. (7) that $\beta^{+}(\gamma_j) > \beta^{+}(\gamma_k)$ for $\gamma_j > \gamma_k$.

figure 2: the case a = 0, where the spin vector s⃗i points along the random field h⃗i. β+ and β− are the projections of h⃗i along the direction of the applied field.

in the present work, the random fields −∆ < β+i < ∆ are continuous and are drawn from a uniform probability distribution,

\[ p(\beta_i) = \begin{cases} \dfrac{1}{2\Delta} & \text{if } -\Delta < \beta_i < \Delta \\ 0 & \text{otherwise} \end{cases} \qquad (9) \]

iii simulations

we increased the field slowly from h = −∞ to h = ∞ in small steps. successively, at each field step, we run the dynamics described by eq. (6) until all spins are stable. this drives the system through a succession of local minima. if, on reversing the field from h = ∞, the system visits a sequence of local minima different from that visited on the increasing branch, the system is said to exhibit hysteresis. in the present work we are interested in studying the zero temperature hysteresis loops of the model in the two dimensional square lattice as well as in the three dimensional cubic lattice, and in the presence of dilution and an absorbing state. we run the simulations with l = 3000 spins for the 2d lattice and l = 1000 spins for the 3d lattice.
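the sweep-and-relax procedure just described can be condensed into a short sketch. the code below is our own minimal 2d illustration of eqs. (5), (6) and (9), not the authors' production code; it uses synchronous whole-lattice updates in place of the sequential neighbor-by-neighbor relaxation described in the text (a simplification that still converges on the ascending branch shown here, since eq. (6) is increasing in a and the neighbors only move upwards), and the lattice size, field grid and parameters are chosen only to keep the example fast.

```python
import numpy as np

# projections of the three allowed states on the field axis (the values used in the figures)
SA, SB, SC = -0.75, 0.0, 0.75
TH_LOW, TH_HIGH = np.cos(2 * np.pi / 3), np.cos(np.pi / 3)   # state boundaries of eq. (5)

def project(gamma):
    """eq. (5): map the continuous cos(theta_i) onto the three allowed projections."""
    return np.where(gamma < TH_LOW, SA, np.where(gamma < TH_HIGH, SB, SC))

def relax(spins, beta, J, h):
    """apply eq. (6) to the whole lattice repeatedly at fixed h until no spin changes state."""
    while True:
        nn = (np.roll(spins, 1, 0) + np.roll(spins, -1, 0) +
              np.roll(spins, 1, 1) + np.roll(spins, -1, 1))      # periodic square lattice
        a = J * nn + h
        gamma = (a + beta) / np.sqrt(a * a + 2.0 * a * beta + 1.0)
        new = project(gamma)
        if np.array_equal(new, spins):
            return spins
        spins = new

def ascending_branch(L=64, J=0.5, delta=1.0, seed=0):
    rng = np.random.default_rng(seed)
    beta = rng.uniform(-delta, delta, size=(L, L))   # eq. (9): uniform quenched random fields
    spins = np.full((L, L), SA)                      # start from large negative h: all spins in s_a
    branch = []
    for h in np.linspace(-3.0, 3.0, 301):            # slow upward sweep in small field steps
        spins = relax(spins, beta, J, h)
        branch.append((h, spins.mean()))             # magnetization per spin along the field axis
    return branch

for h, m in ascending_branch()[::50]:
    print(f"h = {h:+.2f}   m = {m:+.3f}")
```

a full loop is obtained by sweeping h back down from its largest value with the final configuration as the starting point; averaging over many realizations of beta, as done in the paper, smooths the curves.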
at each value of the applied field we averaged the magnetization per spin of the system over 10000 realizations of the random fields. we estimated the statistical error involved in our numerical calculations at several values of h. at each h value, we binned the 10000 data of average magnetization into 100 bins, where each bin contains 100 data points. from each bin we compute the average magnetization and estimate the error by calculating the standard deviation with respect to the average magnetization at h. the error computed is approximately 0.000367.

figure 3: hysteresis in 2d for j = 0.1, sa = −0.75, sb = 0.0, sc = 0.75, l = 3000.

at the starting field h = −∞, all spins are stable in the first state sa. on increasing the field slowly, at some value h of the applied field, some spins may become unstable. this arises because the local fields at their sites no longer point along the directions relevant to state sa. we call them the seed spins. the seed spins are relaxed either to the second state sb or to the third state sc, depending on the direction of the local fields at their sites. the neighbors of seed spins may find themselves in a more favorable position to become unstable, and they are also relaxed. this leads to an avalanche of flipped spins at h. after the avalanche had stopped, we calculated the magnetization per spin at that value of the applied field. we found that at very small values of j, the hysteresis loop area is very small and has a constriction along the middle portion of the trajectory. these loops are called wasp-waisted hysteresis loops. the small hysteresis loop area can be attributed to the fact that at very low j the disorder in the system is very strong and the spins act almost independently of each other. fig. 3 shows the magnetization of the system for a 2d lattice with l = 3000 at j = 0.1, and fig. 4 shows the corresponding magnetization for a 3d lattice with l = 1000 at j = 0.1.

figure 4: hysteresis in 3d for j = 0.1, sa = −0.75, sb = 0.0, sc = 0.75, l = 1000.

as j is increased further, the wasp-waisted shape of the loop disappears and we get normal hysteresis loops, as shown in fig. 5 and fig. 6. in fig. 7 and fig. 8 we plotted the hysteresis curves of the second state sb in increasing (blue curve) and decreasing field (green curve) in the 2d and 3d lattices, at j = 0.385 and j = 0.257, respectively. the separation of the two curves starts at h = 0. this can be understood as follows: the field value at which spins start flipping from the first state sa in the increasing field is given by $h = -J\sum_{i=1}^{4} s_a - \frac{1}{\sqrt{1-\cos^2(\pi/3)}}$, and on decreasing the field from h = +∞ the dynamics starts at $h = -J\sum_{i=1}^{4} s_c + \frac{1}{\sqrt{1-\cos^2(\pi/3)}}$. the seed spin in the increasing field has all nearest neighbors in the first state sa, while on decreasing the field the neighbors of the seed spin are all in the third state sc. since sa = −sc, from the above two equations we see that the separation starts at h = 0. the separation of the two curves increases as j is increased. in our model the disorder is tuned by the ferromagnetic interaction j. as j increases, the disorder in the system decreases.
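as a quick numerical check of the two onset expressions quoted above (our own back-of-the-envelope evaluation, with the four nearest neighbours of the 2d sums written above): since sa = −sc the two fields are equal in magnitude and opposite in sign, so the increasing- and decreasing-field branches are symmetric about h = 0, and for j = 0.385 they happen to lie essentially at zero field, consistent with fig. 7.

```python
import math

PREF = 1.0 / math.sqrt(1.0 - math.cos(math.pi / 3) ** 2)   # 1/sin(pi/3) ~ 1.1547

def onset_up(J, s_a=-0.75, z=4):
    """field at which spins first leave s_a on the ascending branch (expression quoted in the text)."""
    return -J * z * s_a - PREF

def onset_down(J, s_c=0.75, z=4):
    """field at which the dynamics starts on the descending branch."""
    return -J * z * s_c + PREF

for J in (0.385, 0.5):
    print(f"J = {J}:  h_up = {onset_up(J):+.4f},  h_down = {onset_down(J):+.4f}")
# since s_a = -s_c, h_down = -h_up for every J; at J = 0.385 both onsets sit almost exactly at h = 0
```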
figure 5: hysteresis in 2d for j = 0.5, sa = −0.75, sb = 0.0, sc = 0.75, l = 3000.

figure 6: hysteresis in 3d for j = 0.4, sa = −0.75, sb = 0.0, sc = 0.75, l = 1000.

figure 7: trajectory of the fraction of spins at the second state in increasing and decreasing field in 2d for j = 0.385, sa = −0.75, sb = 0.0, sc = 0.75, l = 3000.

figure 8: trajectory of the fraction of spins at the second state in increasing and decreasing field in 3d for j = 0.257, sa = −0.75, sb = 0.0, sc = 0.75, l = 1000.

we found that at high values of j there is a first order jump in the magnetization, and as we lower j the size of the jump decreases continuously to zero at a critical value jc. jc marks the non-equilibrium critical point of the model. in the present work we are not in a position to determine the exact value of jc numerically; instead, we identify the range within which the value of jc lies. in fig. 9 we show the hysteresis curves in increasing field for the 2d lattice at j = 0.25 (violet), j = 0.3 (green) and j = 0.35 (light blue). the hysteresis curve at j = 0.25 is continuous and the curve at j = 0.35 has a discontinuity. the corresponding value of jc in 2d appears to lie between j = 0.25 and j = 0.35. fig. 10 shows a similar behavior in the 3d lattice for j = 0.1 (violet), j = 0.15 (green) and j = 0.2 (light blue). the value of jc in this case appears to lie between j = 0.15 and j = 0.2. the value of jc in 3d is found to be lower than that in 2d.

figure 9: jumps in magnetization in the 2d lattice in increasing field for j = 0.25 (violet), j = 0.3 (green) and j = 0.35 (light blue), l = 3000.

figure 10: jumps in magnetization in the 3d lattice in increasing field for j = 0.1 (violet), j = 0.15 (green) and j = 0.2 (light blue), l = 1000.
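the bracketing of jc described above can be automated once the increasing-field branches m(h) are available; the sketch below (our own naming, and an arbitrary jump threshold rather than the authors' criterion) flags the largest single-step change of m along each branch:

```python
import numpy as np

def largest_jump(m):
    """largest single-step change of the magnetisation along a hysteresis branch."""
    return np.max(np.abs(np.diff(m)))

def bracket_jc(j_values, branches, threshold=0.05):
    """return (largest j with a continuous branch, smallest j with a jump).
    branches[j] is the m(h) array obtained in increasing field at coupling j;
    the threshold separating 'continuous' from 'discontinuous' is arbitrary here."""
    smooth = [j for j in j_values if largest_jump(branches[j]) < threshold]
    jumpy = [j for j in j_values if largest_jump(branches[j]) >= threshold]
    return (max(smooth) if smooth else None, min(jumpy) if jumpy else None)

# toy usage with synthetic branches: one smooth curve and one with a macroscopic jump
h = np.linspace(-1, 1, 401)
branches = {0.25: np.tanh(5 * h), 0.35: np.where(h < 0.1, -0.6, 0.6)}
print(bracket_jc([0.25, 0.35], branches))   # -> (0.25, 0.35)
```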
in fig. 11 we plot the minor hysteresis loops of the model together with the major hysteresis loop for the 2d lattice at j = 0.4. we increase the field slowly from h = −0.7 and reverse it at h = 0.3, before the magnetization is saturated. the reversed curve touches the return major loop at h = −0.09. on increasing the field again from h = −0.09, the curve meets the point where it was last reversed, at h = 0.3. a minor loop within a minor loop is also traced in fig. 11. this shows that the model exhibits return point memory. the minor hysteresis loops for the 3d lattice at j = 0.22 are shown in fig. 12.

figure 11: minor hysteresis loops in the 2d lattice for j = 0.4, l = 3000.

figure 12: minor hysteresis loops in the 3d lattice for j = 0.22, l = 1000.

at high values of j, for example at j = 1.54 in the 2d lattice, spins from the first state flip directly to the third state and no spin flips to the second state. this can be explained as follows. in fig. 13 we trace the random field profile from eq. (7) for spins in the 2d lattice at j = 1.54. in this plot, for example at the h value along the red vertical line, cd gives the range of random fields for which spins with one nearest neighbor in the second state s_b and the remaining three in the first state s_a remain in the first state s_a at h. the ranges bc and ab give the values of random fields for which such spins can flip to the second state s_b and the third state s_c at h, respectively. as we increase the field slowly at j = 1.54, some spins with all neighbors in the first state s_a start flipping to the second state at h_s = −j ∑_{i=1}^{4} s_a − 1/√(1 − cos²(π/3)) ≈ 3.465. this is shown in fig. 13, where the red vertical line touches the light blue curve at h ≈ 3.465. we call these spins seed spins. the random field profile of the seed spins is governed by the light blue curve in fig. 13. after a seed spin is flipped, it may cause one of its neighbors to flip to the third state if the random field of the neighbor is in the range ab. once flipped to the third state, that neighbor can then trigger an avalanche that spans the entire system, in which spins flip directly from the first state to the third state. this is because, at h = h_s and j = 1.54, it can be seen from eq. (6) that a spin with one neighbor in the third state and the remaining three neighbors in the first state can always flip from the first state to the third state, even if it has the minimum random field value β = −1. a similar behavior is also seen in the 3d lattice.

figure 13: distribution of random fields at j = 1.54 for the 2d lattice. the curve at the right (light blue) is for spins with all nearest neighbors in the first state sa. the middle curve (green) is for spins with one nearest neighbor in the second state sb and the remaining three in the first state sa. the left curve (black) is for spins with one nearest neighbor in the third state sc while the remaining three are in the first state sa.
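as a quick consistency check of the onset field quoted above, substituting j = 1.54, s_a = −0.75 and cos(π/3) = 1/2 gives

```latex
h_s = -J\sum_{i=1}^{4} s_a - \frac{1}{\sqrt{1-\cos^{2}(\pi/3)}}
    = -4\,(1.54)(-0.75) - \frac{2}{\sqrt{3}}
    \approx 4.62 - 1.155 \approx 3.465 ,
```

in agreement with the point where the red vertical line meets the light blue curve in fig. 13.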
i hysteresis in the presence of dilution and an absorbing state

on diluting the lattice, spins may find themselves surrounded by vacancies; vacancies are lattice sites that are not occupied by spins. at low dilution we observe that the behavior of the model is similar to that of the non-dilute model. in the limit of extreme dilution, hysteresis starts at a higher value of j compared with the undiluted case. this is expected because the system is now punctuated by a large number of isolated clusters of spins. these spins behave independently as the field is continuously changed. in this limit there is no possibility of avalanches that span the entire system, consequently no possibility of macroscopic jumps in the magnetization, and hence no possibility of critical behavior. fig. 14 shows the trajectory of the fraction of spins in the second state in increasing and decreasing field in the 2d lattice at high dilution v0 = 0.8 and at j = 5.0, and fig. 15 shows the magnetization plot for the 2d lattice at the same values v0 = 0.8 and j = 5.0.

figure 14: trajectory of the fraction of spins at the second state in increasing and decreasing field in the 2d lattice with v0 = 0.8 as the fraction of vacancies in the system and for j = 0.5, l = 3000.

figure 15: hysteresis curves in the 2d lattice with v0 = 0.8 as the fraction of vacancies in the system for j = 5.0, l = 3000.

as shown in fig. 14, the trajectory of the system in increasing field as well as in decreasing field consists of two peaks: it starts with the bigger peak and then the smaller peak. the bigger peaks in the increasing and decreasing field overlap with each other. this can be understood as follows. in fig. 16 we plot fig. 14 (in increasing field) together with the random field profile given by eq. (7) at j = 5.0. the green curve is the random field profile for spins with all nearest neighbors as vacancies, and the violet curve is that for spins with one nearest neighbor in the first state and the other three as vacancies. in the increasing field, the dynamics start when spins with all nearest neighbors as vacancies start to flip from the first state to the second state. this occurs at a field value h = −j ∑_j cos θ_j − 1/√(1 − cos²(π/3)) ≈ −1.154, with cos θ_j = s_b. it is clearly seen from fig. 16 that the bigger peak is due to the flipping of these isolated spins to the second state. since these spins flip independently from each other, their response to the field in the increasing and decreasing trajectories is exactly the same. this explains the overlap of the bigger peak in increasing and decreasing field. the maximum of the peak also occurs at the field value where the range of random fields accessible to these spins in the second state is largest (green curve in fig. 16).

figure 16: random field profiles for spins with all neighbors vacancies (green) and spins with one neighbor in the first state with the remaining three being vacancies (violet). the hysteresis curve in increasing field for the second state in the 2d lattice with v0 = 0.8 as the fraction of dilution in the system and at j = 5.0, l = 3000 is also plotted here.

at h = 1/√(1 − cos²(π/3)) = 1.154, no such spins remain in the second state; all of them have flipped to the third state. similarly, the lower hysteresis loop in fig. 15 below the plateau is due to the flipping of these spins from the second state to the third state. as the field is increased from h = 1.154, no spins flip until spins with one nearest neighbor in the first state, and whose remaining neighbors are vacancies, start to flip at h = 2.596. this explains the presence of the plateau in the lower hysteresis loop in fig. 15. similarly, the plateau in the corresponding upper hysteresis curve in the reversing field can be explained from the reversed trajectory of the second state.
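both onset fields quoted in this paragraph follow from the same expression: for an isolated spin (all four neighbors vacant) the neighbor sum vanishes, while a single occupied neighbor in the first state contributes −j s_a; at j = 5.0,

```latex
h = -\frac{1}{\sqrt{1-\cos^{2}(\pi/3)}} = -\frac{2}{\sqrt{3}} \approx -1.154 , \qquad
h = -J s_a - \frac{2}{\sqrt{3}} = -5.0\,(-0.75) - 1.155 \approx 2.596 ,
```

which reproduces the position of the onset of the bigger peak and of the end of the plateau.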
a similar behavior is also observed in the 3d lattice.

in the present work we also consider the case in which the second state is an absorbing state. when a spin flips to the absorbing state, it cannot leave that state even when the field is very strong. as such, spins in the absorbing state remain there throughout the entire excursion of the applied field. therefore, in the increasing field spins can flip from the first state to the second state or from the first state to the third state, while no spins can flip from the second state to the third state. similarly, on reversing the field, spins in the third state can flip to either the second or the first state, while no spins can flip from the second state to the first state. we observe that the jump in magnetization, and hence the critical behavior of the system, occurs only in the increasing field. this is expected because when a spin flips in the increasing field it also causes an avalanche of flipped spins. on reversing the field, the spin cannot flip back unless its neighbors are flipped first. if a neighbor of a spin flipped to the absorbing state after the spin had flipped to the third state, then on reversing the field the spin cannot flip back, because its neighbor remains in the second state. such a spin can flip back only under the influence of the field, and as the field is continuously varied, the spin of interest will flip back to the second state. therefore, the presence of an absorbing state in our model prevents the occurrence of avalanches of flipped spins from the third state to the first state in the decreasing field. in fig. 17 and fig. 18 we plot the trajectory of the fraction of spins in the third state in the increasing as well as the decreasing field in the 2d and 3d lattices, respectively, when the second state is an absorbing state at j = 0.3. as shown in the graphs, the hysteresis loops are asymmetric and have the wasp-waisted shape.

figure 17: trajectory of the fraction of spins at the third state in increasing and decreasing field in 2d with absorbing state, j = 0.3, l = 3000.

figure 18: trajectory of the fraction of spins at the third state in increasing and decreasing field in 3d with absorbing state, j = 0.3, l = 1000.
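as a compact way of stating the rule, the absorbing constraint can be expressed as a transition filter applied on top of the ordinary zero-temperature dynamics; the encoding below (state labels and function name are ours) is only illustrative:

```python
# state labels: 0 = first state s_a, 1 = second state s_b (absorbing), 2 = third state s_c
ABSORBING = 1

def transition_allowed(old_state, new_state):
    """a spin that has reached the absorbing second state never leaves it;
    every other move is left to the usual zero-temperature relaxation rule."""
    if old_state == ABSORBING:
        return new_state == ABSORBING
    return True

# inside a field sweep, a proposed flip is accepted only if
# transition_allowed(current, proposed) is true.
```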
iv discussion and conclusions

we have presented in this work a numerical study of the zero temperature hysteresis of the random field 3-state clock model on two and three dimensional lattices, and we have also examined the effect of dilution and of an absorbing state on the hysteresis loops. in this study, we observed that the presence of quenched disorder in the system has a strong effect on the shape of the hysteresis loops as well as on the critical behavior. the shape of the hysteresis loop is not universal; nonetheless, it is significant because it measures the amount of memory stored in the system. the presence of dilution breaks the system into isolated clusters of spins, and the spins in this case act almost independently from each other. in the low dilution limit, the behavior of the model is qualitatively similar to that of the undiluted model. at high dilution, the model behaves differently: for example, as the field is increased slowly from h = −∞ to h = ∞, we find no first order jump in the magnetization. the presence of dilution prevents the formation of an avalanche that would span the system.

in the presence of an absorbing state, the critical behavior of the system is seen only in the increasing field, disappearing when the field is decreased slowly from h = ∞. the hysteresis loops in this case are asymmetric and acquire a wasp-waisted shape. such loops are observed in many diverse systems, for example in magnetic rocks [14], shape memory alloys [15], martensites [16], etc. we have also studied the cases in which θc (= −θa) takes any value 0 < θc < π/3. we observed that as θc is varied continuously from π/3 to zero, the range of jc decreases towards jc = 0 in the 2d as well as in the 3d lattice. the behavior of the model is qualitatively similar, although the distribution of random fields is drastically different in each case. we hope the work presented in this paper motivates further studies with more refined analysis.

[1] the science of hysteresis, edited by g bertotti and i mayergoyz, academic press, amsterdam (2006).
[2] a p young, spin glasses and random fields, world scientific, singapore (1997).
[3] a moser, k takano, d t margulies, et al., magnetic recording: advancing into the future, j. phys. d 35, r157 (2002).
[4] a ali, t shah, r ullah, et al., review on recent progress in magnetic nanoparticles: synthesis, characterization and diverse applications, front. chem. 9 (2021).
[5] j i martin, j nogues, k liu, j l vicent, i k schuller, ordered magnetic nanostructures: fabrication and properties, j. magn. magn. mater. 256, 449 (2003).
[6] j p sethna, k dahmen, s kartha, et al., hysteresis and hierarchies: dynamics of disorder-driven first-order phase transformations, phys. rev. lett. 70, 3347 (1993).
[7] e vives, j goicoechea, j ortin, a planes, universality in models for disorder-induced phase transitions, phys. rev. e 52, r5 (1995).
[8] x p qin, b zheng, n j zhou, depinning phase transition in the two dimensional clock model with quenched randomness, phys. rev. e 86, 031129 (2012).
[9] o d r salmon, f d nobre, anisotropic four-state clock model in the presence of random fields, phys. rev. e 93, 022125 (2016).
[10] p shukla, r s kharwanlang, hysteresis in random-field xy and heisenberg models: mean-field theory and simulations at zero temperature, phys. rev. e 81, 031106 (2010).
[11] p shukla, r s kharwanlang, critical hysteresis in random-field xy and heisenberg models, phys. rev. e 83, 011121 (2011).
[12] j marro, r dickman, nonequilibrium phase transitions in lattice models, cambridge university press, cambridge (1999).
[13] h hinrichsen, non-equilibrium critical phenomena and phase transitions into absorbing states, adv. phys. 49, 815 (2000).
[14] a p roberts, c r pike, k l verosub, first-order reversal curve diagrams: a new tool for characterizing the magnetic properties of natural samples, j. geophys. res. 105, 28461 (2000).
[15] l straka, o heczko, n lanska, magnetic properties of various martensitic phases in ni-mn-ga alloy, ieee trans. magn. 38, 5 (2002).
[16] j goicoechea, j ortin, a random field 3-state spin model to simulate hysteresis and avalanches in martensitic transformations, j. phys. iv (france) 05, c2 (1995).
papers in physics, vol. 3, art. 030004 (2011)
received: 11 may 2011, accepted: 15 july 2011
edited by: j-c. géminard
reviewed by: b. tighe, instituut-lorentz, universiteit leiden, netherlands.
licence: creative commons attribution 3.0
doi: http://dx.doi.org/10.4279/pip.030004
www.papersinphysics.org
issn 1852-4249

master curves for the stress tensor invariants in stationary states of static granular beds. implications for the thermodynamic phase space

luis a. pugnaloni,1∗ josé damas,2† iker zuriguel,2‡ diego maza2§

we prepare static granular beds under gravity in different stationary states by tapping the system with pulsed excitations of controlled amplitude and duration. the macroscopic state (defined by the ensemble of static configurations explored by the system tap after tap) for a given tap intensity and duration is studied in terms of the volume, v, and the force moment tensor, Σ. in a previous paper [pugnaloni et al., phys. rev. e 82, 050301(r) (2010)], we reported evidence that such macroscopic states cannot be fully described by using only v or Σ, apart from the number of particles n. in this work, we present an analysis of the fluctuations of these variables indicating that v and Σ may be sufficient to define the macroscopic states. moreover, we show that only one of the invariants of Σ is necessary, since each component of Σ falls onto a master curve when plotted as a function of tr(Σ). this implies that these granular assemblies have a common shape of the stress tensor, even though it does not correspond to the hydrostatic type. although most results are obtained by molecular dynamics simulations, we present supporting experimental results.

i. introduction

the study of static granular systems is of fundamental importance to industry, to improve the storage of such materials in bulk as well as to optimize packaging design. however, far from yielding such benefits of practical interest, physicists have found a fascinating challenge along the way that has kept them, to a large extent, stuck in the area of static granular matter.

∗e-mail: luis@iflysib.unlp.edu.ar
†e-mail: jdelacruz@alumni.unav.es
‡e-mail: iker@unav.es
§e-mail: dmaza@unav.es
1 instituto de física de líquidos y sistemas biológicos (conicet la plata, unlp), calle 59 no 789, 1900 la plata, argentina.
2 departamento de física y matemática aplicada, facultad de ciencias, universidad de navarra, pamplona, spain.

that challenge is finding an appropriate description of the simplest state in which matter can be found: equilibrium [1].
at equilibrium, a sample will explore different microscopic configurations over time, in such a way that macroscopic averages over long periods are well defined, with no aging. moreover, if not only equilibrium but also ergodicity is present, averages over a large number of replicas at a given time should give results equivalent to time averages [2]. finally, one expects that all macroscopic properties of such equilibrium states can be expressed in terms of a few independent macroscopic variables. then, a thermodynamic description and, hopefully, a statistical mechanics approach can be attempted. theoretical formalisms based on these assumptions have been used to analyze data from experiments and numerical models. however, some of the foundations are still supported by little evidence.

the use of careful protocols to make a granular sample explore microscopic configurations within a seemingly equilibrium macroscopic state has given us the first standpoint [3–6]. in such protocols, the sample is subjected to external excitations in the form of pulses. well defined, reproducible time averages are found after a transient if the same pulse shape and pulse intensity are applied. however, for low intensity pulses, a previous annealing might be in order since equilibrium is hard to reach, as in supercooled liquids. we will use the expressions equilibrium state, steady state or simply state to refer to the collection of all configurations generated by using a given external excitation after any transient has disappeared.

more than twenty years ago, edwards and oakeshott [7] put forward the idea that the number of grains n and the volume v are the basic state variables that suffice to characterize a static sample of hard grains in equilibrium. the nv granular ensemble was then introduced as the collection of microstates, compatible with n and v, in which the sample is in mechanical equilibrium. however, newer theoretical works [8–13] suggest that the force moment tensor, Σ (Σ = v σ, where σ is the stress tensor), must be added to the set of extensive macroscopic variables (i.e., an nvΣ ensemble) to adequately describe a packing of real grains.

in the rest of this paper, we show experimental and simulation evidence that the equilibrium states of static granular packings cannot be described only by v (or, equivalently, by the packing fraction, φ, defined as the fraction of space covered by the grains) nor by Σ. we do this by generating states of equal v but different Σ, and states of equal Σ but different φ. we also show that states of equal v may present different volume fluctuations. moreover, we show that states of equal v and Σ display the same fluctuations of these variables, suggesting that no other extensive parameter might be required to characterize the state (apart from n). finally, but of major significance, we show that the shape of the force moment tensor is universal, in the sense that different states that present the same trace of the tensor actually have the same values of all the components of Σ.

ii. simulation

we use soft-particle 2d molecular dynamics [14, 15]. particle–particle interactions are controlled by the particle–particle overlap ξ = d − |r_ij| and the velocities ṙ_ij, ω_i and ω_j. here, r_ij represents the center-to-center vector between particles i and j, d is the particle diameter and ω is the particle angular velocity.
these forces are introduced in newton's translational and rotational equations of motion, which are then numerically integrated with a velocity verlet algorithm [16]. the interaction of the particles with the flat surfaces of the container is calculated as the interaction with a disk of infinite radius. the contact interactions involve a normal force f_n and a tangential force f_t:

f_n = k_n ξ − γ_n v^n_{i,j}    (1)
f_t = −min(µ|f_n|, |f_s|) · sign(ζ)    (2)

where

f_s = −k_s ζ − γ_s v^t_{i,j}    (3)
ζ(t) = ∫_{t_0}^{t} v^t_{i,j}(t′) dt′    (4)
v^t_{i,j} = ṙ_{ij} · s + (d/2)(ω_i + ω_j)    (5)

the first term in eq. (1) corresponds to a restoring force proportional to the overlap ξ of the interacting disks and the stiffness constant k_n. the second term accounts for the dissipation of energy during the contact and is proportional to the normal component v^n_{i,j} of the relative velocity ṙ_{ij} of the disks. equation (2) gives the magnitude of the force in the tangential direction. it implements coulomb's criterion with an effective friction, following a rule that selects between static and dynamic friction. notice that eq. (2) implies that the maximum static friction force |f_s| used corresponds to µ|f_n|, which effectively sets µ_dynamic = µ_static = µ. the static friction force f_s [see eq. (3)] has an elastic term proportional to the relative shear displacement ζ and a dissipative term proportional to the tangential component v^t_{i,j} of the relative velocity. in eq. (5), s is a unit vector normal to r_ij. the elastic and dissipative contributions are characterized by k_s and γ_s, respectively. the shear displacement ζ is calculated through eq. (4) by integrating v^t_{i,j} from the beginning of the contact (i.e., t = t_0). the tangential interaction behaves like a damped spring that is formed whenever two grains come into contact and is removed when the contact finishes [17].

the particular set of parameters used in the simulations is (unless otherwise stated): µ = 0.5, k_n = 10^5 (mg/d), γ_n = 300 (m√(g/d)), k_s = (2/7) k_n and γ_s = 200 (m√(g/d)). in some cases, we have varied µ and γ_n in order to control the friction and restitution coefficients. the integration time step is set to δ = 10^{−4} √(d/g). the confining box (13.39d wide and infinitely high) contains n = 512 monosized disks. units are reduced with the diameter of the disks, d, the disk mass, m, and the acceleration of gravity, g.

tapping is simulated by moving the confining box in the vertical direction following a half sine wave trajectory [a sin(2πνt)(1 − θ(2πνt − π))]. the excitation can be controlled through the amplitude, a, and the frequency, ν, of the sinusoidal trajectory. we implement a robust criterion, based on the stability of particle contacts, to decide when the system has reached mechanical equilibrium [14] before a new tap is applied to the sample. averages were taken over 100 taps in the steady state and over 20 independent simulations for each value of a and ν.

the volume, v, of the system after each tap can be obtained from the packing fraction, φ, as v = nπ(d/2)²/φ. we measure φ in a rectangular window centered at the center of mass of the packing. the measuring region covers 90% of the height of the granular bed (which is about 40d) and avoids the area close to the walls by 1.5d. we have observed that φ is sensitive to the chosen window; however, none of the conclusions drawn in this paper are affected by this choice.
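to make the contact law of eqs. (1)–(5) concrete, here is a minimal python sketch for a single disk–disk contact in reduced units; the variable names and the explicit one-step accumulation of the shear displacement ζ are our own choices, not part of the authors' code:

```python
import numpy as np

# parameters close to those quoted in the text, in reduced units (d = m = g = 1)
kn, gamma_n = 1.0e5, 300.0
ks, gamma_s = (2.0 / 7.0) * 1.0e5, 200.0
mu, d = 0.5, 1.0

def contact_force(r_ij, v_ij, omega_i, omega_j, zeta, dt):
    """normal and tangential contact forces of eqs. (1)-(5).
    r_ij, v_ij: relative position and velocity (2d arrays); zeta: shear
    displacement accumulated while the contact lasts, eq. (4)."""
    dist = np.linalg.norm(r_ij)
    xi = d - dist                              # overlap
    if xi <= 0.0:
        return np.zeros(2), np.zeros(2), 0.0   # no contact: forces vanish, zeta resets
    n = r_ij / dist                            # unit vector along r_ij
    s = np.array([-n[1], n[0]])                # unit vector normal to r_ij
    vn = np.dot(v_ij, n)                       # normal relative velocity
    vt = np.dot(v_ij, s) + 0.5 * d * (omega_i + omega_j)   # eq. (5)
    fn = kn * xi - gamma_n * vn                # eq. (1)
    zeta = zeta + vt * dt                      # eq. (4), one-step accumulation
    fs = -ks * zeta - gamma_s * vt             # eq. (3)
    ft = -min(mu * abs(fn), abs(fs)) * np.sign(zeta)   # eq. (2), coulomb criterion
    return fn * n, ft * s, zeta
```

the returned zeta must be carried over to the next time step while the contact persists, mirroring the damped-spring picture described above.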
the stress tensor, σ, is calculated from the particle–particle contact forces as

σ^{αβ} = (1/v) ∑_{c_{ij}} r^α_{ij} f^β_{ij} ,    (6)

where the sum runs over all contacts. the force moment tensor, Σ, is defined as Σ ≡ v σ. during the course of a tap, Σ is non-symmetric; however, once mechanical equilibrium is reached according to our criterion, Σ becomes symmetric to within a very small error compared with the fluctuations of Σ. although σ may depend on depth, we have measured the force moment tensor by simply summing over the particle–particle contacts in the entire system.

the fluctuations of φ (∆φ) and Σ (∆Σ) are calculated as the standard deviation over the 100 taps obtained in each steady state. we average φ, Σ, and their fluctuations over 20 independent runs for each steady state and estimate error bars as the standard deviation over these 20 runs.
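eq. (6) and the definition Σ ≡ v σ translate directly into a few lines; the sketch below (our own variable names) assumes the contact branch vectors r_ij and contact forces f_ij are stored as arrays of shape (n_contacts, 2):

```python
import numpy as np

def force_moment_tensor(r_contacts, f_contacts):
    """force moment tensor: sum over contacts of the dyadic product r_ij f_ij,
    i.e. eq. (6) multiplied by the volume v."""
    return np.einsum('ca,cb->ab', np.asarray(r_contacts), np.asarray(f_contacts))

def stress_tensor(r_contacts, f_contacts, volume):
    """stress tensor of eq. (6), sigma = Sigma / v."""
    return force_moment_tensor(r_contacts, f_contacts) / volume

# tr(Sigma) follows from np.trace(Sigma), and the symmetry check mentioned above
# amounts to comparing Sigma with its transpose, e.g. np.allclose(Sigma, Sigma.T).
```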
iii. experimental method

figure 1: (a) schematic diagram of the experimental set-up. a: accelerometer, s: shaker, c: camera, o: oscilloscope, fg: function generator, pc: computer. (b) example of an image of the packing. the region of measurement is indicated by a red square.

the experimental set-up is sketched in fig. 1(a). a quasi-2d plexiglass cell (28 mm wide and 150 mm high) is filled with 900 alumina oxide beads of diameter d = 1 ± 0.005 mm. the separation between the front and rear plates was made 15% larger than the bead diameter. the cell is tapped by an electromagnetic shaker (tiravib 52100) with a train of half sine wave pulses separated by five seconds. the tapping amplitude was controlled by adjusting the intensity, a, and frequency, ν, of the pulse, and was measured by an accelerometer attached to the base of the cell. averages are taken over 500 taps after equilibration.

high resolution images (more than 10 mpix) are taken after each tap. to calculate the packing fraction, we only consider a rectangular zone at the center of the packing whose limits are at 4 mm from the borders [see fig. 1(b)]. we determine the centroid of each particle by means of a numerical algorithm with subpixel resolution. then, we calculate the packing fraction by assuming that the 2d projection of each bead corresponds to a disk of diameter d = 1 mm. we estimate the packing fraction with a resolution of ±0.001. as the separation between the plates is larger than the particle diameter, small overlaps between the spheres are present in the 2d projections of the images. therefore, the calculated packing fraction in some dense configurations might result slightly higher than the hexagonal disk packing limit (π√3/6).

iv. tapping characterization and asymptotic equilibrium states

it is often debated [18, 19] what the appropriate parameter is to characterize the external excitation used to drive a granular sample. dijksman et al. [18] proposed a parameter related to the liftoff velocity of the granular bed. ludewig et al. [19] presented an energy based parameter. pugnaloni et al. [15, 20] suggested that the factor of expansion induced on the sample would be a suitable measure [21]. the usual perspective in defining a pulse parameter, ε, is to achieve a collapse of the φ–ε curves as different details of the pulse shape (such as amplitude and duration) are changed. the parameters defined in all these previous works fail to make such curves collapse for the data presented in this paper. the main reason is that our φ–ε curves are non-monotonic (see, for example, fig. 6), presenting a minimum whose depth depends on the details of the pulse shape. therefore, a simple rescaling of the horizontal axis does not suffice to collapse the curves.

since we are interested in macroscopic states, the actual pulse used to drive the system is merely a control parameter, not a macroscopic variable that describes the state. therefore, the external pulse does not need to be described with a simplified quantity; the complete functional form of the pulse can be given instead. in our case, we use a sine pulse, and both the pulse amplitude and frequency are needed to fully describe the excitation. we will employ the usual nondimensional peak acceleration γ ≡ a_peak/g = a(2πν)²/g (where g is the acceleration of gravity) and the frequency ν to precisely define the external excitation. a detailed study of the dynamical response of a granular bed to a pulse of controlled intensity and duration can be found in damas et al. [22].

one major issue in studying equilibrium states is what evidence one can have indicating that the system is actually at equilibrium [5]. since the definition of equilibrium is circular [23], we can simply do our best to check that different properties of the system have well defined means (as well as higher order moments of the distributions) which do not depend on the history of the processes applied to the sample.

figure 2: evolution of φ towards the steady state starting from a disordered configuration initially obtained by strong taps (experimental results). results for two tapping intensities are shown: γ = 4.8 (blue) and γ = 17 (green). in both experiments, the frequency of the pulse is ν = 30 hz.

we have generated configurations corresponding to a particular pulse (of a given shape and intensity) by repeatedly applying such a pulse to the system and allowing enough time for any transient to fade. to prove that our samples are at equilibrium, we approach a particular pulse through different paths, starting from different initial conditions (ordered and disordered configurations), and confirm that the steady states obtained present equivalent mean values and second moments of the distributions of the variables of interest. in ref. [26], we showed that the steady states corresponding to excitations of high intensity are reached in a few taps, even if the initial configuration corresponds to a very ordered structure. in fig. 2, we consider the equilibration process from a highly disordered initial configuration. for low tapping intensities (blue line in fig. 2), the packing fraction evolves to the steady state in two stages. just after switching the tapping amplitude to the reference value, the system rapidly evolves to values of φ close to the final steady state. beyond this initial convergence, a slower compaction phase takes the system to the final steady state. for high tapping intensities, the evolution to the steady state is very rapid; the steady state is reached after about a hundred taps [green line in fig. 2]. therefore, we apply a sequence of at least 1000 taps in all our experiments before taking averages, to warrant that the steady state has been reached. in our simulations, 400 taps of equilibration were enough.

an interesting example of equilibration is presented in fig. 3. in this case, we present the values of φ and of tr(Σ) during a very special sequence of pulses obtained in our simulations.
the system is initially deposited from a dilute random configuration with all particles in the air. first, we tap the system 200 times at γ = 4.9, then 800 times at γ = 61.5 and, finally, 200 times at γ = 4.9. throughout the run we keep ν = 0.5 √(g/d). these two values of the pulse intensity have been chosen because they are known to produce packings with the same mean φ in the steady state [26]; the mean Σ, however, is clearly different. notice that the same values of φ and Σ, and of their fluctuations ∆φ and ∆Σ, are observed for the steady state at low γ obtained before and after the 800 pulses at high γ.

unless otherwise stated, all the results we present in what follows correspond to steady states. we have tested this by obtaining the same states through two different preparation protocols consisting of: (i) application of a large number of identical pulses starting from a disordered configuration, and (ii) application of a reduced number of identical pulses after annealing the system from much higher pulse intensities.

figure 3: evolution of φ (a) and tr(Σ) (b) as the pulse intensity is suddenly changed between two values that produce steady states with the same φ but different Σ in our simulations. the initial 200 taps correspond to γ = 4.9, the middle 800 pulses to γ = 61.5 and the final 200 pulses, again, to γ = 4.9. in all cases, ν = 0.5 √(g/d). the mean values and standard deviations in each section are indicated by arrows.

in a few cases, the results (mean values and/or fluctuations) from the two protocols did not match. this was an indication that a steady state, if it existed, was not reached by one of the protocols (or by both). such cases have been removed from the analysis.

v. volume and volume fluctuations

in fig. 4(a), we plot φ in the steady state as a function of γ from our experiments, for different pulse frequencies [24]. as we can see, there exists a minimum of φ at relatively high γ. a similar experiment on a three-dimensional cell also yielded analogous results [26]. this behavior has also been reported for various models [15, 20, 25]. an explanation, based on the formation of arches, has been given in [15]. the position of the minimum in φ shifts to larger γ if the frequency of the pulse is increased (i.e., if the pulse duration is reduced).

figure 4: (a) experimental results for the steady state packing fraction, φ, as a function of the reduced peak acceleration, γ, for different frequencies, ν, of the tap pulse. (b) histogram of the configurations visited by the system for ν = 30 hz: γ = 15 (green) and γ = 28 (red).

although the increase in the packing fraction beyond the minimum is rather small, it is important to remark that this difference is not an artifact introduced by our experimental resolution.
in fig. 4(b), we show the histogram of the sequence of packings obtained for γ ≈ 15 (the minimum packing fraction for ν = 30 hz) and for γ = 28 (the largest excitation explored for ν = 30 hz). both steady states are statistically comparable; however, it is possible to distinguish different mean values.

since steady states of equal φ obtained at the two sides of the φ minimum are generated via very different tap intensities, it is worth assessing whether such states are, in fact, equivalent. this can be done by comparing the volume fluctuations of such states. a similar analysis was done in ref. [27] for states generated with different pulse amplitude and duration in liquid fluidized beds.

figure 5: the steady state volume fraction fluctuations, ∆φ, as a function of γ from our experiments with ν = 30 hz. the solid line is only a guide to the eye.

the fluctuations of φ in the steady state, as measured by the standard deviation ∆φ, are presented in fig. 5. as we can see, a minimum in the fluctuations is just apparent. the position of this minimum coincides with the minimum in φ. unfortunately, the resolution of φ in our experiments is of the order of the size of the fluctuations. however, the results from the simulations are not limited in this respect, and we turn to those for a more reliable assessment of the fluctuations.

in fig. 6(a), we plot φ in the steady state as a function of γ for our simulations. although the fluctuations of φ are large [see fig. 3], the mean value is well defined, with a small confidence interval (see error bars). for low excitations, φ decreases as γ is increased. however, beyond a certain value γmin, the packing fraction grows. the same trend is observed if the tap frequency ν is changed. however, the minimum is deeper for lower ν, and its position γmin shifts to larger values of γ as ν is increased, in agreement with our experiments (see fig. 4). due to the change in the depth of the φ minimum in the φ–γ curves, a simple rescaling of γ is unable to collapse the data for different frequencies. however, rescaling φ and γ with φmin and γmin, respectively, yields a very good collapse, even between simulated and experimental data [26].

in fig. 6(b), the volume fraction fluctuations, ∆φ, are plotted as a function of γ. as we can see, these fluctuations are non-monotonic, as suggested by our experiments (fig. 5). non-monotonic volume fluctuations have also been reported in ref. [6]. for ∆φ, we obtain a minimum and a maximum. we have also observed a maximum in ∆φ for values of γ below the ones reported here (see, for example, refs. [26] and [25]). however, we do not report such low values of γ in this work and focus on tapping intensities that warrant the steady state with a modest number of pulses.

figure 6: (a) simulation results for the volume fraction, φ, as a function of the reduced peak acceleration, γ, for different frequencies, ν, of the tap pulse. (b) the corresponding volume fraction fluctuations, ∆φ, as a function of γ.

the value of γ at which the fluctuations display a minimum coincides with γmin, the value at which the minimum packing fraction, φmin, is obtained. the maximum coincides with the inflection point of the φ–γ curve at higher γ.
since one expects to find few mechanically stable configurations compatible with a large volume (low φ), it seems reasonable that the fluctuations reach a minimum where φ does so. similarly, there should be few low-volume, mechanically stable configurations, which implies that at high φ the fluctuations should also diminish. hence, a maximum in ∆φ should be present at intermediate packing fractions. this is seen more clearly in fig. 7, where we plot the fluctuations as a function of the average value of φ.

figure 7: density fluctuations as a function of φ for different frequencies of the tap pulse in the simulations.

figure 7 presents two distinct branches: the lower branch corresponds to γ > γmin and the upper branch to γ < γmin. for γ > γmin, the fluctuations corresponding to different tap durations collapse, suggesting that such equal-φ, equal-∆φ states might correspond to unique states (below, we will find that this is not the case). for γ < γmin, we obtain states of the same φ as some states in the lower branch but presenting larger fluctuations. this is clear evidence that the equal-φ states of the upper and lower branches are indeed distinct, and that other macroscopic variables must be used to distinguish one from the other.

we have assessed a number of other structural descriptors (coordination number, bond order parameter, radial distribution function). in all cases, equal-φ states from the upper and lower branches of the φ–∆φ curve (fig. 7) present similar values of the structural descriptors, with only subtle discrepancies. although this indicates that the states are not equivalent, it also suggests that such descriptors are not good candidates to form, along with v, a set of macroscopic variables that uniquely identifies a given steady state.

in ref. [26], we assessed the force moment tensor Σ as a good candidate to complete v and n in describing a stationary state. this had been suggested by some theoretical speculations [12]. however, some authors prefer to directly replace v by Σ. below, we will show that both v and Σ are required for the equilibrium states generated in our simulations. moreover, we will show that only one of the invariants of Σ is necessary (at least in our 2d systems), and that the fluctuations of these variables suggest that no other extra macroscopic parameter may be required.

vi. stress tensor

before we focus on the force moment tensor, we will consider the stress tensor, σ, in order to understand the phenomenology of the force distribution in our tapped granular beds. we recall that σ and Σ are simply related through Σ ≡ v σ. however, we have to bear in mind that v is not a simple constant, since the volume of the system depends, in a nontrivial way, on the shape and intensity of the excitation.

in fig. 8, we show the components of σ as a function of γ for different ν. as a reference, we show results of our simulations for a frictionless system. in a frictionless system the shear vanishes, and σyy is determined only by the weight of the sample, since the janssen effect is not present. as we can see, the frictionless sample presents a constant value of σyy for all γ. for low γ, the frictional samples display values of σyy below the frictionless reference. this is a consequence of the janssen effect, since part of the weight of the sample is supported by wall friction.
consequently, in this region σxy is also positive [see fig. 8(b)]. however, for each ν there is a critical value of γ, γshear=0, beyond which the sample presents an apparent weight above the weight of the packing. in correspondence with this, σxy changes sign and becomes negative. this indicates that, for γ > γshear=0, the frictional walls are not supporting any weight; rather, they prevent the packing from expanding by exerting a downward frictional force. as γ is increased beyond γshear=0, the packing tends to store most of its stress in the horizontal direction (σxx), while σyy eventually saturates. for very intense pulses, the sample expands and lifts off significantly during the tap. when the bed falls back, it creates a very compressed structure with most of the stress transmitted in the lateral directions and the wall friction sustaining the system downwards. it is worth mentioning that γshear=0 is always higher than γmin (see fig. 6).

figure 8: (a) diagonal components of the stress tensor, σ, as a function of γ for different frequencies ν of the tap pulse (the upper set of curves corresponds to σyy and the lower set to σxx). (b) off-diagonal component of the stress tensor, σxy. the horizontal line corresponds to σyy [in panel (a)] and to σxy [in panel (b)] from simulations of a packing of frictionless disks.

vii. the force moment tensor master curve

since the stress is not an extensive parameter, the force moment tensor is generally used to characterize the macroscopic state [12, 13]. therefore, we will use Σ in the rest of the paper. let us simply remark that, since v presents a non-monotonic response to γ, the curves in fig. 8 have a somewhat different shape if Σ is plotted instead of σ. in particular, Σyy does not display a minimum at low γ, like the one observed for σyy in frictional disks, but a monotonic increase.

in fig. 9, we show the trace of Σ as a function of γ. there is a clear monotonic increase of tr(Σ) as γ is increased. moreover, for a given γ, if the frequency of the excitation pulse is increased, a significant reduction in the force moment tensor is observed.

figure 9: trace, tr(Σ), of the force moment tensor as a function of γ for different frequencies ν of the tap pulse.

in fig. 10, we plot the components of Σ as a function of its trace for all the steady states generated. we can see that all data for different γ and ν collapse onto three master curves. this indicates that if two equilibrium states present the same tr(Σ), all the components of Σ are also equal. here, we point out a relevant piece of information that will be discussed in the next section: two states may present equal force moment tensors but differ in volume. this means that many points collapsing in fig. 10 correspond to states of different φ. therefore, at equilibrium, irrespective of the structure of the sample, two states with the same trace of Σ will present equal Σ.

in a liquid at equilibrium, the stress tensor is diagonal and all elements along the diagonal are equal.
this hydrostatic property allows us to know the full stress tensor if we only know the hydrostatic pressure (i.e., if we only know the trace of the tensor). in our granular samples, the force moment tensor can likewise be known if its trace is known. however, the shape of the tensor in static packings under gravity is defined by the three master curves of fig. 10.

figure 10: (a) diagonal components of the force moment tensor, Σ, as a function of tr(Σ) for different frequencies of the tap pulse (the upper set of curves corresponds to Σyy and the lower to Σxx). (b) off-diagonal component of the force moment tensor, Σxy.

to our knowledge, there is no previous speculation that this property must hold for static granular packings. a more detailed study of the extent of this commonality of the shape of the force moment tensor will be pursued in a future paper [28]. however, we show some suggestive preliminary results below. in fig. 11, we show the components of Σ as a function of tr(Σ) for a range of samples of different materials, for different tapping intensities, and for different tapping frequencies. as we can see, there is a reasonable collapse of the data onto the same three master curves shown in fig. 10. this is an indication that these master curves may be universal and enclose a rather fundamental underlying property (inaccessible to us at this point) of static granular beds.

figure 11: (a) diagonal components of the force moment tensor, Σ, as a function of tr(Σ) for different frequencies of the tap pulse, different friction coefficients and different restitution (the upper set of curves corresponds to Σyy and the lower to Σxx). (b) off-diagonal component of the force moment tensor, Σxy. the dashed lines are only a guide to the eye.
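the collapse onto master curves can be quantified directly from the per-state averages; the sketch below (our own naming, with an arbitrary number of bins) groups the steady states by tr(Σ) and reports the spread of each component inside a bin, which should be small if states of equal trace indeed share all components:

```python
import numpy as np

def master_curve_spread(trace, sxx, syy, sxy, nbins=20):
    """bin the steady states by tr(Sigma) and return, for each component,
    the standard deviation inside the populated bins."""
    trace = np.asarray(trace)
    edges = np.linspace(trace.min(), trace.max(), nbins + 1)
    idx = np.clip(np.digitize(trace, edges) - 1, 0, nbins - 1)
    spreads = {}
    for name, comp in (('sxx', np.asarray(sxx)),
                       ('syy', np.asarray(syy)),
                       ('sxy', np.asarray(sxy))):
        spreads[name] = [comp[idx == b].std() for b in range(nbins) if (idx == b).any()]
    return spreads
```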
viii. force moment tensor fluctuations

since we have shown in the previous section that only the trace of Σ suffices (along with the master curves) to describe the full force moment tensor, we will now focus on this invariant and its fluctuations. in fig. 12, the fluctuations of tr(Σ) are plotted as a function of γ. we obtain a single minimum, in contrast with the minimum and maximum observed in ∆φ. interestingly, the states with minimum ∆tr(Σ) correspond to the states where the minima of φ and ∆φ are reached for each ν. however, unlike ∆φ, the depth of the minimum in ∆tr(Σ) is fairly independent of ν. it is unclear why the force moment fluctuations should present a minimum. provided that the minimum of ∆tr(Σ) coincides with the minimum of ∆φ, it can be speculated that a reduced number of geometric configurations can accommodate only a limited number of force configurations. we have seen that all individual components of Σ present the same minimum in their fluctuations; however, the actual values of ∆tr(Σ) are dominated by ∆Σxx, which takes values five times larger than ∆Σyy.

figure 12: (a) fluctuations of the trace of the force moment tensor as a function of γ for different frequencies of the tap pulse. (b) fluctuations of the trace of the force moment tensor as a function of tr(Σ).

if we plot ∆tr(Σ) in terms of the average value of tr(Σ) [see fig. 12(b)], we can see that the curves collapse on top of each other over a wide range of tr(Σ). however, some deviations are apparent at very low and very high forces. although the fair collapse of the curves suggests that states of equal Σ may correspond to the same equilibrium states, we will see in the next section that many of these equilibrium states are distinguishable through the volume. the inflection observed at high tr(Σ) corresponds to the change of regime observed in fig. 8(a), where the vertical stress saturates and most of the contact forces are directed along the x direction.

ix. the thermodynamic phase space

as we have suggested, the mean volume of static granular samples is not sufficient to describe the equilibrium state, since states of equal φ may present distinct fluctuations. on the other hand, the force moment tensor might seem to serve as a standalone descriptor, since states of equal Σ do generally present the same Σ fluctuations; however, states of equal Σ may present different volumes. in fig. 13, we plot the loci of the equilibrium states generated in our simulations in a hypothetical φ–Σ thermodynamic phase space. as we can see, states of equal v but different Σ are obtained, as well as states of equal Σ and different v.

figure 13: phase space φ–tr(Σ). loci visited in the simulations for different frequencies of the tap pulse.

we can now ask whether these two state variables suffice to fully describe the equilibrium states. a hint that this may be the case is given by the fact that states generated with different γ and ν, but corresponding to the same state in the φ–Σ plot, display the same fluctuations of these variables. in fig. 14, we have highlighted some pairs of neighboring states. we can see that such states also present similar fluctuations [fig. 14(b)]. in contrast, states which are distant in the φ–Σ plot present distinct fluctuations even if they correspond to an equivalent mean volume or an equivalent mean force moment tensor (see the states joined by solid lines in fig. 14).

figure 14: (a) phase space φ–tr(Σ). (b) fluctuations of the state variables. selected neighboring states are colored in pairs for comparison. distant states of equal v or tr(Σ) are joined by thick lines.
let us point out here that the sole coincidence of the fluctuations of the macroscopic variables is not a rigorous proof that the chosen set of variables is a complete set of thermodynamic parameters. future explorations of these systems may confirm or disprove that n, v and tr(Σ) constitute a full set of macroscopic, extensive variables able to describe all equilibrium states. meanwhile, it is clear that for moderate tapping intensities, around which the minimum in φ is observed, the approximation of a simple nv or nΣ ensemble is not warranted, in view of the large discrepancies between the curves generated with different pulse frequencies in fig. 13.

x. concluding remarks

we have studied steady states of mechanically stable granular samples driven by tap-like excitations. we have varied the external excitation by changing both the pulse amplitude and the pulse duration. we have considered the macroscopic extensive variables v (volume) and Σ (force moment tensor), and their fluctuations. from the results, we can draw the following conclusions:

• there seems to be a rather robust set of master curves for Σαβ, which implies that knowledge of tr(Σ) suffices to infer the other components of the force moment tensor.
• the equilibrium states cannot be described only by v or Σ, apart from the number of particles n.
• the equilibrium states seem to be well described by the set n, v, tr(Σ).

there are a number of points to be considered in view of these findings. here, we mention a few that may serve as starting points for future directions of research:

• what is the extent to which the Σ master curves are applicable? does this depend on the dimensionality of the system, the excitation procedure, the chosen contact force law, etc.?
• what is the dynamics during a single pulse that leads to the appearance of the φ minimum? is this minimum present in states generated with other types of pulses, like fluidization or shear? the ubiquity of this minimum in simulation models [15, 20] suggests that it might be found under numerous conditions.
• are the fluctuations shown in figs. 7 and 12 the definitive phenomenological equations of state? other authors have so far found monotonic density fluctuations [27] or concave-up density fluctuations [6].
• how much of the φ–tr(Σ) plane can be explored by changing material properties?
• are there other excitation protocols (such as shearing) that may give rise to steady states that are thermodynamically equivalent to the ones obtained by tapping?
• is it possible to construct a phenomenological entropy function from the equations of state (figs. 7 and 12) by simple integration of a gibbs–duhem-like equation? let us bear in mind that fig. 7 is multiply valued.

it is worth stressing that if two ensembles generated by arbitrary excitation protocols (such as tapping or shearing) happen to present the same mean values (and fluctuations) for all macroscopic variables, then such macroscopic states should be considered thermodynamically identical. however, it may be the case that a given protocol produces a narrow range of macroscopic states that can eventually be described with a reduced set of macroscopic variables.

acknowledgements

lap acknowledges discussions with massimo pica ciamarra. jd acknowledges a scholarship of the fpi program from the ministerio de ciencia e innovación (spain). this work has been financially supported by conicet (argentina), anpcyt (argentina), project no. fis2008-06034-c02-01 (spain) and piuna (univ. navarra).
fis2008-06034-c02-01 (spain) and piuna (univ. navarra). [1] h b callen, thermodynamics and an introduction to thermostatistics, 2nd ed., wileyvch, new york (1985). [2] r k pathria, statistical mechanics, 2nd ed., butterworth-heinemann, oxford (1996). [3] e r nowak, j b knight, m l povinelli, h m jeager, s r nagel, reversibility and irreversibility in the packing of vibrated granular material, powder tech. 94, 79 (1997). [4] p richard, m nicodemi, r delannay, p ribire, d bideau, slow relaxation and compaction of granular systems, nature mater. 4, 121 (2005). [5] ph ribière, p richard, p philippe, d bideau, r delannay, on the existence of stationary states during granular compaction, eur. phys. j. e 22, 249 (2007). 030004-12 papers in physics, vol. 3, art. 030004 (2011) / l. a. pugnaloni et al. [6] m schröter, d i goldman, h l swinney, stationary state volume fluctuations in a granular medium, phys. rev. e 71, 030301(r) (2005). [7] s f edwards, r b s oakeshott, theory of powders, physica a 157, 1080 (1989). [8] j h snoeijer, t j h vlugt, w g ellenbroek, m van hecke, j m j van leeuwen, ensemble theory for force networks in hyperstatic granular matter, phys. rev. e. 70, 061306 (2004). [9] s f edwards, the full canonical ensemble of a granular system, physica a 353, 114 (2005). [10] s henkes, c s ohern, b chakraborty, entropy and temperature of a static granular assembly: an ab initio approach, phys. rev. lett. 99, 038002 (2007). [11] s henkes, b chakraborty, stress correlations in granular materials: an entropic formulation, phys. rev. e. 79, 061301 (2009). [12] r blumenfeld, s f edwards, on granular stress statistics: compactivity, angoricity, and some open issues, j. phys. chem. b 113, 3981 (2009). [13] b p tighe, a r t van eerd, t j h vlugt, entropy maximization in the force network ensemble for granular solids, phys. rev. lett. 100, 238001 (2008). [14] r arévalo, d maza, l a pugnaloni, identication of arches in two-dimensional granular packings, phys. rev. e 74, 021303 (2006). [15] l a pugnaloni, m mizrahi, c m carlevaro, f vericat, nonmonotonic reversible branch in four model granular beds subjected to vertical vibration, phys. rev. e 78, 051305 (2008). [16] j schäfer, s dippel, d e wolf, force schemes in simulations of granular materials, j. phys. i (france) 6, 5 (1996). [17] h hinrichsen, d wolf, the physics of granular media, willey-vch, weinheim (2004). [18] j a dijksman, m van hecke, the role of tap duration for the steady-state density of vibrated granular media, eur. phys. lett. 88, 44001 (2009). [19] f ludewig, s dorbolo, t gilet, n vandewalle, energetic approach for the characterization of taps in granular compaction, eur. phys. lett. 84, 44001 (2008). [20] p a gago, n e bueno, l a pugnaloni, high intensity tapping regime in a frustrated lattice gas model of granular compaction, granular matter 11, 365 (2009). [21] notice however, that this expansion parameter is not based on the properties of the mechanical excitation itself but on the sample response to it. [22] j damas et al., (unpublished) [23] according to callen [1], a system is at equilibrium if thermodynamics holds for such state. [24] notice, that in fig. 1(c) of our previous work [26], the maximum value of γ reported was half of the value shown here. this discrepancy is due to a deficient filtering of the accelerometers used in ref. [26]. the accelerations reported here have been corrected and checked against measurements done with a high speed camera. 
[25] c m carlevaro, l a pugnaloni, steady state of tapped granular polygons, j. stat. mech. p01007 (2011). [26] l a pugnaloni, i sánchez, p a gago, j damas, i zuriguel, d maza, towards a relevant set of state variables to describe static granular packings, phys. rev. e 82, 050301(r) (2010). [27] m pica ciamarra, a coniglio, m nicodemi, thermodynamics and statistical mechanics of dense granular media, phys. rev. lett. 97, 158001 (2006). [28] l. a. pugnaloni, et al., (unpublished). 030004-13 papers in physics, vol. 14, art. 140001 (2022) received: 26 march 2021, accepted: 23 november 2021 edited by: c. brito reviewed by: a. nicolas (ilm, cnrs & université claude bernard lyon, france) licence: creative commons attribution 4.0 doi: https://doi.org/10.4279/pip.140001 www.papersinphysics.org issn 1852-4249 physical distance characterization using pedestrian dynamics simulation d. r. parisi1*, g. a. patterson1�, l. pagni2, a. osimani2, t. bacigalupo2, j. godfrid2, f. m. bergagna2, m. rodriguez brizi2, p. momesso2, f. l. gomez2, j. lozano2, j. m. baader2, i. ribas2, f. p. astiz meyer2, m. di luca2, n. e. barrera2, e. m. keimel álvarez2, m. m. herran oyhanarte2, p. r. pingarilho2, x. zuberbuhler2, f. gorostiaga2 in the present work we study how the number of simulated customers (occupancy) affects social distance in an ideal supermarket, considering realistic typical dimensions and processing times (product selection and checkout). from the simulated trajectories we measure social distance events of less than 2 m, and their duration. among other observables, we define a physical distance coefficient that informs how many events (of a given duration) each agent experiences. i introduction one of the measures widely applied to mitigate the coronavirus disease (covid-19) outbreak is social distancing; that is, maintaining a certain physical distance between people [1]. this distance acts as a physical barrier to droplets released from the nose or mouth of a potentially infected person. when another person is too close, they could breathe in the droplets and become infected. although covid-19 is our current concern, physical distancing could be useful for any contagious disease. we should emphasize that a physical distance of 1-2 m is not sufficient for some other types of transmissionsuch as transmission by aerosols [2, 3] or fomites [3]. moreover, many other important *dparisi@itba.edu.ar �gpatters@itba.edu.ar 1 instituto tecnológico de buenos aires (itba), conicet, lavardén 315 (1437), c.a. de buenos aires, argentina. 2 instituto tecnológico de buenos aires (itba), lavardén 315 (1437), c. a. de buenos aires, argentina. factors, such as good ventilation (for indoor systems) and the use of face masks, are not included in our analysis. recent studies [4, 5] have suggested combining microscopic agent simulation with general diseasetransmission mechanisms. however, because of uncertainties and the complexity of current knowledge for quantifying covid-19 transmission processes, here we will not consider any particular contagion mechanism. we will focus instead on studying the distance between people in an everyday pedestrian facility as an isolated aspect to be integrated in the future by experts considering all mechanisms for any particular disease propagation. additionally, findings have been reported from recent physical distance studies that considered field data from a train station [6] and simulations of bottleneck scenarios [7]. 
one of the key questions we will try to answer is how to describe the physical distance for any given occupation of an establishment. to solve this problem, we must consider the displacements and trajectories of pedestrians while they perform certain tasks, thus the obvious tool to use is pedes140001-1 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. trian simulation. the time evolution of positions of simulated agents can provide not only the relative distance between agents, but also the duration of events in which the recommended social distance is not kept. many industries and shops have been closed in different phases of the covid-19 pandemic. however, grocery shops have to be kept open, and supermarkets in particular. to prevent crowding and to keep some physical distance between customers, the authorities reduced the allowed capacity. different countries’ regulations have adopted social distance requirements between 1 and 2 m [6]. in the present study we will consider a distance of 2 m as the social distance threshold. the main objective of this work is to introduce a methodology for characterizing and analyzing the physical distance between agents. we propose to investigate how the allowed capacity affects the physical distance between shoppers in an ideal supermarket of 448 m2. the results should not be extrapolated directly to other supermarkets or facilities; nevertheless, the methodology could be used with other trajectories based either on simulations or field data obtained from a pedestrian system. ii models in order to simulate the complex environment and the agents’ behavior, the proposed model involves three levels of complexity: operational, tactical, and strategic [8]. i strategic level the most general level of the model consists of a master plan for the agent when it is created. in practical terms, for the present system it gives a list of np products for agents to acquire (a shopping list). each of the np items is chosen at random from a total of mp available products. also, they are identified with a unique target location (xpn) in the supermarket. once the agent is initialized with its shopping list, the strategic level shows the first item on the list to the agent. the agent will move toward it using the lower levels of the model. when the agent reaches the position of the product, it will spend a picking time (tp) choosing and picking up the product, after which the strategic level will present the next item on the list to the agent. when the list of products is complete, the agent must proceed to the least busy supermarket checkout line. it will adopt queuing behavior until it gets to the checkout desk and spends time tco processing its purchase. ii tactical level the function of the tactical level is to present the agent with successive visible targets to guide it to the location of the desired product (xpn) or checkout line. as input the tactical module takes the current agent position (xi(t)) and the position of the current product (xpn) on the list. the output is a temporal target (xv(t)) visible from the current position of the agent. the definition of visibility is that if we take a virtual segment between (xi(t)) and (xv(t)), this segment does not intersect any of the walls or obstacles (shelves). the information delivered by the tactical module is obtained by implementing a squared network connecting all the accessible areas of the simulated layout (see fig. 2). 
for any pair of points within the walkable domain, the corresponding nearest points on the network are found and then the shortest path between these points is computed using the a* algorithm [9]. once the path in the network is defined, the temporary target xv(t) is chosen as the farthest visible point on that path, seen from the current agent position. clearly, xv(t) will change with time, as the position of the agent changes. when the product target is visible from the agent’s position, this is set as the visible target and the network path is no longer considered until a new product should be found. iii operational level for the lowest level describing the agents’ shortrange movements we propose an extended version of the contractile particle model (cpm) [10]. this will provide efficient navigation to prevent potential collisions with other agents and obstacles. the basic model is a first-order model in which particles have continuous variable radii, positions and velocities that change according to certain rules. specifically, the position is updated as 140001-2 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. xi(t + ∆t) = xi(t) + vi∆t , (1) where vi is the desired velocity and xi(t) the position at time t. the radius of the ith particle (ri) is dynamically adjusted between rimin and r i max. when this radius has large values, it represents the personal distance necessary for taking steps, but when it has low values it represents a hard incompressible nucleus that limits maximum densities. when particles are not in contact, the desired velocity vi points toward the visible target with a magnitude proportional to its radius, vi = eit v , (2) where the direction eit and the magnitude v are defined by the following equations: eit = (xv − xi) |(xv − xi)| , (3) v = vd[ (r −rmin) (rmax −rmin) ] , (4) where vd is the desired speed. while the radius has not reached the maximum rmax, it increases at each time step, following ∆r = rmax ( τ ∆t ) . (5) τ being a characteristic time at which the agent reaches its desired speed as if it was free, and ∆t is the simulation time step of eq. (1). when two particles come into contact (dij = |xi − xj| − (ri + rj) < 0) both radii collapse instantaneously to the minimum values, while an escape velocity moves the particles in directions that will separate the overlap: eij = (xi − xj) |xi − xj| . (6) the escape velocity has the magnitude of the free speed and can thus be written as vie = vd e ij. this velocity is only applied during one simulation step because, as the radii collapse simultaneously, the agents no longer overlap. so far we have described the basic cpm as it appears in ref. [10]. this model satisfactorily describes experimental data of specific flow rates and fundamental diagrams of pedestrian dynamics. however, particles do not anticipate any collisions, and this capacity is a fundamental requirement for simulating the ideal supermarket (displaying low and medium densities, and agents circulating in different directions). we therefore propose extending the calculation of agent velocity (eq. (2)) by considering a simple avoidance mechanism. the general idea is that the self-propelled particle will produce an action only by changing its desired velocity vi(t), as stated in ref. [11]. in this case, any change in the direction of desired velocity v through the new mechanism will depend on the neighbor particles and obstacles. 
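before turning to that modification, the basic cpm update of eqs. (1)-(6) can be sketched for a single agent. this is a minimal sketch under the stated rules; the data layout is an assumption, and the radius growth rate in eq. (5) is read as rmax ∆t/τ (growth toward rmax on the time scale τ), since the printed fraction is ambiguous.

```python
import numpy as np

def cpm_step(x, r, x_target, x_others, r_others, params, dt):
    """One update of the basic CPM (eqs. (1)-(6)) for a single agent.
    x, x_target: 2D positions; x_others, r_others: positions/radii of the
    other agents as (n, 2) and (n,) arrays."""
    rmin, rmax, vd, tau = params["rmin"], params["rmax"], params["vd"], params["tau"]

    # contact check: d_ij = |x_i - x_j| - (r_i + r_j) < 0
    d = np.linalg.norm(x_others - x, axis=1) - (r_others + r)
    in_contact = np.any(d < 0)

    if in_contact:
        j = int(np.argmin(d))
        e_ij = (x - x_others[j]) / np.linalg.norm(x - x_others[j])   # eq. (6)
        v = vd * e_ij                  # escape velocity, applied for one step only
        r = rmin                       # radius collapses to its minimum value
    else:
        e_it = (x_target - x) / np.linalg.norm(x_target - x)         # eq. (3)
        speed = vd * (r - rmin) / (rmax - rmin)                      # eq. (4)
        v = speed * e_it                                             # eq. (2)
        r = min(rmax, r + rmax * dt / tau)   # eq. (5), dt/tau ordering assumed

    x = x + v * dt                     # eq. (1)
    return x, r
```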
first, the collision vector (nc i) is calculated as nc i = eij ap e −dij/bp cos(θj) + eik aw e −dij/bw cos(θk) + η̂ , (7) where j indicates the nearest visible neighbor, k the nearest point of the nearest visible wall or obstacle, and η̂ is a noise term for breaking possible symmetric situations. then the avoidance direction is obtained from eia = (nc i + eit) |(nci + eit)| , (8) and finally, the velocity of the particle to be used in eq. (1), if particles are not in contact, is vi = v eia. (9) in fig. 1 the vectors associated with the original and modified model can be seen in detail. for the sake of comparison with force-based models, we also implement other operational models: the social force model [12, 13] and the predictive collision avoidance (pca) model [14]. the results for all three operational models are compared for selected observables, while the deeper study is performed using the rule-based model (cpm). a states of agents because the agents must perform different tasks, more complex than just going from one point to 140001-3 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. + eit xj xi eia θj xv nc rmax ve rmin r ve r(t-δt) r=rmax v v dij a b c visibility field of agent "i" figure 1: contractile particle model. a: two particles without contact. b: the radii of two particles that overlapped in the previous time step (dashed circles) collapse, and the particles take the escape velocity. a and b correspond to the original cpm. c: modification considering an avoidance direction. another, it was necessary to define five behavioral states. this was achieved by setting different model parameters and movement patterns. more concisely, the five behavioral states of agents were: going : this is the normal walking behavior when going from one arbitrary point to another with the standard velocity and model parameters. only in this state does the agent use the modified cpm velocity (eq. (9)) to avoid potential collisions. the other behavioral states use only the basic cpm (eqs. (1) to (6)). approaching : when the agent is closer than 2 m to the current product, it reduces its desired speed and, because of how parameters are set, it will not be forced to reach it if there is another agent buying a product in the same target xpn. picking : once the agent reaches the product (closer than 0.1 m) a timer starts and it will remain in the same position (eq. (1) does not update its position) until the picking time (tp) is up. leaving : after spending time (tp), the agent leaves the current location and goes to the next product on the list. while abandoning this position it could find other waiting agents (in approaching behavioral state), so its parameters must be such that it can make its way through. once the agent is farther than 2 m from the last product, it changes to the ”going” behavioral state. queuing : finally, when the agent completes its shopping list it proceeds to the checkout desks by choosing the one with the shortest line. it waits at a distance of 1.5 m from the previous queuing agent, and when it reaches the checkout position it remains there for tco time. by considering these behavioral states in the agent model, the conflicts and deadlock situations are minimized. this model improvement thus enables us to simulate higher densities than with the basic operational models. iii simulations the 448 m2 site of the ideal supermarket to be simulated is shown in fig. 2. 
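the avoidance mechanism of eqs. (7)-(9) used in the "going" state can be sketched in the same spirit. the angle convention (θj, θk measured between the target direction and the directions toward the neighbour and the wall point) is our reading of fig. 1, the parameters ap, bp are taken to be the aa, ba quoted below for the "going" state, and the use of dij in both exponentials follows the printed eq. (7); all of these are assumptions of the sketch.

```python
import numpy as np

def avoidance_velocity(x_i, x_target, x_j, x_k, r_i, r_j, speed, prm, rng):
    """Modified CPM desired velocity (eqs. (7)-(9)).
    x_j: nearest visible neighbour, x_k: nearest point of the nearest wall."""
    e_it = (x_target - x_i) / np.linalg.norm(x_target - x_i)

    e_ij = (x_i - x_j) / np.linalg.norm(x_i - x_j)       # repulsion from the neighbour
    e_ik = (x_i - x_k) / np.linalg.norm(x_i - x_k)       # repulsion from the wall point
    d_ij = np.linalg.norm(x_i - x_j) - (r_i + r_j)       # border-to-border distance

    cos_tj = float(np.dot(e_it, -e_ij))                  # neighbour straight ahead -> cos ~ 1
    cos_tk = float(np.dot(e_it, -e_ik))

    eta = rng.uniform(-0.1, 0.1, size=2)                 # symmetry-breaking noise
    n_c = (prm["ap"] * np.exp(-d_ij / prm["bp"]) * cos_tj * e_ij
           + prm["aw"] * np.exp(-d_ij / prm["bw"]) * cos_tk * e_ik
           + eta)                                         # eq. (7), as printed

    e_ia = (n_c + e_it) / np.linalg.norm(n_c + e_it)      # eq. (8)
    return speed * e_ia                                   # eq. (9)
```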
the dimensions of shelf (1 m x 10 m) and aisle width (2 m) are taken a 0 5 10 15 x (m) 0 5 10 15 20 25 30 y (m ) b 0 5 10 15 x (m) 0 5 10 15 20 25 30 y (m ) a b products network of paths shelves walls agent generator checkout points references: figure 2: the ideal supermarket layout. a: only walls and obstacles. b: the other model components as described in section ii. 140001-4 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. table 1: parameters of the cpm operational model for all the behavioral states. behavioral state going approaching picking leaving queuing rmin (m) 0.1 0.1 0.2 0.1 0.1 rmax (m) 0.37 0.35 0.2 0.3 0.12 vd (m/s) 0.7 0.5 0 0.9 0 or 0.5 from typical real systems. the different processing times and other data considered were provided by an argentine supermarket chain. we define n as the allowed capacity or the occupation of the supermarket; i.e., the total number of agents buying simultaneously inside the system. this is the most important input to be varied in our study and it ranges from n = 2 to n = 92. during the pandemic social groups are not allowed to enter commercial buildings, so we focus our study on single agents. during the first wave of the pandemic there were long queues outside supermarkets, caused by capacity limitations, fear of shortages, and limited hours of operation. we therefore assume that outside the shop there is an infinite queue of clients who enter in order as the occupancy limit allows. the agent generator produces an inflow of 1 agent every 5 s until it reaches the n value for the simulation. from that moment on, the agent generator monitors occupation, generating a new agent every time an existing agent completes its tasks and is removed from the simulation. by doing this, the value of n is maintained constant over the entire simulation. every agent created by the generator is equipped with a shopping list of exactly np = 15 items that, for simplicity, are chosen randomly from a total of 228 available items (shown in fig. 2b). the corresponding product locations (xpn) are separated by one meter from adjacent locations. agents visiting the products on their lists spend a picking time with a uniform distribution ((tp) ∈ [60s, 90s]). after completing the lists, agents choose the shortest queue to one of the eight checkout points shown in fig. 2b. the ideal supermarket has a maximum of four queues, each leading to two checkout desks. one of the strategies adopted in the supermarkets of argentina was delimitation of the positions on the floor to guarantee the minimum physical distance (1.5 m) while queuing for checkouts. the first positions in these queues are at a distance of 3 m (at y = 4 m, in fig. 2) from the checkout points. once an agent reaches the cashier (at y = 1 m, in fig. 2) it spends a checkout time tco uniformly distributed between tco ∈ [120 s, 240 s]. for each value of n we simulated 2 h (7200 s) and recorded the state of the system every ∆t2 = 0.5 s, thus producing 14400 data files with agents’ positions, velocity, and behavioral state. the simulation time step ∆t used in eq. (1) for all simulations was ∆t = 0.05 s. the noise term in eq. (7) is a random vector, whose components ηx and ηy are uniformly distributed in the range ηx = ηy = [−0.1 m/s, 0.1 m/s]. and the relaxation time τ is set to τ = 0.5 s. the remaining model parameters depend on the behavioral state of the agent. for the case of ”going”, the parameters of the avoidance mechanism described in eq. 
(7) are aa = 1.25, ba = 1.25 m, aw = 15 and bw = 0.15 m. the other behavioral states implement only the original cpm (without the avoidance mechanism) with the parameters displayed in table 1. iv results i general aspects we first show general results of the simulated supermarket by displaying typical trajectories (fig. 3) and density fields (fig. 4). figure 3 plots ten randomly chosen trajectories in the second hour of simulations for the selected n values. qualitatively, more intricate trajectory patterns can be seen as occupancy increases. however, in all cases it can be observed that the available area is uniformly visited by simulated agents while selecting the products on their list. complementary information is shown in fig. 4, where density is averaged overl the entire simulation time (2 h). as expected, greater occupancy 140001-5 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. a b c d 0 5 10 0 5 10 15 20 25 30 0 5 10 0 5 10 15 20 25 30 0 5 10 0 5 10 15 20 25 30 0 5 10 0 5 10 15 20 25 30 figure 3: ten random trajectories were chosen for different occupancies. a: n = 14, b: n = 35, c: n = 62, d: n = 92. presents higher mean density values. moreover, these density fields present higher values at the spots where agents stay longer, thus revealing product selection points and predefined queuing places. also, as a macroscopic observable of the system, we study the number of agents that could be processed (i.e., complete the shopping list and exit the supermarket within the two hours simulated) and the mean residence time for those agents. these results are presented in fig. 5. as can be observed, both quantities increase monotonically with the allowed occupancy for the studied range of values and the supermarket setup, considering eight checkout desks. even though the agents purchase the same number of items, the trajectories generated present great variability in residence times. a b c d 0 5 10 0 5 10 15 20 25 30 -2 -1.5 -1 -0.5 0 0 5 10 0 5 10 15 20 25 30 -2 -1.5 -1 -0.5 0 0 5 10 0 5 10 15 20 25 30 -2 -1.5 -1 -0.5 0 0 5 10 0 5 10 15 20 25 30 -2 -1.5 -1 -0.5 0100 10-1 10-2 m-2s-1 figure 4: density maps averaged over the 2 h simulation time for different occupancies. a: n = 14, b: n = 35, c: n = 62, d: n = 92. furthermore, it can be seen that different operational models display similar observables. the sfm [12, 13] and pca [12, 14] models are forcebased models that present more limitations in terms of the maximum density they can simulate before forces are balanced (generating deadlocks) for the complex scenarios and behavior considered. this is why the maximum occupancy studied with these models is lower than that simulated with the cpm described in section ii. ii distance analysis in this subsection we characterize the distance between agents during simulations with the modified cpm for different allowed capacities. an interesting outcome is the distance to the first neighbor for each agent shown in fig. 6. the probability density function (pdf) of firstneighbor distances (dfn) shows that for lower occupancy of the simulated supermarket, the probability of having the first neighbors further away than dfn ∼ 5 m is greater. on the other hand, higher occupancy values generate higher probabilities of having a distance of less than 5 m. in particular, all distributions show a maximum probable value around dfn ∼ 4 m. moreover, the height of these probability peaks decreases for lower occupancy values. 
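the probability density function of fig. 6 can be obtained directly from the recorded positions. a minimal sketch, assuming each snapshot is stored as an (n, 2) array of coordinates:

```python
import numpy as np

def first_neighbor_distances(positions):
    """Distance from each agent to its nearest neighbour in one snapshot."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)        # exclude the self-distance
    return dist.min(axis=1)

def first_neighbor_pdf(snapshots, bins=np.linspace(0.0, 30.0, 61)):
    """Histogram of first-neighbour distances over all recorded snapshots,
    normalized to a probability density (as in fig. 6)."""
    d_fn = np.concatenate([first_neighbor_distances(p) for p in snapshots])
    pdf, edges = np.histogram(d_fn, bins=bins, density=True)
    return 0.5 * (edges[1:] + edges[:-1]), pdf

# hypothetical usage: `snapshots` is the list of (n, 2) position arrays
# recorded every 0.5 s during the simulation
```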
now we take the physical distance threshold of 2 m, as discussed in section i, and calculate the related probabilities of agents below this critical 140001-6 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. 0 20 40 60 80 100 allowed capacity 1400 1600 1800 2000 2200 2400 2600 m ea n r es id en ce t im e (s ) cpm pca sfm data1 data2 data3 0 20 40 60 80 100 allowed capacity 0 20 40 60 80 100 120 to ta l a ge nt s pr oc es se d pe r h ou r cpm pca sfm a b figure 5: a: mean residence time of agent as a function of occupation, for three different operational models. error bars indicate one standard deviation. b: number of agents processed per hour for the entire two-hour simulations, and also for the different operational models. social distance. the first observable we calculate is the probability of the first neighbor being closer than 2 m (pfn<2m). in other words, this is the probability of having at least one neighboring agent within 2 m. this is determined by averaging the data recorded every ∆t2 = 0.5 s, from minute 20 to 120 as shown in eq. (10) pfn<2m = 1 nti ti=14400∑ ti=2400 nfn2m n , (10) 0 5 10 15 20 25 30 distance to first neighbor (m) 0 0.05 0.1 0.15 0.2 0.25 0.3 p d f n=14 n=35 n=62 n=92 figure 6: probability density function of first neighbor distances. where nti = 12000 = 14400 − 2400 is the data at recorded times after 20 min, n is the occupancy and nfn2m is the number of particles having a first neighbor at less than 2 m. note that if two particles i and j are the only particles at less than 2 m, nfn2m = 2. moreover, when j is the first neighbor of i, i will not necessarily be the first neighbor of j. the above probability (pfn<2m) only considers whether the first neighbor is closer than 2 m; it does not consider whether there are many occurrences of neighbors at less than 2 m. for this reason we now take into account the probability that a given pair of agents are within 2 m of one another (ppair<2m) ppair<2m = 1 nti ti=14400∑ ti=2400 np2m [n (n − 1)]/2 , (11) where np2m is the number of pairs of particles at a distance closer than 2 m and [n (n − 1)]/2 is the total number of possible pairs having n particles in the system. in this case, if only particles i and j are closer than 2 m, np2m = 1 because one pair is counted. in fig. 7 both probabilities (pfn<2m and ppair<2m) are displayed for the modified cpm and also for comparison with the sfm and the pca model. it can be seen that the probability of having the nearest neighbor at less than 2 m increases monotonically with the allowed capacity. however, pair probability quickly increases for low oc140001-7 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. 0 20 40 60 80 100 allowed capacity 0 0.2 0.4 0.6 0.8 1 p ro ba bi lit y 10-3 cpm sfm pca 0 20 40 60 80 100 allowed capacity 0 0.1 0.2 0.3 0.4 0.5 0.6 p ro ba bi lit y cpm sfm pca a b figure 7: a: probability of having the first neighbor closer than 2 m (eq. (10)). b: probability that a given pair of agents are within 2 m of one another (eq. (11)). cupancy, and after n ∼ 15 remains almost constant, indicating that the number of pairs np2m scaled with n as the number of total possible pairs (∼ n2). furthermore, fig. 7 indicates that different operational models display similar macroscopic behavior in terms of social distance, at least for values below or above 2 m. the above analysis focused on the occurrence of certain distances between simulated agents, but the duration of these events was not explicitly considered. 
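both estimators, eqs. (10) and (11), follow from the same recorded snapshots; a minimal sketch, again assuming (n, 2) position arrays for the snapshots taken after minute 20:

```python
import numpy as np

def distance_probabilities(snapshots, threshold=2.0):
    """Estimate p_fn<2m (eq. (10)) and p_pair<2m (eq. (11)) by averaging over
    the recorded snapshots."""
    p_fn, p_pair = [], []
    for pos in snapshots:
        n = len(pos)
        dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
        np.fill_diagonal(dist, np.inf)
        n_fn = np.count_nonzero(dist.min(axis=1) < threshold)  # agents with a neighbour < 2 m
        n_pairs = np.count_nonzero(dist < threshold) // 2      # distinct close pairs
        p_fn.append(n_fn / n)
        p_pair.append(n_pairs / (n * (n - 1) / 2))
    return float(np.mean(p_fn)), float(np.mean(p_pair))
```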
this will be done in the following subsection. iii duration of social distance events here we study the time that events last when pairs of agents are found at less than 2 m (see section i). these events occur mainly when agents are selecting products at neighboring product locations or when queuing at the supermarket checkout. if two particles i and j meet at a given time and then separate by more than 2 m, should the same particles meet up again at a future time this is considered two separate events. considering that: (a) the parameter we choose to maintain constant during each simulation is the allowed capacity n, and this capacity is reached at the beginning of each simulation in a very short time compared to other processes, and (b) all agents have the same number of items on their list, and thus the required time to complete it is similar on average, the first group of n agents will go to the checkout points at nearly the same time, generating high checkout demand and long queues. following this, the new agents will enter slowly as other agents exit the simulation, and thus the described behavior will relax. these dynamics lead to more queuing agents during the first hour of simulation and fewer during the second hour. we therefore analyze separately the duration of encounters occurring during the first and the second simulation hour in fig. 8. the different time scales and the number of cases in both panels confirm that the first hour is dominated by particularly long queues waiting to check out, while in the second hour (fig. 8b) social distance events of less than 2 m are dominated by the shorter process: product selection. events in the queuing line are long lasting for two reasons. first, the particular process at the checkout desk takes between 2 and 4 min (rather than the 1 to 1.5 min of the picking process). second, a line with nl agents will make the last agents spend about nl times tco, which for a few agents, namely nl = 5, could represent 20 min waiting time at a distance of 1.5 m from another agent. this problem of high exposure time between pairs of agents in queuing lines could be avoided if a slower rate of inflow of agents was adopted at the start of the process, let us say something above the maximum average outflow of the system (eight agents in three minutes, i.e., ∼ 1 agent every 23 s). we did not adopt this in the simulations because it would take too long for simulations to reach the 140001-8 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. 0 20 40 60 80 100 allowed capacity 0 50 100 150 200 250 fr eq ue nc y (1 /h ) t e > 3 ‘ t e > 5 ‘ t e > 10 ‘ t e > 15 ‘ a 0 20 40 60 80 100 allowed capacity 0 100 200 300 400 500 fr eq ue nc y (1 /h ) t e > 1.5 ‘ t e > 2.0 ‘ t e > 2.5 ‘ t e > 3.0 ‘ b figure 8: a: number of events recorded in the first hour of simulations where two agents are at a distance of less than 2 m for more than te min. b: the same measurement as a but for the second hour of the simulations. desired occupation n. however, it is clear that the problem noted above at the beginning could be solved in a real operation by allowing a low flow rate of agents at opening time (of about twice the capacity of the checkout). also, this transient behavior would represent a problem only at opening time, most of the daily operation being as described in our second simulation hour. furthermore, fig. 8 shows that, as expected, fewer social distance events occur when the time thresholds increase. 
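the event durations analyzed in fig. 8 can be extracted from the snapshots recorded every ∆t2 = 0.5 s. a minimal sketch, assuming persistent agent identifiers are stored with each snapshot (an assumption about the data layout, not something specified in the text):

```python
import numpy as np
from itertools import combinations

def close_pair_events(snapshots, ids, dt2=0.5, threshold=2.0):
    """Durations (s) of all events during which a pair of agents stays closer
    than `threshold`. snapshots[t] is an (n_t, 2) position array and ids[t]
    the matching list of persistent agent ids. A pair that separates and meets
    again produces two separate events."""
    open_events = {}      # pair -> accumulated duration of the ongoing event
    durations = []
    for pos, id_list in zip(snapshots, ids):
        index = {aid: k for k, aid in enumerate(id_list)}
        close_now = set()
        for a, b in combinations(sorted(index), 2):
            if np.linalg.norm(pos[index[a]] - pos[index[b]]) < threshold:
                close_now.add((a, b))
        for pair in close_now:                       # extend or open events
            open_events[pair] = open_events.get(pair, 0.0) + dt2
        for pair in list(open_events):               # close events whose pair separated
            if pair not in close_now:
                durations.append(open_events.pop(pair))
    durations.extend(open_events.values())           # events still open at the end
    return durations
```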
and in all cases, the number of events seems to grow quadratically with n. iv physical distance coefficient now, looking for a criterion that determines what a reasonable allowed capacity in the ideal supermarket would be, we define the physical distance coefficient (δπ(te)) for the threshold distance of 2 m, as δπ(te) = 2 ne(te) np , (12) where te is the minimum duration of a particular physical distance event (rij ≤ 2 m), ne(te) is the number of these events that last at least te, and np is the total number of agents processed by the system in the same period of time in which ne is computed. factor 2 is needed to take into account the number of agents in the numerator, since two agents (i and j) participate in each event. this coefficient enables us to compare the number of agents who have participated in physical distance events of duration greater than te with the number who have passed through the system. thus a value of δπ(te > 2min) = 1 indicates that, on average, each agent has participated in one event involving a physical distance of less than 2 m that lasts at least 2 min. if δπ(te > 2min) < 1, it would indicate that only a fraction of the agents have participated in such events. having established in section iii that the duration of events in the first simulation hour is dominated by the checkout line process, we now concentrate on looking at the second hour of simulation when the impact of these lines is very low and stationary. this situation is representative of the daily operation of the supermarket; this is shown in fig. 9, which displays the physical distance coefficient as a function of occupation for different event duration limits te. first, we note in fig. 9a that the curve corresponding to te > 1 min grows steeply with n. this could be related to the fact that the picking time ranges between 1 min and 1.5 min and that the products are spaced by 1 m, so if two agents aim simultaneously for the same product or the first or second nearest product, they could generate a 2 m physical event lasting at most 1.5 min, and in particular many events lasting more than 1 minute would occur. furthermore, the physical distance coefficient seems to follow a linear relation with n 140001-9 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. 0 20 40 60 80 100 allowed capacity 0 5 10 15 20 s oc ia l d is ta nc e c oe ffi ci en t t e > 1 ‘ t e > 1.5 ‘ t e > 2 ‘ t e > 3 ‘ a 0 20 40 60 80 100 allowed capacity 0 0.5 1 1.5 2 2.5 3 3.5 4 s oc ia l d is ta nc e c oe ffi ci en t t e > 1 ‘ t e > 1.5 ‘ t e > 2 ‘ t e > 3 ‘ b figure 9: a: physical distance coefficient as a function of supermarket occupation for the second simulation hour. b: close up of previous figure showing details near δπ ∼ 1. solid lines correspond to the theoretical approach presented in section v. for this particular time limit te. a change of regime can be observed for te > 1.5 min, in which curves are more similar to one another for the different te presented, and they follow a quadratic relation with n. because the maximum picking time is 1.5 min, this is the maximum possible overlapping time for two agents selecting neighboring (or the same) products. longer lasting events will arise when more than two agents are waiting for neighbouring or the same products, as in the case of products near any of the short lines for checking out. the results presented in fig. 9b could be used as a guide for determining allowed occupancy. 
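eq. (12) then reduces to a simple count over the extracted event durations. a minimal sketch; the number of processed agents np for the same time window must be supplied, e.g. from fig. 5b:

```python
def physical_distance_coefficient(durations, n_processed, te):
    """Physical distance coefficient delta_pi(te) of eq. (12): twice the number
    of 2-m events lasting at least `te` seconds, divided by the number of
    agents processed in the same time window."""
    n_e = sum(1 for d in durations if d >= te)
    return 2.0 * n_e / n_processed

# hypothetical usage, with `durations` from the event extraction above and the
# agents processed during the second simulated hour:
# dpi = physical_distance_coefficient(durations, n_processed=80, te=120.0)
```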
if based on epidemiological knowledge or criteria, it was determined that it would be acceptable for all agents to participate once in a 2-m physical event lasting at most 1 min, but then the allowed occupation would be very small, n ∼ 10. alternatively, if events up to 1.5 min were accepted, then the allowed occupation would be n = 40. in the case of te = 2 min, the capacity could rise to n = 70. also, it could be established that even for n = 90 the events of the 2-m physical distance, lasting more than 3 min, would affect only 40% of the processed agents. of course, fig. 9b could be used to find another allowed occupancy if the criterion considered that, for example, only 25% of the agents could participate in the analyzed events. v theoretical derivation of δπ in this subsection we theoretically derive the curves by interpolating the simulation data shown in fig. 9. first we note that there are at least four sources of physical distance events, displaying increasing duration times: � a very short time when two walking agents pass by in an aisle between shelves (∼ 100 s), � a short time when conflicts appear due to lack of space (∼ 101 s), � a longer time when agents are picking products at a neighboring or the same location (∼ 102 s), � a very long time when agents are queuing at neighboring positions in a (long) checkout line (∼ 103 s). because long lines can be avoided by suitable operation parameters, the analysis of δπ in the above section was performed for the second simulated hour when checkout lines are kept to a minimum. thus the longer process is related to agents selecting products at neighboring locations and will dominate the relationship between δπ and occupancy. the goal is to compute eq. (12). we can write the numerator, ne(te), by taking into account the different time thresholds displayed in fig. 9. 140001-10 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. first, we consider the case of events that emerge from the encounter of two agents during a time slot given by the mean picking time t̂p = 75 s. we therefore calculate the average number of pairs of agents that go for the same product and are less than 2 m apart as n2 = 733( 2+mp−1 2 ) n (n − 1) 2 , (13) where mp is the total number of available products, ( 2+mp−1 2 ) is the total number of possible ways of arranging two indistinguishable agents between the mp products, and 733 is the subset of these arrangements of two particles at less than 2 m away. the second factor corresponds to the total number of possible pairs for a given value of n. since the agents do not arrive simultaneously at their respective products, we compute the probability that the encounter of two agents lasts longer than te as p2(te) = ∫ t̂p te dt2∫ t̂p 0 dt2 = 1 − (te/t̂p) , (14) where the denominator is the integral over the possible arrival times t2 of the second agent, and the numerator is the integral over the possible arrival times that meet t̂p > t2 > te. note that in this case the time te will be limited to between 0 < te ≤ t̂p; that is, on average the longest event is limited by the mean picking time t̂p. we then obtain the number of events ne(te > 60 s), counting the number of time slots t̂p within the observation time t , as ne(te > 60 s) = κ60 n2 p2(60 s) t/t̂p . 
(15) in our case t = 3600 s and κ60 is a parameter that will be used to fit the model to the data, and could be interpreted as a correction considering that simultaneous events can occur during the same time slot t̂p, given that this discretization of time is just an approximation. note that t̂p is the average time that customers spend on the collection of products. if this time increases, customers will be immobile for a longer time. for this reason, increasing t̂p decreases the number of encounters in a fixed period t . finally, the denominator of the δπ is the number of processed agents (np) in the same period of time t . considering the picking time at each product, the number of products, the time needed to walk between them, and the waiting time at the checkout desk, a rough estimation of time needed for a free agent to complete its product list (tr) would be between 25 and 30 min, as can be seen for low occupation in fig. 5a. thus, the number of processed agents per hour could be approximated as np ∼ t/tr n ∼ 2 n. however, when occupancy increases, all internal processes become slower and as a consequence the effective proportionality constant between np and n decreases. considering the result displayed in fig. 5b, we approximate the proportionality constant by 1.5 and thus np = 3/2 n. (16) therefore, for events lasting more than 60 s we can write δπ(te > 60 s) = 2 ne(te > 60 s) np = κ60 4 n2 p2(60 s) t/t̂p 3 n ∝ n, (17) therefore, the functional dependence of δπ(60s) on n is linear, in accordance with the data shown in fig. 9. we then consider the case of events emerging from an encounter between three agents. here, we calculate events that last longer than t̂p; this can only occur when three agents go together to the same product. the corresponding time slot for such events is 2 t̂p. in this case, the average number of sets of three agents that go for products that are less than 2 m apart is n3 = mp( 3+mp−1 3 ) n (n − 1) (n − 2) 6 , (18) where the first factor comes from calculating the probability that three indistinguishable agents head towards the same product, and the second factor corresponds to the total number of sets of three agents. only one pair of agents will have the chance 140001-11 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. to produce an event whose duration is longer than t̂p. this pair is made up of the two agents who arrived last, and the probability that the encounter of these agents lasts longer than te is p3(te) = ∫ t̂p te−t̂p ∫t2 te−t̂p dt3dt2∫ t̂p 0 ∫t2 0 dt3dt2 = ( 2 − ( te/t̂p ))2 , (19) with t̂p ≤ te ≤ 2 t̂p. note that the arrival time of the second agent t2 conditions the possible arrival time of the third t3. thus it is possible to calculate the number of events ne(te > 90 s) and ne(te > 120 s) as ne(te > 90 s) = κ90 n3 p3(90 s) t/2t̂p , (20) ne(te > 120 s) = κ120 n3 p3(120 s) t/2t̂p . (21) in these cases the δπ for events lasting longer than 90 and 120 s can be expressed as δπ(te > 90 s) = 2 ne(te > 90 s) np = κ90 4 n3 p3(90 s) t/2t̂p 3 n ∝ n2, and (22) δπ(te > 120 s) = 2 ne(te > 120 s) np = κ120 4 n3 p3(120 s) t/2t̂p 3 n ∝ n2. (23) because sets of three particles are considered in eq. (18), for 90 s and 120 s δπ grows with n 2, also according to the simulated data displayed in fig. 9. finally, we repeat our analysis for the case of events originated by an encounter between four agents. we focus on events that last longer than 2 t̂p; that is, events where the four agents go together to the same product. 
again, the pair of agents who arrived last will have the chance to produce such an event. the average number of sets of four agents is n4 = mp( 4+mp−1 4 ) n (n − 1) (n − 2) (n − 3) 24 , (24) and the probability that the encounter between the latest agents lasts longer than te is p4(te) = ∫ t̂p te−2t̂p ∫t2 te−2t̂p ∫t3 te−2t̂p dt4dt3dt2∫ t̂p 0 ∫t2 0 ∫t3 0 dt4dt3dt2 = ( 3 − ( te/t̂p ))3 , (25) with 2 t̂p ≤ te ≤ 3 t̂p. the calculation for the number of events ne(te > 180 s) is ne(te > 180 s) = κ180 n4 p4(180 s) t/3t̂p , (26) and the δπ for events lasting longer than 180 s is expressed as δπ(te > 180 s) = 2 ne(te > 180 s) np = κ180 4 n4 p4(180 s) t/3t̂p 3 n ∝ n3. (27) also, in this case the functionality dependence of δπ(180 s) seems to be in accordance with simulation results (fig. 9). the scale laws for δπ(te) are determined by the dominant encounter of agents; that is, the encounter that involves the lowest number of agents (which is the most probable event) and lasts longer than te. in fact, for the regime of te > 90 s and te > 120 s, we find the same scaling law, and this is because in these regimes the dominant encounter is that of three agents. we calibrate these simulation data with eqs. (17), (22), (23), and (27) by fitting the values of κ, and hence κ60 = 1.3, κ90 = 1.7, κ120 = 2.4, κ180 = 1.5. the solid lines shown in fig. 9 stand for these results. the values obtained for κ are reasonable in terms of interpretation of the fitting parameter proposed above, and indicate that our analysis is correct in terms of computing and the approximated value for the δπ coefficient independently of the simulations, at least for the simple and idealized system studied. 140001-12 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. v conclusions in this work we investigate and characterize social distancing in an everyday pedestrian system by simulating the dynamics of an ideal supermarket. many sources of complexity were successfully taken into account with a multilevel model, which enables us to simulate not only translation but also more complex behaviors such as waiting times when selecting particular products and queuing at checkout points. the main process that keeps pedestrians close to one another is the queuing lines for checkout. therefore advice for the operation would be to keep these lines as short as possible either by increasing the number of checkout points or by decreasing occupancy. at values greater than 2 m, different operational models display similar macroscopic observables regarding social distance, indicating that the results are robust with respect to microscopic collision avoidance resolution, and also suggesting that the simulated paths of the particles are more influenced by the geometry, shopping list, and time-consuming process than by the particular avoidance mechanism. however, first-order models such as the cpm presented in ref. [10] and section ii.iii seem more suitable for simulation of highly populated scenarios with complex behavioral agents. taking a physical distance threshold of 2 m, the probabilities and duration of such events are studied. the physical distance coefficient (δπ) is defined as an indicator of the fraction of the population passing through the system that is involved in one or many of these events lasting at least a certain time threshold te. we put forward a theoretical analysis that satisfactorily fits the simulation data. 
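the interpolating curves of section v can be evaluated directly from the quantities quoted in the text (228 available products, t̂p = 75 s, t = 3600 s, np ≈ 1.5 n and the fitted κ values). a minimal sketch reproducing eqs. (13)-(27) for the thresholds of fig. 9; for instance, it gives δπ(te > 120 s) ≈ 1 at n = 70 and δπ(te > 180 s) ≈ 0.4 at n = 90, consistent with the values discussed above.

```python
from math import comb

# parameter values quoted in the text: 228 products, mean picking time 75 s,
# one hour of observation, n_p ~ 1.5 n processed agents, fitted kappa values
MP, TP, T = 228, 75.0, 3600.0
KAPPA = {60: 1.3, 90: 1.7, 120: 2.4, 180: 1.5}

def delta_pi_theory(n, te):
    """Theoretical physical distance coefficient for the thresholds of fig. 9
    (te in seconds)."""
    n_p = 1.5 * n                                                     # eq. (16)
    if te == 60:                                 # two-agent encounters
        n2 = 733 / comb(MP + 1, 2) * n * (n - 1) / 2                  # eq. (13)
        p2 = 1.0 - te / TP                                            # eq. (14)
        n_e = KAPPA[te] * n2 * p2 * T / TP                            # eq. (15)
    elif te in (90, 120):                        # three-agent encounters
        n3 = MP / comb(MP + 2, 3) * n * (n - 1) * (n - 2) / 6         # eq. (18)
        p3 = (2.0 - te / TP) ** 2                                     # eq. (19)
        n_e = KAPPA[te] * n3 * p3 * T / (2 * TP)                      # eqs. (20)-(21)
    elif te == 180:                              # four-agent encounters
        n4 = MP / comb(MP + 3, 4) * n * (n - 1) * (n - 2) * (n - 3) / 24  # eq. (24)
        p4 = (3.0 - te / TP) ** 3                                         # eq. (25)
        n_e = KAPPA[te] * n4 * p4 * T / (3 * TP)                          # eq. (26)
    else:
        raise ValueError("only the thresholds fitted in the text are covered")
    return 2.0 * n_e / n_p                       # eqs. (12), (17), (22)-(23), (27)

# e.g. the linear regime of fig. 9a:
# [delta_pi_theory(n, 60) for n in (10, 40, 70, 90)]
```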
it is important to note that applying this analysis requires an estimate of the number of agents processed per unit of time. in this work we use a relationship found from numerical simulations that can in the future be calibrated by empirical data or new models. the same analysis can be carried out for a different set of parameters and for other pedestrian facilities such as other specific supermarkets or different systems (transport, entertainment, etc.). of course, existing facilities can be monitored with measurement methods [6] providing high-quality trajectory data. this kind of data could also be interpreted in terms of the analysis performed in the present work. the analysis presented takes into account only the duration of a given physical distance. as stated in the introduction, this is only a partial aspect of the contagion problem, and thus it must be integrated with other disciplines. for example, if a physical distance, a time threshold, and the fraction of the population that could be exposed to these conditions were determined, then maximum occupancy could be estimated using the observables defined in this work. acknowledgements the authors acknowledge the information and data provided by the argentinean supermarket chain “la anónima”. this work was funded by project pid2015-003 (anpcyt) argentina and project itbacyt-2018-42 and itbacyt-2018-43 (itba) argentina. [1] world health organization, coronavirus disease (covid-19) advice for the public, last updated: 4 june 2020. [2] s l miller, w w nazaroff, j l jimenez, a boerstra, g buonanno, s j dancer, j kurnitski, l c marr, l morawska, c noakes, transmission of sars-cov-2 by inhalation of respiratory aerosol in the skagit valley chorale superspreading event, indoor air 31, 314 (2020). [3] l marr, s miller, c haas, w bahnfleth, r corsi, j tang, h herrmann, k pollitt, and j l jimenez, faqs on protecting yourself from covid-19 aerosol transmission, last updated: 1 october 2020. [4] k rathinakumar, a quaini, a microscopic approach to study the onset of a highly infectious disease spreading, math. biosci. 329, 108475 (2020). [5] t harweg, d bachmann, f weichert, agentbased simulation of pedestrian dynamics for exposure time estimation in epidemic risk assessment, j. public health, 1 (2021). [6] c a s pouw, f toschi, f van schadewijk, a corbetta, monitoring physical distancing for crowd management: real-time trajectory and group analysis, plos one 15, e0240963 (2020). 140001-13 https://www.who.int/emergencies/diseases/novel-coronavirus-2019/advice-for-public https://www.who.int/emergencies/diseases/novel-coronavirus-2019/advice-for-public https://doi.org/10.1111/ina.12751 https://tinyurl.com/faq-aerosols https://tinyurl.com/faq-aerosols https://doi.org/10.1016/j.mbs.2020.108475 https://doi.org/10.1016/j.mbs.2020.108475 https://doi.org/10.1007/s10389-021-01489-y https://doi.org/10.1371/journal.pone.0240963 https://doi.org/10.1371/journal.pone.0240963 papers in physics, vol. 14, art. 140001 (2022) / d. r. parisi et al. [7] c m mayr, g köster, social distancing with the optimal steps model, arxiv preprint, arxiv:2007.01634 (2020). [8] s p hoogendoorn, p h l bovy, pedestrian route-choice and activity scheduling theory and models, transport. res. b: meth. 38, 169 (2004). [9] j yao, c lin, x xie, a j wang, c c hung, path planning for virtual human motion using improved a* star algorithm, in: 2010 seventh int. conf. on information technology: new generations, ieee, pag. 1154, las vegas (nv, usa) (2010). 
[10] g baglietto, d r parisi, continuous-space automaton model for pedestrian dynamics, phys. rev. e 83, 056117 (2011). [11] r martin, d parisi, pedestrian collision avoidance with a local dynamic goal, collective dynamics 5, 324 (2020). [12] d helbing, i farkas, t vicsek, simulating dynamical features of escape panic, nature 407, 487 (2000). [13] a johansson, d helbing, p k shukla, specification of the social force pedestrian model by evolutionary adjustment to video tracking data, adv. complex syst. 10, 271 (2007). [14] i karamouzas, p heil, p van beek, m h overmars, a predictive collision avoidance model for pedestrian simulation, in: motion in games, eds. a egges, r geraerts, m overmars, pag. 41, springer, the netherlands (2009). 140001-14 arxiv:2007.01634 arxiv:2007.01634 https://doi.org/10.1016/s0191-2615(03)00007-9 https://doi.org/10.1016/s0191-2615(03)00007-9 https://doi.org/10.1109/itng.2010.53 https://doi.org/10.1109/itng.2010.53 https://doi.org/10.1109/itng.2010.53 https://doi.org/10.1109/itng.2010.53 https://doi.org/10.1103/physreve.83.056117 https://doi.org/10.1103/physreve.83.056117 http://dx.doi.org/10.17815/cd.2020.66 http://dx.doi.org/10.17815/cd.2020.66 https://doi.org/10.1038/35035023 https://doi.org/10.1038/35035023 https://doi.org/10.1142/s0219525907001355 https://doi.org/10.1007/978-3-642-10347-6_4 https://doi.org/10.1007/978-3-642-10347-6_4 https://doi.org/10.1007/978-3-642-10347-6_4 https://doi.org/10.1007/978-3-642-10347-6_4 introduction models strategic level tactical level operational level states of agents simulations results general aspects distance analysis duration of social distance events physical distance coefficient theoretical derivation of conclusions papers in physics, vol. 4, art. 040004 (2012) received: 17 october 2011, accepted: 18 march 2012 edited by: j. pullin reviewed by: l. freidel, perimeter institute for theoretical physics, waterloo, canada licence: creative commons attribution 3.0 doi: 10.4279/pip.040004 www.papersinphysics.org issn 1852-4249 invited review: the new spin foam models and quantum gravity alejandro perez1∗ in this article, we give a systematic definition of the recently introduced spin foam models for four-dimensional quantum gravity, reviewing the main results on their semiclassical limit on fixed discretizations. i. introduction the quantization of the gravitational interaction is a major open challenge in theoretical physics. this review presents the status of the spin foam approach to the problem. spin foam models are definitions of the path integral formulation of quantum general relativity and are expected to be the covariant counterpart of the background independent canonical quantization of general relativity known as loop quantum gravity [1–3]. this article focuses on the definition of the recently introduced engle-pereira-rovelli-livine (eprl) model [4,5] and the closely related freidelkrasnov (fk) model [6]. an important original feature of the present paper is the explicit derivation of both the riemannian and the lorentzian models, in terms of a notation that exhibits the close relationship between the two, at the algebraic level, that might signal a possible deeper relationship at the level of transition amplitudes. we will take plebanski’s perspective in which general relativity is formulated as a constrained bf theory (for a review introducing the new models from a bottom-up perspective see ref. [7]; for an extended version of the present review including a wide collection of related work see ref. [8]). 
for that reason, it will be convenient to start this review by introducing the exact spin foam quantization of bf in the following section. in section iii, we present the eprl model in both its riemannian and lorentzian versions. a unified treatment of the representation theory of the relevant gauge groups is presented in that section. in section iv, we introduce the fk model and discuss its relationship with the eprl model. in section v, we describe the structure of the boundary states of these models and emphasize the relationship with the kinematical hilbert space of loop quantum gravity. in section vi, we give a compendium of important issues (and associated references) that have been left out but which are important for future development. finally, in section vii, we present the recent encouraging results of the nature of the semiclassical limit of the new models. ∗e-mail: perez@cpt.univ-mrs.fr 1 centre de physique théorique, campus de luminy, 13288 marseille, france. unité mixte de recherche (umr 6207) du cnrs et des universités aix-marseille i, aix-marseille ii, et du sud toulon-var; laboratoire afilié à la frumam (fr 2291). 040004-1 papers in physics, vol. 4, art. 040004 (2012) / a. perez ii. spin foam quantization of bf theory we will start by briefly reviewing the spin foam quantization of bf theory. this section will be the basic building block for the construction of the models of quantum gravity that are dealt with in this article. the key idea is that the quantum transition amplitudes (computed in the path integral representation) of gravity can be obtained by suitably restricting the histories that are summed over in the spin foam representation of exactly solvable bf theory. we describe the nature of these constraints at the end of this section. here, one follows the perspective of ref. [9]. let g be a compact group whose lie algebra g has an invariant inner product, here denoted 〈〉, and m a d-dimensional manifold. classical bf theory is defined by the action s[b, ω] = ∫ m 〈b ∧ f(ω)〉, (1) where b is a g valued (d−2)-form, ω is a connection on a g principal bundle over m. the theory has no local excitations: all the solutions of the equations of motion are locally related by gauge transformations. more precisely, the gauge symmetries of the action are the local g gauge transformations δb = [b, α] , δω = dωα, (2) where α is a g-valued 0-form, and the ‘topological’ gauge transformation δb = dωη, δω = 0, (3) where dω denotes the covariant exterior derivative and η is a g-valued 0-form. the first invariance is manifest in the form of the action, while the second one is a consequence of the bianchi identity, dωf (ω) = 0. the gauge symmetries are so vast that all the solutions to the equations of motion are locally pure gauge. the theory has only global or topological degrees of freedom. for the time being, we assume m to be a compact and orientable manifold. the partition function, z, is formally given by z = ∫ d[b]d[ω] exp(i ∫ m 〈b ∧ f (ω)〉). (4) formally integrating over the b field in (4), we obtain z = ∫ d[ω] δ (f (ω)) . (5) the partition function z corresponds to the ‘volume’ of the space of flat connections on m. in order to give a meaning to the formal expressions above, we replace the d-dimensional manifold m with an arbitrary cellular decomposition ∆. we also need the notion of the associated dual 2complex of ∆ denoted by ∆⋆. 
the dual 2-complex ∆⋆ is a combinatorial object defined by a set of vertices v ∈ ∆⋆ (dual to d-cells in ∆) edges e ∈ ∆⋆ (dual to (d−1)-cells in ∆) and faces f ∈ ∆⋆ (dual to (d−2)-cells in ∆). in the case where ∆ is a simplicial decomposition of m, the structure of both ∆ and ∆⋆ is illustrated in figs. 1, 2 and 3 in two, three, and four dimensions, respectively. figure 1: on the left: a triangulation and its dual in two dimensions. on the right: the dual two complex; faces (shaded polygon) are dual to 0-simplices in 2d. figure 2: on the left: a triangulation and its dual in three dimensions. on the right: the dual two complex; faces (shaded wedge) are dual to 1simplices in 3d. 040004-2 papers in physics, vol. 4, art. 040004 (2012) / a. perez figure 3: on the left: a triangulation and its dual in four dimensions. on the right: the dual two complex; faces (shaded wedge) are dual to triangles in 4d. the shaded triangle dual to the shaded face is exhibited. for simplicity, we concentrate on the case when ∆ is a triangulation. the field b is associated with lie algebra elements bf assigned to faces f ∈ ∆ ⋆. we can think of it as the integral of the (d−2)-form b on the (d−2)-cell dual to the face f ∈ ∆⋆, namely bf = ∫ (d−2)−cell b. (6) in other words, bf can be interpreted as the ‘smearing’ of the continuous (d−2)-form b on the (d−2)-cells in ∆. we use the one-to-one correspondence between faces f ∈ ∆⋆ and (d−2)-cells in ∆ to label the discretization of the b field bf . the connection ω is discretized by the assignment of group elements ge ∈ g to edges e ∈ ∆ ⋆. one can think of the group elements ge as the holonomy of ω along e ∈ ∆⋆, namely ge = p exp(− ∫ e ω), (7) where the symbol “p exp ” denotes the path-orderexponential that reminds us of the relationship of the holonomy with the connection along the path e ∈ ∆⋆. with this, the discretized version of the path integral (4) is z(∆) = ∫ ∏ e∈∆⋆ dge ∏ f∈∆⋆ dbf e ibf uf = ∫ ∏ e∈∆⋆ dge ∏ f∈∆⋆ δ(ge1 · · · gen ), (8) where uf = ge1 · · · gen denotes the holonomy around faces, and the second equation is the result of the b integration: it can be, thus, regarded as the analog of (5). the integration measure dbf is the standard lebesgue measure, while the integration in the group variables is done in terms of the invariant measure in g (which is the unique haar measure when g is compact). for given h ∈ g and test function f (g), the invariance property reads as follows ∫ dgf (g) = ∫ dgf (g−1) = ∫ dgf (gh) = ∫ dgf (hg) (9) the peter-weyl’s theorem provides a useful formula or the dirac delta distribution appearing in (8), namely δ(g) = ∑ ρ dρtr[ρ(g)], (10) where ρ are irreducible unitary representations of g. from the previous expression, one obtains z(∆) = ∑ c:{ρ}→{f} ∫ ∏ e∈∆⋆ dge ∏ f∈∆⋆ dρf tr [ ρf (g 1 e . . . g n e ) ] . (11) integration over the connection can be performed as follows. in a triangulation ∆, the edges e ∈ ∆⋆ bound precisely d different faces. therefore, the ge’s in (11) appear in d different traces. the relevant formula is p einv(ρ1, · · · , ρd) := ∫ dge ρ1(ge) ⊗ ρ2(ge) ⊗ · · · ⊗ ρd(ge). (12) for compact g, using the invariance (and normalization) of the the integration measure (9), it is easy to prove that p einv = (p e inv ) 2 is the projector onto inv[ρ1 ⊗ ρ2 ⊗ · · · ⊗ ρd]. in this way, the spin foam amplitudes of so(4) bf theory reduce to 040004-3 papers in physics, vol. 4, art. 040004 (2012) / a. perez zbf (∆) = ∑ cf :{f}→ρf ∏ f∈∆⋆ dρf ∏ e∈∆⋆ p einv (ρ1, · · · , ρd). 
(13) in other words, the bf amplitude associated to a two-complex ∆⋆ is simply given by the sum over of all possible assignments of irreducible representations of g to faces of the number obtained by the natural contraction of the network of projectors p einv, according to the pattern provided defined by the two-complex ∆⋆. there is a nice graphical representation of the partition function of bf theory that will be very useful for some calculations. on the one hand, using this graphical notation one can easily prove the discretization independence of the bf amplitudes. on the other hand, this graphical notation will simplify the presentation of the new spin foam models of quantum gravity that will be considered in the following sections. this useful notation was introduced by oeckl [10,11] and used in ref. [12] to give a general proof of the discretization independence of the bf partition function and the turaev-viro invariants for their definition on general cellular decompositions. we will present this notation in detail: the idea is to represent each representation matrix appearing in (11) by a line (called a wire) labeled by an irreducible representation, and integrations on the group by a box (called a cable). the traces in eq. (11) imply that there is a wire, labeled by the representation ρf , winding around each face f ∈ ∆ ⋆. in addition, there is a cable (integration on the group) associated with each edge e ∈ ∆⋆. as in (13), there is a projector p einv, which is the projector in inv[ρ1 ⊗ ρ2 ⊗ · · · ⊗ ρd] associated to each edge. this will be represented by a cable with d wires, as shown in (14). such graphical representation allows for a simple diagrammatic expression of the bf quantum amplitudes. p einv(ρ1, ρ2, ρ3, · · · , ρd) ≡ ρ1ρ2ρ3 · · · ρd (14) the case of physical interest is d = 4. in such case, edges are shared by four faces; each cable has now four wires. the cable wire diagram giving the bf amplitude is dictated by the combinatorics of the dual two complex ∆⋆. from fig. 3, one gets zbf (∆) = ∑ cf :{f}→ρf ∏ f∈∆⋆ dρ ρ1 ρ2 ρ3 ρ4 ρ5 ρ6 ρ7 ρ8 ρ9 ρ10 . (15) the 10 wires corresponding to the 10 faces f ∈ ∆⋆, sharing a vertex v ∈ ∆⋆, are connected to the neighboring vertices through the 5 cables (representing the projectors in (13) and fig. 14) associated to the 5 edges e ∈ ∆⋆, sharing the vertex v ∈ ∆⋆. a. su (2) × su (2) bf theory: a starting point for 4d riemannian gravity. we now present the bf quantum amplitudes in the case g = su (2) × su (2). this special case is of fundamental importance in the construction of the gravity models presented in the following sections. the product form of the structure group implies the simple relationship zbf (su (2) × su (2)) = zbf (su (2)) 2. nevertheless, it is important for us to present this example in an explicit way as it will provide the graphical notation that is needed to introduce the gravity models in a simple manner. the spin foam representation of the bf partition function follows from expressing the projectors in (15) in the orthonormal basis of intertwiners, i.e., invariant vectors in inv[ρ1 ⊗ · · · ⊗ ρ4]. from the product form of the structure group, one has ρ1 ρ2ρ3 ρ4 = j−1 j − 2 j − 3 j − 4 j + 1 j + 2 j + 3 j + 4 = ∑ ι−ι+ j−1j − 2 j − 3 j − 4 ι− j+1j + 2 j + 3 j + 4 ι+ , (16) 040004-4 papers in physics, vol. 4, art. 040004 (2012) / a. 
perez where ρf = j − f ⊗ j + f , j ± f and ι ± are half integers labeling left and right representations of su (2) that defined the irreducible unitary representations of g = su (2) × su (2). we have used the expression of the right and left su (2) projectors in a basis of intertwiners, namely j1 j2 j3 j4 = ∑ ι j1 j2 j3 j4 ι , (17) where the four-leg objects on the right hand side denote the invariant vectors spanning a basis of inv[j1 ⊗ · · · ⊗ j4], and ι is a half integer, labeling those elements. accordingly, when replacing the previous expression in (15), one gets zbf (∆) = ∑ cf :{f}→ρf ∏ f∈∆⋆ dj− f dj+ f , (18) and equivalently, zbf (∆) = ∑ cf :{f}→ρf ∏ f∈∆⋆ dj− f dj+ f ∑ ce:{e}→ιe (19) from which we finally obtain the spin foam representation of the su (2) × su (2) partition function as a product of two su (2) amplitudes, namely zbf (∆) = ∑ cf :{f}→ρf ∏ f∈∆⋆ dj− f dj+ f ∑ ce:{e}→ιe ∏ v∈∆⋆ ι−1 ι−2 ι−3 ι−4 ι−5 j−1 j−2 j−3 j−4 j−5 j−6 j−7 j−8 j−9 j−10 ι+1 ι+2 ι+3 ι+4 ι+5 j+1 j+2 j+3 j+4 j+5 j+6 j+7 j+8 j+9 j+10 (20) extra remarks on four-dimensional bf theory the state sum (11) is generically divergent (due to the gauge freedom analogous to (3)). a regularized version defined in terms of suq(2) × suq(2) was introduced by crane and yetter [13, 14]. as in three dimensions, if an appropriate regularization of bubble divergences is provided, (11) is topologically invariant and the spin foam path integral is discretization independent. as in the three-dimensional case, bf theory can be coupled to topological defects [15] in any dimension. in the four-dimensional case, defects are string-like [16] and can carry extra degrees of freedom, such as topological yang-mills fields [17]. the possibility that quantum gravity could be defined directly from these simple kinds of topological theories has also been considered outside spin foams [18] (for which the uv problem described in the introduction is absent). this is attractive and should, in my view, be considered further. it is also possible to introduce one-dimensional particles in four-dimensional bf theory and gravity, as shown in ref. [19]. two-dimensional bf theory has been used as the basic theory in an attempt to define a manifold independent model of qft in ref. [20]. it is also related to gravity in two dimensions in two ways: on the one hand, it is equivalent to the so-called jackiw-teitelboim model [21,22], on the other hand it is related to usual 2d gravity via constraints in a way similar to the one exploited in four dimensions (see next section). the first relationship has been used in the canonical quantization of the jackiwteitelboim model in ref. [23]. the second relationship has been explored in ref. [24]. 040004-5 papers in physics, vol. 4, art. 040004 (2012) / a. perez three-dimensional bf theory and the spin foam quantization presented above are intimately related to classical and quantum gravity in three dimensions (for a classic reference see ref. [25]). the state sum, as presented above, matches the quantum amplitudes first proposed by ponzano and regge in the 60’s, based on their discovery of the asymptotic expressions of the 6j symbols [26], often referred to as the ponzano-regge model. divergences in the above formal expression require regularization. natural regularizations are available so that the model is well-defined [27–29]. for a detailed study of the divergence structure of the model, see refs. [30–32]. 
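as an aside, the two ingredients used above, the group-averaging projector of eq. (12) and the intertwiner spaces underlying the basis in eq. (17), can be probed numerically. the following toy check (my own script; the monte carlo sampling and the choice of four spin-1/2 representations are illustrative assumptions) verifies that the haar average of ρ_1(g) ⊗ · · · ⊗ ρ_4(g) is approximately a projector whose trace equals dim inv[ρ_1 ⊗ · · · ⊗ ρ_4], and cross-checks that dimension with the su(2) character integral.

```python
# a toy numerical check (my own script) of eq. (12) and of the size of the
# intertwiner space behind the basis (17), for four spin-1/2 representations.
# the haar integral is approximated by monte carlo over random unit quaternions.
import numpy as np
from scipy.integrate import quad

def random_su2(rng):
    q = rng.normal(size=4)
    a, b, c, d = q / np.linalg.norm(q)
    return np.array([[a + 1j * b, c + 1j * d],
                     [-c + 1j * d, a - 1j * b]])

rng = np.random.default_rng(1)
dim, n_samples = 2 ** 4, 20000
p = np.zeros((dim, dim), dtype=complex)
for _ in range(n_samples):
    g = random_su2(rng)
    p += np.kron(np.kron(g, g), np.kron(g, g)) / n_samples

print("||p^2 - p|| ~", np.linalg.norm(p @ p - p))   # small, up to monte carlo error
print("tr p ~", p.trace().real)                     # ~ dim inv[1/2 x 1/2 x 1/2 x 1/2] = 2

# cross-check: dim inv[j1 x ... x j4] as an integral of characters over the class angle
def chi(j, theta):
    return np.sin((2 * j + 1) * theta / 2) / np.sin(theta / 2)

spins = [0.5, 0.5, 0.5, 0.5]
weight = lambda t: np.sin(t / 2) ** 2 / np.pi          # normalized haar class measure
dim_inv = quad(lambda t: weight(t) * np.prod([chi(j, t) for j in spins]), 0, 2 * np.pi)[0]
print("character integral ~", dim_inv)                 # ~ 2
```

we now return to the three-dimensional amplitudes discussed above.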
the quantum deformed version of the above amplitudes lead to the so-called turaev-viro model [33], which is expected to correspond to the quantization of threedimensional riemannian gravity in the presence of a non-vanishing positive cosmological constant. for the definition of observables in the latter context, as well as in the analogue four-dimensional analog, see ref. [34]. the topological character of bf theory can be preserved by the coupling of the theory with topological defects that play the role of point particles. in the spin foam literature, this has been considered from the canonical perspective in refs. [35,36] and from the covariant perspective extensively by freidel and louapre [37]. these theories have been proved by freidel and livine to be dual, in a suitable sense, to certain non-commutative fields theories in three dimensions [38, 39]. concerning coupling bf theory with nontopological matter, see refs. [40, 41] for the case of fermionic matter, and ref. [42] for gauge fields. a more radical perspective for the definition of matter in 3d gravity is taken in ref. [43]. for threedimensional supersymmetric bf theory models, see refs. [44, 45] recursion relations for the 6j vertex amplitudes have been investigated in refs. [46, 47]. they provide a tool for studying dynamics in spin foams of 3d gravity and might be useful in higher dimensions [48]. i. the coherent states representation in this section, we introduce the coherent state representation of the su (2) and spin(4) path integral of bf theory. this will be particularly important for the definition of the models defined by freidel and krasnov in ref. [6] that we will address in section iv as well as in the semiclassical analysis of the new models reported in section vii. the relevance of such representation for spin foams was first emphasized by livine and speziale in ref. [49]. a. coherent states coherent states associated with the representation theory of a compact group have been studied by thiemann and collaborators [50,51,51–59], see also ref. [60]. their importance for the new spin foam models was put forward by livine and speziale in ref. [49], where the emphasis was put on coherent states of intertwiners or the so-called quantum tetrahedron (see also [61]). here we follow the presentation of [6]. in order to build coherent states for spin(4), we start by introducing them in the case of su (2). starting from the representation space hj of dimension dj ≡ 2j + 1, one can write the resolution of the identity in terms of the canonical orthonormal basis |j, m〉 as 1j = ∑ m |j, m〉〈j, m|, (21) where −j ≤ m ≤ j. there exists an over complete basis |j, g〉 ∈ hj , labeled by g ∈ su (2), such that 1j = dj ∫ su(2) dg |j, g〉〈j, g|, (22) the states |j, g〉 ∈ hj are su (2) coherent states defined by the action of the group on maximum weight states |j, j〉 (themselves coherent), namely |j, g〉 ≡ g|j, j〉 = ∑ m |j, m〉d j mj (g), (23) where d j mj (g) are the matrix elements of the unitary representations in the |j, m〉 (wigner matrices). equation (22) follows from the orthonormality of unitary representation matrix elements, namely 040004-6 papers in physics, vol. 4, art. 040004 (2012) / a. perez dj ∫ su(2) dg |j, g〉〈j, g|, = dj ∑ mm′ |j, m〉〈j, m′| ∫ su(2) dg d j mj (g)d j m′j (g) = ∑ m |j, m〉〈j, m|, (24) where in the last equality we have used the orthonormality of the matrix elements. 
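the resolution of the identity (22) can also be checked numerically. the following sketch is my own toy check (assumptions: hbar = 1 and monte carlo haar sampling); it verifies d_j ∫ dg |j,g⟩⟨j,g| = 1_j for j = 1/2 and for j = 1, with the spin-1 representation realized on the symmetric subspace of two spin-1/2 factors.

```python
# a toy monte carlo check (my own script, hbar = 1) of the resolution of the
# identity (22): d_j ∫ dg |j,g><j,g| = 1_j, for j = 1/2 and for j = 1, the
# latter realized on the symmetric subspace of two spin-1/2 factors (cf. (28)).
import numpy as np

def random_su2(rng):
    q = rng.normal(size=4)
    a, b, c, d = q / np.linalg.norm(q)
    return np.array([[a + 1j * b, c + 1j * d],
                     [-c + 1j * d, a - 1j * b]])

e1, e2 = np.eye(2)
sym = np.array([np.kron(e1, e1),
                (np.kron(e1, e2) + np.kron(e2, e1)) / np.sqrt(2),
                np.kron(e2, e2)]).T                    # isometry C^3 -> sym(C^2 x C^2)

def coherent_state(j, g):
    """|j,g> = g |j,j>; for j = 1 the wigner matrix is the restriction of g x g."""
    if j == 0.5:
        return g[:, 0]
    d1 = sym.conj().T @ np.kron(g, g) @ sym            # 3x3 spin-1 wigner matrix
    return d1[:, 0]

rng = np.random.default_rng(2)
for j, dj in [(0.5, 2), (1.0, 3)]:
    acc = np.zeros((dj, dj), dtype=complex)
    n_samples = 20000
    for _ in range(n_samples):
        v = coherent_state(j, random_su2(rng))
        acc += dj * np.outer(v, v.conj()) / n_samples
    print(j, np.linalg.norm(acc - np.eye(dj)))         # ~ 0 up to monte carlo error
```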
the decomposition of the identity (22) can be expressed as an integral on the two-sphere of directions s2 = su (2)/u (1) by noticing that d j mj (g) and d j mj (gh) differ only by a phase for any group element h from a suitable u (1) ⊂ su (2). thus, one has 1j = dj ∫ s2 dn |j, n〉〈j, n|, (25) where n ∈ s2 is integrated with the invariant measure of the sphere. the states |j, n〉 form (an overcomplete) basis in hj . su (2) coherent states have the usual semiclassical properties. indeed, if one considers the generators j i of su(2), one has 〈j, n|ĵ i|j, n〉 = j ni, (26) where ni is the corresponding three-dimensional unit vector for n ∈ s2. the fluctuations of ĵ 2 are also minimal with ∆j 2 = ~2j, where we have restored ~ for clarity. the fluctuations go to zero in the limit ~ → 0 and j → ∞, while ~j is kept constant. this kind of limit will be used often as a notion of semiclassical limit in spin foams. the state |j, n〉 is a semiclassical state describing a vector in r3 of length j and of direction n. it will be convenient to introduce the following graphical notation for eq. (25) j = dj ∫ s2 dn j n (27) finally, an important property of su (2) coherent states stemming from the fact that |j, j〉 = | 1 2 , 1 2 〉| 1 2 , 1 2 〉 · · · | 1 2 , 1 2 〉 ≡ | 1 2 , 1 2 〉⊗2j is that |j, n〉 = | 1 2 , n〉⊗2j . (28) the above property will be of key importance in constructing effective discrete actions for spin foam models. in particular, it will play a central role in the study of the semiclassical limit of the eprl and fk models studied in sections iii, and iv. in the following subsection, we provide an example for spin(4) bf theory. b. spin(4) bf theory: amplitudes in the coherent state basis here we study the coherent states representation of the path integral for spin(4) bf theory. the construction presented here can be extended to more general cases. the present case is, however, of particular importance for the study of gravity models presented in sections iii, and iv. with the introduction of coherent states, one achieves the most difficult part of the work. in order to express the spin(4) bf amplitude in the coherent state representation, one simply inserts a resolution of the identity in the form (25) on each and every wire connecting neighboring vertices in the expression (18) for the bf amplitudes. the result is zbf (∆) = ∑ cf :{f}→ρf ∏ f∈∆⋆ dj− f dj+ f ∫ ∏ e∈∈∆⋆ dj− ef dj+ ef dn−ef dn + ef n − 1 n + 1 n − 2 n + 2 n − 3 n + 3 n − 4 n + 4 ,(29) where we have explicitly written the n± ∈ s 2 integration variables only on a single cable. one observes that there is one n± ∈ s 2 per each wire coming out at an edge e ∈ ∆⋆. as wires are in one-to-one correspondence with faces f ∈ ∆⋆, the integration variables n±ef ∈ s 2 are labeled by an 040004-7 papers in physics, vol. 4, art. 040004 (2012) / a. perez edge and face subindex. in order to get an expression of the bf path integral in terms of an affective action, we restore, at this stage, the explicit group integrations represented by the boxes in the previous equation. one gets zbf (∆) = ∑ cf :{f}→ρf ∏ f∈∆⋆ dj− f dj+ f ∫ ∏ e∈∆⋆ dj− ef dj+ ef dn−ef dn + ef ∏ v∈∆⋆ ∏ e,e′∈v dg−ef dg + ef (〈n − ef |(g −)−1ef g − e′f |n − e′f 〉) 2j − f (〈n+ef |(g +)−1ef g + e′f |n + e′f 〉) 2j+ f , (30) where we have used the coherent states property (28), and |n±〉 is a simplified notation for | 1 2 , n±〉. 
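the semiclassical properties (26)-(28) used in the previous manipulations are easy to verify explicitly. the sketch below uses my own phase conventions, hbar = 1 and an arbitrary toy choice of directions (it is not code from the literature); it builds |j,n⟩ = |1/2,n⟩^{⊗2j}, checks ⟨ĵ⟩ = j n and ∆j² = j, and evaluates the elementary spin-1/2 overlap whose 2j-th power is the building block of the amplitude (30).

```python
# a sketch (my own phase conventions, hbar = 1) checking the coherent state
# properties (26)-(28) and evaluating the elementary overlap that, raised to
# the power 2j, builds the amplitude (30). the direction vectors are arbitrary.
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

def spin_half_state(n):
    """|1/2, n>: spin-1/2 coherent state pointing along the unit vector n."""
    theta, phi = np.arccos(n[2]), np.arctan2(n[1], n[0])
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def coherent_state(j, n):
    """|j, n> = |1/2, n>^{x 2j}  (property (28)), as a vector in (C^2)^{2j}."""
    psi, out = spin_half_state(n), spin_half_state(n)
    for _ in range(int(2 * j) - 1):
        out = np.kron(out, psi)
    return out

def total_spin(two_j, s):
    """j_i = sum_k 1 x ... x (sigma_i / 2) x ... x 1 acting on (C^2)^{2j}."""
    total = np.zeros((2 ** two_j, 2 ** two_j), dtype=complex)
    for k in range(two_j):
        factors = [np.eye(2)] * two_j
        factors[k] = s
        term = factors[0]
        for f in factors[1:]:
            term = np.kron(term, f)
        total += term
    return total

j, n = 2.0, np.array([0.6, 0.0, 0.8])
psi = coherent_state(j, n)
ops = [total_spin(int(2 * j), s) for s in (sx, sy, sz)]
exp_j = np.array([np.vdot(psi, op @ psi).real for op in ops])
print("<J> =", exp_j, " vs  j n =", j * n)                                 # eq. (26)
j_squared = sum(op @ op for op in ops)
print("Delta J^2 =", np.vdot(psi, j_squared @ psi).real - exp_j @ exp_j)   # = j

m = np.array([0.0, 0.8, 0.6])
print("|<j,n|j,m>| =", abs(np.vdot(spin_half_state(n), spin_half_state(m))) ** (2 * j))
```

with these properties in hand, we return to the bf amplitude (30).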
the previous equation can be finally written as zbf (∆) = ∑ cf :{f}→ρf ∏ f∈∆⋆ dj− f dj+ f ∫ ∏ e∈∆⋆ dj− ef dj+ ef dn−ef dn + ef dg − ef dg + ef exp (sd j±,n± [g±]), (31) where the discrete action sdj±,n± [g ±] = ∑ v∈∆⋆ svjv ,nv [g ±] (32) with svj,n[g] = 5 ∑ a 1: in this case, according to ref. [69], one restricts the representations to riemannian: p = γ(j + 1), k = j. lorentzian: p = γ(j + 1), k = j. (49) which amounts to choosing the minimum weight component j = k in the expansion (41). for the riemannian case, we can write the solutions in terms of j± = (γ ± 1) j 2 + γ−1 2 . notice that for γ > 1 there is complete symmetry between the solutions of the riemannian and lorentzian sectors. in my opinion, this symmetry deserves further investigation as it might be an indication of a deeper connection between the riemannian and lorentzian models (again, such relationship is a fact in 3d gravity [62]. another criterion for weak imposition can be developed by studying the spectrum of the master constraint mf = df ·df . strong imposition of the constraints dif would amount to looking for the kernel of the master constraint mf . however, generically, the positive operator associated with the master constraint does not contain the zero eigenvalue in the spectrum due to the open nature of the constraint algebra (46). it is convenient, as in ref. [70], to express the master constraint in a manifestly invariant way. in order to get a gauge invariant constraint one starts 040004-11 papers in physics, vol. 4, art. 040004 (2012) / a. perez from the master constraint and uses the dif = 0 classically to write it in terms of casimirs, namely mf = (1 + σ 2γ2)c2 − 2c1γ, where c1 and c2 are the casimirs given in eq. (39). the minimum eigenvalue condition is riemannian: p = j, k = γj. lorentzian: p = γj, k = j. (50) the minimum eigenvalue is mmin = ~ 2γj(γ2 − 1) for the riemannian case and mmin = γ for the lorentzian case. the master constraint criterion works better in the lorentzian case, as pointed out in ref. [70]. more recently, it has been shown that the constraint solutions p = γj and k = j also follow naturally from a spinor formulation of the simplicity constraints [71–73]. the above criterion is used in the definition of the eprl model. it is important to point out that the riemannian case imposes strong restrictions on the allowed values of the immirzi parameter if one wants the spin j ∈ n/2 to be arbitrary (in order to have all possible boundary states allowed in lqg). in this case, the only possibilities are γ = n or γ = 1. this restriction is not natural from the viewpoint of lqg. its relevance, if any, remains mysterious at this stage. summarizing, in the lorentzian (riemannian) eprl model one restricts the sl(2, c) (spin(4)) representations of bf theory to those satisfying p = γj k = j (51) for j ∈ n/2. from now on, we denote the subset of admissible representation kγ ⊂ irrep(sl(2, c))(irrep(spin(4))) (52) the admissible quantum states ψ are elements of the subspace hj ⊂ hγj,j (i.e., minimum weight states) which satisfy the constraints (45) in the following semiclassical sense: (kif − γl i f )ψ = osc, (53) where the symbol osc (order semiclassical) denotes a quantity that vanishes in limit ~ → 0, j → ∞ with ~j =constant. 
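the restriction that the immirzi parameter places on the admissible spins, mentioned above for the riemannian case, can be made concrete with a few lines of python (an illustrative toy script of mine; the cutoff on the spins is an arbitrary assumption):

```python
# an illustrative check (my own toy script, not from the paper): for the
# riemannian solution j± = (1 ± γ) j / 2, both labels are half-integers for
# arbitrary j in N/2 only for special values of the immirzi parameter γ;
# for generic rational γ only a subset of spins is admissible.
from fractions import Fraction

def admissible_spins(gamma, two_j_max=12):
    """spins j (in N/2) for which j± = |1 ± γ| j / 2 are also in N/2."""
    out = []
    for two_j in range(1, two_j_max + 1):
        j = Fraction(two_j, 2)
        jp, jm = (1 + gamma) * j / 2, abs(1 - gamma) * j / 2
        if (2 * jp).denominator == 1 and (2 * jm).denominator == 1:
            out.append((j, jm, jp))
    return out

for gamma in [Fraction(1, 3), Fraction(1, 2), Fraction(2), Fraction(3)]:
    print("gamma =", gamma, "->", admissible_spins(gamma))
```

we now return to the semiclassical condition (53) and its consequences.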
in the riemannian case, the previous equation can be written as [(1 − γ)j i+ − (1 + γ)j i −]ψ = osc, (54) which in turn has a simple graphical representation in terms of spin-network grasping operators, namely −(1 + γ) +(1 − γ) k j− j+ = osc k j− j+ (55) the previous equation will be of great importance in the graphical calculus that will allow us to show that the linear constraint imposed here, at the level of states, implies the vanishing of the quadratic plebanski constraints (34) and their fluctuations, computed in the path integral sense, in the appropriate large spin semiclassical limit. iii. presentation of the riemannian eprl amplitude here we complete the definition of the eprl models by imposing the linear constraints on the bf amplitudes constructed in section ii. we will also show that the path-integral expectation value of the plebanski constraints (34), as well as their fluctuations, vanish in a suitable semiclassical sense. this shows that the eprl model can be considered as a lattice definition of the a quantum gravity theory. we start with the riemannian model for which a straightforward graphical notation is available. the first step is the translation of eq. (40)— for p and k satisfying the simplicity constraints— in terms of the graphical notation introduced in section ii. concretely, for γ < 1, one has j± = (1 ± γ)j/2 ∈ kγ and (40) becomes (1 − γ) j 2 (1 + γ) j 2= j ⊕ α=γj α (1 − γ) j 2 (1 − γ) j 2 (1 + γ) j 2 (1 + γ) j 2 (56) for γ > 1 we have (γ − 1) j 2 (1 + γ) j 2= γj ⊕ α=j α (γ − 1) j 2 (γ − 1) j 2 (1 + γ) j 2 (1 + γ) j 2 (57) 040004-12 papers in physics, vol. 4, art. 040004 (2012) / a. perez the implementation of the linear constraints of subsection ii consists of restricting the representations ρf of spin(4) (appearing in the state sum amplitudes of bf theory, as written in eq. (18)) to the subclass ρf ∈ kγ ⊂ irrep(spin(4)), defined above, while projecting to the highest weight term in (56) for γ < 1. for γ > 1, one must take the minimum weight term in (57) . the action of this projection will be denoted yj : h(1+γ)j/2,|(1−γ)|j/2 → hj , graphically yj   |γ − 1| j 2 (1 + γ) j 2   = j . (58) explicitly, one takes the expression of the bf partition function (13) and modifies it by replacing the projector p einv(ρ1, · · · , ρ4) with ρ1, · · · ρ4 ∈ kγ by a new object p eeprl(j1, · · · , j4) ≡ p e inv(ρ1 · · · ρ4) ×(yj1 ⊗ · · · ⊗ yj4 )p e inv (ρ1 · · · ρ4) (59) with j1, · · · j4 ∈ n/2, implementing the linear constraints described in the previous section. graphically, the modification of bf theory that produces the eprl model corresponds to the replacement p einv(ρ1 · · · ρ4) = p eeprl(j1 · · · j4) = (60) on the expression (18), where we have dropped the representation labels from the figure for simplicity. we have done the operation (58) on each an every of the four pairs of representations. the spin(4) integrations represented by the two boxes at the top and bottom of the previous graphical expression restore the full spin(4) invariance as the projection (58) breaks this latter symmetry for being based on the selection of a special subgroup su (2) ⊂ spin(4) in its definition (see subsection c for an important implication). one should simply keep in mind that green wires in the previous two equations and in the ones that follow are labeled by arbitrary spins j (which are being summed over in the expression of the amplitude (61)), while red and blue wires are labeled by j+ = (1 + γ)j/2 and j− = |1 − γ|j/2, respectively. 
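the map y_j of eq. (58) is an isometric embedding built from clebsch-gordan coefficients, and this can be verified directly. the following sympy sketch is my own check (the values γ = 1/3 and j = 3/2, giving j⁻ = 1/2 and j⁺ = 1, are an illustrative assumption); it confirms that y_j† y_j = 1_j for the embedding of h_j into the maximum weight subspace of h_{j⁻} ⊗ h_{j⁺}.

```python
# a sympy sketch (my own check, not from the paper) of the map y_j of eq. (58)
# for the illustrative values γ = 1/3 and j = 3/2, so that j⁻ = 1/2, j⁺ = 1 and
# j = j⁻ + j⁺ is the maximum weight in the decomposition of h_{j⁻} ⊗ h_{j⁺}.
from sympy import Rational, Matrix, eye, simplify
from sympy.physics.quantum.cg import CG

jm, jp, j = Rational(1, 2), 1, Rational(3, 2)
ms_minus = [Rational(k, 2) for k in (-1, 1)]
ms_plus = [-1, 0, 1]
ms = [Rational(k, 2) for k in (-3, -1, 1, 3)]

# matrix elements <j⁻ m⁻ ; j⁺ m⁺ | j m> of the embedding h_j -> h_{j⁻} ⊗ h_{j⁺}
Y = Matrix(len(ms_minus) * len(ms_plus), len(ms), lambda r, c: 0)
for a, m1 in enumerate(ms_minus):
    for b, m2 in enumerate(ms_plus):
        for c, m in enumerate(ms):
            Y[a * len(ms_plus) + b, c] = CG(jm, m1, jp, m2, j, m).doit()

# y_j is an isometry: y_j† y_j = 1_j (entries are real, so transpose = dagger)
print((Y.T * Y).applyfunc(simplify) == eye(len(ms)))   # True
```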
with this, (18) is modified to zeeprl(∆) = ∑ ρf ∈k ∏ f∈∆⋆ d|1−γ| j 2 d(1+γ) j 2 × ∏ e p eeprl(j1, · · · , j4) = = ∑ ρf ∈k ∏ f∈∆⋆ d|1−γ| j 2 d(1+γ) j 2 × w ,(61) the previous expression defines the eprl model amplitude. a. the spin foam representation of the eprl amplitude now we will work out the spin foam representation of the eprl amplitude which, at this stage, will take no more effort than the derivation of the spin foam representation for spin(4) bf theory, as we went from eq. (18) to eq. (20) in section ii. the first step is given in the following equation 040004-13 papers in physics, vol. 4, art. 040004 (2012) / a. perez = = ∑ ι ι ῑ (62) which follows, basically, from the invariance of the haar measure (9) (in the last line, we have used (17)). more precisely, the integration of the subgroup su (2) ∈ spin(4), represented by the green box on the right, can be absorbed by suitable redefinition of the integration on the right and left copies of su (2), represented by the red and blue boxes, respectively. with this, we can already write the spin foam representation of the eprl model, namely zeeprl(∆) = ∑ jf ∑ ιe ∏ f∈∆⋆ d|1−γ| j 2 d(1+γ) j 2 × ∏ v∈∆⋆ ι1 ι2 ι3 ι4 ι5 , (63) where the vertex amplitude (graphically represented) depends on the 10 spins j associated to the face-wires and the 5 intertwiners associated to the five edges (tetrahedra). as in previous equations, we have left the spin labels of wires implicit for notational simplicity. we can write the previous spin foam amplitude in another form by integrating out all the projectors (boxes) explicitly. using (17), we get = ∑ ι+ι−ι y z x (64) thus replacing this in (61), we get 040004-14 papers in physics, vol. 4, art. 040004 (2012) / a. perez zeeprl(∆) = ∑ jf ∏ f∈∆⋆ d|γ−1| j 2 d(γ+1) j 2 ∑ ιe ∏ v∈∆⋆ ∑ ι − 1 ···ι − 5 ∑ ι + 1 ···ι + 5 5 ∏ a=1 f ιa ι − a ,ι + a (65) ι − 1 ι − 2 ι − 3 ι − 4 ι − 5 |1−γ| j1 2 |1−γ| j2 2 |1−γ| j3 2 |1−γ| j4 2 |1−γ| j5 2 |1−γ| j6 2 |1−γ| j7 2 |1−γ| j8 2 |1−γ| j9 2 |1−γ| j10 2 ι − 1 ι − 2 ι − 3 ι − 4 ι − 5 |1+γ| j1 2 |1+γ| j2 2 |1+γ| j3 2 |1+γ| j4 2 |1+γ| j5 2 |1+γ| j6 2 |1+γ| j7 2 |1+γ| j8 2 |1+γ| j9 2 |1+γ| j10 2 where the coefficients f ι ι+ι− are the so-called fusion coefficients which already appear in their graphical form in (64), more explicitly f ιι+ι− (j1, · · · , j4) = ι+ ι− ι |1−γ| j1 2 |1−γ| j2 2 |1−γ| j3 2 |1−γ| j4 2 |1+γ| j1 2 |1+γ| j2 2 |1+γ| j3 2 |1+γ| j4 2 j1 j2 j3 j4 (66) the previous eq. (66) is the form of the eprl model as derived in ref. [5]. iv. proof of validity of the plebanski constraints in this section, we prove that the quadratic constraints are satisfied in the sense that their path integral expectation value and fluctuation vanish in the appropriate semiclassical limit. a. the quadratic plebanski constraints the quadratic plebanski constraints are ǫijklb ij µν b kl ρσ − e ǫµνρσ ≈ 0. (67) the constraints in this form are more suitable for the translation into the discrete formulation. more precisely, according to (6), the smooth fields bijµν are now associated with the discrete quantities bij triangles , or equivalently bijf as faces f ∈ ∆ ⋆ are in one-to-one correspondence to triangles in four dimensions. the constraints (67) are local constraints valid at every spacetime point. in the discrete setting, spacetime points are represented by four-simplexes or (more addapted to our discussion) vertices v ∈ ∆⋆. 
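before writing the discretized constraints, it is useful to record the elementary algebraic fact exploited below in eq. (72): in riemannian signature, ε_{ijkl} b^{ij} b^{kl} is twice the difference between the squares of the self-dual and anti-self-dual parts of an antisymmetric b^{ij} (the γ-dependent prefactors appearing in (72) come from the map between b_f and the generators). the following numerical check is a sketch under my own conventions and normalizations:

```python
# a numerical aside (riemannian signature, my own conventions) verifying the
# identity behind the diagonal constraint used in eq. (72) below:
#   eps_{ijkl} b^{ij} b^{kl} = 2 ( b^+ . b^+  -  b^- . b^- ),
# where b^± are the self-dual / anti-self-dual parts of an antisymmetric b^{ij}.
import itertools
import numpy as np

def levi_civita_4():
    eps = np.zeros((4, 4, 4, 4))
    for perm in itertools.permutations(range(4)):
        # sign of the permutation by counting inversions
        inv = sum(1 for x in range(4) for y in range(x + 1, 4) if perm[x] > perm[y])
        eps[perm] = (-1) ** inv
    return eps

eps = levi_civita_4()
rng = np.random.default_rng(3)
a = rng.normal(size=(4, 4))
b = a - a.T                                           # random antisymmetric b^{ij}

dual = 0.5 * np.einsum("ijkl,kl->ij", eps, b)         # (*b)^{ij}
b_plus, b_minus = 0.5 * (b + dual), 0.5 * (b - dual)  # self-dual / anti-self-dual parts

lhs = np.einsum("ijkl,ij,kl->", eps, b, b)
rhs = 2 * (np.einsum("ij,ij->", b_plus, b_plus) - np.einsum("ij,ij->", b_minus, b_minus))
print(lhs, rhs)                                       # equal up to rounding
```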
with this, the constraints (67) are discretized as follows: triangle (or diagonal) constraints: ǫijklb ij f b kl f = 0, (68) for all f ∈ v, i.e., for each and every face of the 10 possible faces touching the vertex v. tetrahedron constraints: ǫijklb ij f b kl f ′ = 0, (69) for all f, f ′ ∈ v, so that they are dual to triangles sharing a one-simplex, i.e., belonging to the same tetrahedron out of the five possible ones. 4-simplex constraints: ǫijklb ij f b kl f̄ = ev, (70) for any pair of faces f, f̄ ∈ v that are dual to triangles sharing a single point. the last constraint will require a more detailed discussion. at this point, let us point out that the constraint (70) is interpreted as a definition of the four volume ev of the four-simplex. the constraint requires such definition to be consistent, i.e., the true condition is ǫijklb ij f b kl f̄ = ǫijklb ij f ′ b kl f̄ ′ = ǫijklb ij f ′′ b kl f̄ ′′ = · · · = ev (71) for all five different possible pairs of f and f̄ in a four simplex, and where we assume the pairs f -f̄ are ordered in agreement with the orientation of the complex ∆⋆. b. the path integral expectation value of the plebanski constraints here we prove that the plebanski constraints are satisfied by the eprl amplitudes in the path integral expectation value sense. 040004-15 papers in physics, vol. 4, art. 040004 (2012) / a. perez the triangle constraints: we start from the simplest case: the triangle (or diagonal) constraints (68). we choose a face f ∈ v (dual to a triangle) in the cable-wire-diagram of eq. (61). this amounts to choosing a pair of wires (right and left representations) connecting two nodes in the vertex cable wire diagram. the two nodes are dual to the two tetrahedra—in the four simplex dual to the vertex—sharing the chosen triangle. equation (36) shows that ǫijklb ij f b kl f ∝ (1 + γ)2 j−f · j − f − (1 − γ) 2j +f · j + f , (72) where j±f denotes the self-dual and anti-self-dual parts of πijf . the path integral expectation value of the triangle constraint is then 〈(1 + γ)2j−f · j − f − (1 − γ) 2j +f · j + f 〉 ∝ (73) (1 + γ)2 w −(1 − γ)2 w = osc, where the double graspings on the anti-self-dual (blue) wire and the self-dual (red) wire represent the action of the casimirs j−f · j − f and j + f · j + f , on the cable-wire diagram of the corresponding vertex. direct evaluation shows that the previous diagram is proportional to ~2jf which vanishes in the semiclassical limit ~ → 0, j → ∞ with ~j =constant. we use the notation already adopted in (54) and call such quantity osc. this proves that the triangle plebanski constraints are satisfied in the semiclassical sense. the tetrahedra constraints: the proof of the validity of the tetrahedra constraints (69). in this case we also have (1 + γ)2 w (74) −(1 − γ)2 w = osc. where we have chosen an arbitrary pair of faces. in order to prove this, let us develop the term on the right. the result follows from = = (1 + γ) |1 − γ| + osc = (1 + γ)2 (1 − γ)2 + osc = (1 + γ)2 (1 − γ)2 + osc, (75) where in the first line we have used the fact that the double grasping can be shifted through the group integration (due to gauge invariance (9)). 040004-16 papers in physics, vol. 4, art. 040004 (2012) / a. perez in the first and second terms on the second line, we have used eq. (55) to move the graspings on self-dual wires to the corresponding anti-self-dual wires. equation (75) immediately follows the previous one; the argument works in the same way for any other pair of faces. notice that the first equality in eq. 
(75) implies that we can view the plebanski constraint as applied in the frame of the tetrahedron as well as in a lorentz invariant framework (the double grasping defines an intertwiner operator commuting with the projection p einv represented by the box). an analogous statement also holds for the triangle constraints (73). the 4-simplex constraints now we show the validity of the four simplex constraints in their form (71). as we will show below, this last set of constraints follow from the spin(4) gauge invariance of the eprl node (i.e., the validity of the gauss law) plus the validity of the tetrahedra constraints (69). gauge invariance of the node takes the following form in graphical notation: + + + = 0, (76) where the above equation represents the gauge invariance under infinitesimal left su (2) rotations. an analogous equation with insertions on the right is also valid. the validity of the previous equation can, again, be related to the invariance of the haar measure used in the integration on the gauge group that defines the boxes (9). now we choose an arbitrary pair f and f̄ (where f̄ is one of the three possible faces whose dual triangle only shares a point with the one corresponding to f ) and will show how the four volumen ev defined by it equals the one defined by any other admissible pair. the first step is to show that we get the same result using the pairs f -f̄ and f ¯̄f , where ¯̄f is another of the three admissible faces opposite to f . the full result follows from applying the same procedure iteratively to reach any admissible pair. it will be obvious from the treatment given below, that this is possible. thus, for a given pair of admissible faces, we have 040004-17 papers in physics, vol. 4, art. 040004 (2012) / a. perez ev = (1 + γ) 2 w − (1 − γ)2 w = −(1 + γ)2           w + w + w           +(1 − γ)2           w + w + w           = −(1 + γ)2 w + (1 − γ)2 w + osc, (77) where going from the first line to the second and third lines we have simply used (76) on the bottom graspings on the right and left wires. the last line results from the validity of (69). notice that the second terms in the second and third lines add up to osc, as well as the third terms in the second and third line. there is an overall minus sign which amounts for an orientation factor. it should be clear that we can apply the same procedure to arrive at any admissible pair. c. peprl is not a projector we will study in detail the object p eeprl(j1, · · · , j4). we see that it is made of two ingredients. the first one is the projection to the maximum weight subspace hj for γ > 1 in the decomposition of hj+,j− for j ± = (1 ± γ)j/2 (j± = (γ ± 1)j/2 for γ > 1) in terms of irreducible representations of an arbitrarily chosen su (2) subgroup of spin(4). the second ingredient is to eliminate the dependence on the choice of subgroup by group averaging with respect to the full gauge group spin(4). this is diagrammatically represented in (60). however, p eeprl(j1, · · · , j4) is not a projector, namely p eeprl(j1, · · · , j4) 2 6= p eeprl(j1, · · · , j4). (78) technically, this follows from (59) and the fact that [p einv (ρ1 · · · ρ4), (yj1 ⊗ · · · ⊗ yj4 )] 6= 0 (79) i.e., the projection imposing the linear constraints (defined on the frame of a tetrahedron or edge) 040004-18 papers in physics, vol. 4, art. 040004 (2012) / a. perez and the spin(4) (or lorentz) group averaging— rendering the result gauge invariant—do not commute. 
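the mechanism is elementary and can be illustrated in a finite-dimensional toy example (not the actual eprl operators; the matrices below are arbitrary choices of mine): sandwiching a projector between two copies of a second projector that does not commute with it produces an operator that fails to square to itself.

```python
# a finite-dimensional toy illustration (not the actual eprl operators) of the
# mechanism in eqs. (78)-(79): sandwiching a projector q between copies of a
# projector p that does not commute with it, a = p q p, does not square to itself.
import numpy as np

def projector_onto(vectors):
    """orthogonal projector onto the span of the given vectors."""
    v = np.array(vectors, dtype=float).T
    return v @ np.linalg.inv(v.T @ v) @ v.T

p = projector_onto([[1, 0, 0], [0, 1, 0]])        # project onto the x-y plane
q = projector_onto([[1, 1, 1]])                   # project onto a tilted line

a = p @ q @ p                                     # analogue of p_inv (y ⊗ ... ⊗ y) p_inv
print("||[p, q]|| =", np.linalg.norm(p @ q - q @ p))   # nonzero: they do not commute
print("||a^2 - a|| =", np.linalg.norm(a @ a - a))      # nonzero: a is not a projector
```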
the fact that the p eeprl(j1, · · · , j4) is not a projection operator has important consequences in the mathematical structure of the model: 1. from (61) one can immediately obtain the following expression for the eprl amplitude zeprl(∆) = ∑ ρf ∈k ∏ f∈∆⋆ d|1−γ| j 2 d(1+γ) j 2 × ∏ e p eeprl(j1, · · · , j4). (80) this expression has the formal structure of expression (13) for bf theory. the formal similarity, however, is broken by the fact that p eeprl(j1, · · · , j4) is not a projection operator. from the formal perspective, there is the possibility for the amplitudes to be defined in terms of a network of projectors (as in bf theory). this might provide an interesting structure that might be of relevance in the definition of a discretization independent model. on the contrary, the failure of p eeprl(j1, · · · , j4) to be a projector may lead, in my opinion, to difficulties in the limit where the complex ∆ is refined: the increasing of the number of edges might produce either trivial or divergent amplitudes 2. 2. another difficulty associated with p eeprl(j1, · · · , j4) 2 6= p eeprl(j1, · · · , j4) is the failure of the amplitudes of the eprl model, as defined here, to be consistent with the abstract notion of spin foams as defined in [74]. this is a point of crucial importance under current discussion in the community. the point is that the cellular decomposition ∆ has no physical meaning and is to be interpreted as a subsidiary regulating structure to be removed when computing physical quantities. spin foams configurations can fit in different ways on a given ∆, yet any of these different embeddings represent the same physical process (like the same gravitational field in different coordinates). consistency requires the spin foam amplitudes to be independent of the embedding, i.e., well-defined on the equivalence classes of spin foams as defined by baez in ref. [74] (the importance of these consistency requirements was emphasized in ref. [75]). the amplitude (80) fails this requirement due to p eeprl(j1, · · · , j4) 2 6= p eeprl(j1, · · · , j4). d. the warsaw proposal if one sees the above as difficulties, then there is a simple solution, at least in the riemannian case. as proposed in ref. [76, 77], one can obtain a consistent modification of the eprl model by replacing p eeprl in (80) by a genuine projector p e w, graphically p ew(j1 · · · j4) = ∑ αβ inv    α β    α β , (81) it is easy to check that by construction (p ew(j1 · · · j4)) 2 = p ew(j1 · · · j4). (82) the variant of the eprl model proposed in refs. [76, 77] takes then the form zeprl(∆) = ∑ jf ∏ f∈∆⋆ d|1−γ| j 2 d(1+γ) j 2 × ∏ e p ew(j1, · · · , j4) (83) = ∑ jf ∑ ιev ∏ f∈∆⋆ d|1−γ| j 2 d(1+γ) j 2 × ∏ e∈∆⋆ geιevs ι e vt ∏ v∈∆⋆ ι1v ι2v ι3v ι4v ι5v . 2this is obviously not clear from the form of (80). we are extrapolating the properties of (p e eprl )n for large n to those of the amplitude (80) in the large number of edges limit implied by the continuum limit. 040004-19 papers in physics, vol. 4, art. 040004 (2012) / a. perez thus, in the modified eprl model, edges e ∈ ∆⋆ are assigned pairs of intertwiner quantum numbers ιevs and ι e vt and an edge amplitude given by the matrix elements geιevs ,ι e vt (where vs and vt stand for the source and target vertices of the given oriented edge). 
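continuing the finite-dimensional analogy used above (and only as an analogy for (81)-(82), not the actual intertwiner-space construction), a genuine projector onto the span of a non-orthonormal family is obtained by inserting the inverse of its gram matrix, and it does square to itself:

```python
# a toy analogue (my own sketch) of the warsaw construction (81)-(82): given a
# non-orthonormal spanning family, the inverse of its gram matrix produces a
# genuine projector onto its span, in contrast with the naive sandwich above.
import numpy as np

rng = np.random.default_rng(4)
v = rng.normal(size=(6, 3))                  # a non-orthonormal spanning family as columns

gram = v.T @ v                               # analogue of the matrix inverted in (81)
p_w = v @ np.linalg.inv(gram) @ v.T          # projector onto the span of the family

print("||p_w^2 - p_w|| =", np.linalg.norm(p_w @ p_w - p_w))                            # ~ 0, cf. (82)
print("||(v v^T)^2 - v v^T|| =", np.linalg.norm((v @ v.T) @ (v @ v.T) - v @ v.T))      # nonzero
```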
the fact that edges are not assigned a single quantum number is not really significative; one could go to a basis of normalized eigenstates of p ew and rewrite the modified model above as a spin foam model where edges are assigned a single (basis element) quantum number. as the nature of such basis and the quantum geometric interpretation of its elements are not clear at this stage, it seems simpler to represent the amplitudes of the modified model in the above form. the advantages of the modified model are important. however, a generalization of the above modification of the eprl model in the lorentzian case is still lacking. notice that this modification does not interfere with the results on the semiclassical limit (to leading order), as reviewed in section vii. the reason for this is that the matrix elements geαβ → δαβ in that limit [78]. v. the coherent states representation we have written the amplitude defining the eprl model by constraining the state sum of bf theory. for semiclassical studies that we will review in section vii, it is convenient to express the eprl amplitude in terms of the coherent states basis. the importance of coherent states in spin foam models was put forward in ref. [49] and explicitly used to re-derive the eprl model in ref. [79]. the coherent state technology was used by freidel and krasnov in [6] to introduce a new kind of spin foam models for gravity: the fk models. in some cases, the fk model is equivalent to the eprl model. we will review this in detail in section iv. the coherent state representation of the eprl model is obtained by replacing (27) in each of the intermediate su (2) (green) wires in the expression (61) of the eprl amplitudes, namely (84) = ∫ [s2]4 4 ∏ i=1 dji dni n1n1n2n2n3n3n4n4 the case γ < 1 in this case, the coherent state property (28) implies n1n1n2n2n3n3n4n4 = n1n1 n1 n1 n2n2 n2 n2 n3n3 n3 n3 n4n4 n4 n4 , (85) where we have used, in the last line, the fact that for γ < 1 the representations j of the subgroup su (2) ∈ spin(4) are maximum weight, i.e., j = j+ + j−. doing this at each edge, we get 040004-20 papers in physics, vol. 4, art. 040004 (2012) / a. perez zeeprl(∆) = ∑ jf ∏ f∈∆⋆ dj− f dj+ f ∫ ∏ e∈∈∆⋆ djef dnef n1 n1 n2 n2 n3 n3 n4 n4 , (86) where we have explicitly written the n ∈ s2 integration variables on a single cable. the expression above is very similar to the coherent states representation of spin(4) bf theory given in eq. (29). in fact, one gets the above expression if one starts from the expression (29) and sets n+ef = n − ef = nef while dropping, for example, all the sphere integrations corresponding to the n+ef (or equivalently n−ef ). moreover, by construction, the coherent states participating in the previous amplitude satisfy the linear constraints (45) in expectation values, namely 〈j, nef |d i f |j, nef 〉 = 〈j, nef |(1 − γ)j +i f + (1 + γ)j −i f |j, nef 〉 = 0. (87) thus, the coherent states participating in the above representation of the eprl amplitudes solve the linear simplicity constraints in the usual semiclassical sense. the same manipulations leading to (89) in section ii lead to a discrete effective action for the eprl model, namely z γ<1eprl = ∑ jf ∏ f∈∆⋆ d (1−γ) jf 2 d (1+γ) jf 2 (88) × ∫ ∏ e∈∆⋆ djef dnef dg − ef dg + ef exp (s γ<1 j±,n [g±]), where the discrete action sγ<1 j±,n [g±] (89) = ∑ v∈∆⋆ (sv (1−γ) jf 2 ,n [g−] + sv (1+γ) jf 2 ,n [g+]) with svj,n[g] (90) = 5 ∑ a 1 the case γ > 1 is more complicated [80]. 
the reason for this is that the step (85), directly leading to the discrete action in the previous case, is no longer valid, as the representations of the subgroup su (2) ∈ spin(4) are now minimum instead of maximum weight. however, the representations j+ = j− + j are maximum weight. we can, therefore, insert coherent states resolution of the identity on the right representations and get: 040004-21 papers in physics, vol. 4, art. 040004 (2012) / a. perez n1 n2 n3 n4 (91) = ∫ [s3]4 4 ∏ i=1 d (1+γ) ji 2 dmi m1 m2 m3 m4 n1 n2 n3 n4 = ∫ [s3]4 4 ∏ i=1 d (1+γ) ji 2 dmi m1 m1 m1 m2 m2 m2 m3 m3 m3 m4 m4 m4 n1 n2 n3 n4 , where we are representing the relevant part of the diagram appearing in eq. (85). in the last line, we have used j+ = j + j− (i.e., maximum weight), and the graphical notation m n ≡ 〈m|n〉 as it follows from our previous conventions. with this, one gets z γ>1eprl = (92) ∑ jf ∏ f∈∆⋆ d (1−γ) jf 2 d (1+γ) jf 2 × ∫ ∏ e∈∆⋆ djef d(1+γ) jef 2 dnef dmef dg − ef dg + ef × exp (sγ>1 j±,n,m [g±]), where the discrete action sγ>1 j±,n,m [g±] = ∑ v∈∆⋆ svj±,n,m[g ±] (93) with svj±,n,m[g ±] (94) = ∑ 1≤a 1 for the case γ > 1, the fk amplitude is given by z γ>1f k (∆) = ∑ jf ∏ f∈∆⋆ d|1−γ| j 2 d(1+γ) j 2 ∏ e∈∆⋆ ∫ d(1+γ) j 2 d (γ−1) jef 2 dnef (100) n1n1 −n1 −n1 n2n2 −n2 −n2 n3n3 −n3 −n3 n4n4 −n4 −n4 . the study of the coherent state representation of the fk model for γ > 1, in comparison with eq. (92) for the eprl model, clearly shows the difference between the two models in this regime. z γf k = ∑ jf ∏ f∈∆⋆ d (1−γ) jf 2 d (1+γ) jf 2 ∫ ∏ e∈∆⋆ d |1−γ| jef 2 d (1+γ) jef 2 dnef dg − ef dg + ef exp (sf k γ j±,n [g±]), (101) where the discrete action sf k γ j±,n [g±] = ∑ v∈∆⋆ [ sv (1−γ) jf 2 ,n [g−] +sv (1+γ) jf 2 ,s(γ)n [g+] ] , (102) 040004-24 papers in physics, vol. 4, art. 040004 (2012) / a. perez where s(γ) = sign(1 − γ) and svj,n[g] = 5 ∑ a 1. it is important to mention that the knotting properties of boundary spin networks do not seem to play a role in present definitions of transition amplitudes [109]. vi. further developments and related models the spin foam amplitudes discussed in the previous sections have been introduced by constraining the bf histories through the simplicity constraints. however, in the path integral formulation, the presence of constraints has the additional effect of modifying the weights with which those histories are to be summed: second class constraints modify the path integral measure (in the spin foam context this issue was raised in ref. [75]). as pointed out before, this question has not been completely settled in the spin foam community yet. the explicit modification of the formal measure in terms of continuous variables for the plebansky action was pre040004-25 papers in physics, vol. 4, art. 040004 (2012) / a. perez sented in ref. [110]. a systematic investigation of the measure in the spin foam context was attempted in ref. [111] and [112]. as pointed out in ref. [75], there are restrictions in the manifold of possibilities coming from the requirement of background independence. the simple bf measure chosen in the presentation of the amplitudes in the previous sections satisfies these requirements. there are other consistent possibilities; see, for instance, ref. [113] for a modified measure which remains extremely simple and is suggested from the structure of lqg. an important question is the relationship between the spin foam amplitudes and the canonical operator formulation. 
the question of whether one can reconstruct the hamiltonian constraints out of spin foam amplitudes has been analysed in detail in three dimensions. for the study of quantum three-dimensional gravity from the bf perspective, see ref. [114]. we will, in fact, present this perspective in detail in the three dimensional part of this article. for the relationship with the canonical theory using variables that are natural from the regge gravity perspective, see [115, 116]. there are generalizations of regge variables more adapted to the interpretation of spin foams [117]. in four dimensions,the question has been investigated in ref. [118] in the context of the new spin foam models. in the context of group field theories, this issue is explored in ref. [119]. finally, spin foams can, in principle, be obtained directly from the implementation of the dirac program using path integral methods. this has been explored in refs. [120, 121], from which a discrete path integral formulation followed [122]. the question of the relationship between covariant and canonical formulations in the discrete setting has been analyzed also in ref. [123]. by construction, all tetrahedra in the fk and eprl models are embedded in a spacelike hypersurface and hence have only spacelike triangles. it seems natural to ask the question of whether a more general construction allowing for timelike faces is possible. the models described in previous sections have been generalized in order to include timelike faces in the work of f. conrady [124–126]. an earlier attempt to define such models in the context of the barrett-crane model can be found in refs. [127]. the issue of the coupling of the new spin foam models to matter remains to a large extend unexplored territory. nevertheless, some results can be found in the literature. the coupling of the barrett-crane model (the γ → ∞ limit of the eprl model) to yang-mills fields was studied in ref. [128]. more recently, the coupling of the eprl model to fermions has been investigated in refs. [129, 130]. a novel possibility of unification of the gravitational and gauge fields was recently proposed in ref. [131]. the introduction of a cosmological constant in the construction of four-dimensional spin foam models has a long history. barrett and crane introduced a vertex amplitude [132], in terms of the crane and yetter model [13], for bf theory with cosmological constant. the lorentzian quantum deformed version of the previous model was studied in ref. [133]. for the new models, the coupling with a cosmological constant is explored in terms of the quantum deformation of the internal gauge symmetry in refs. [134, 135], as well as (independently) in ref. [136]. the asymptotics of the vertex amplitude are shown to be consistent with a cosmological constant term in the semiclassical limit in ref. [137]. the spin foam approach applied to quantum cosmology has been explored in refs. [138–143]. the spin foam formulation can also be obtained from the canonical picture provided by loop quantum cosmology (see ref. [144] and references therein). this has been explored systematically in refs. [145–148]. as we have discussed in the introduction of the new models, heisenberg uncertainty principle precludes the strong imposition of the plebanski constraints that reduce bf theory to general relativity. the results of the semiclassical limit of these models seem to indicate that metric gravity should be recovered in the low energy limit. 
however, it seems likely that the semiclassical limit could be related to certain modifications of plebanski’s formulation of gravity [149–153]. a simple interpretation of the new models in the context of the bi-gravity paradigm proposed in ref. [154] could be of interest. as it was already pointed out in ref. [74], spin foams can be interpreted in close analogy to feynman diagrams. standard feynman graphs are generalized to 2-complexes and the labeling of propagators by momenta to the assignment of spins to 040004-26 papers in physics, vol. 4, art. 040004 (2012) / a. perez faces. finally, momentum conservation at vertices in standard feynmanology is now represented by spin-conservation at edges, ensured by the assignment of the corresponding intertwiners. in spin foam models, the non-trivial content of amplitudes is contained in the vertex amplitude which, in the language of feynman diagrams, can be interpreted as an interaction. this analogy is indeed realized in the formulation of spin foam models in terms of a group field theory (gft) [155, 156]. the gft formulation resolves, by definition, the two fundamental conceptual problems of the spin foam approach: diffeomorphism gauge symmetry and discretization dependence. the difficulties are shifted to the question of the physical role of λ and the convergence of the corresponding perturbative series. this idea has been studied in more detail in three dimensions. in ref. [157], the scaling properties of the modification of the boulatov group field theory introduced in ref. [158] were studied in detail. in a further modification of the previous model (known as colored tensor models [159], new techniques based on a suitable 1/n expansion imply that amplitudes are dominated by spherical topology [160]. moreover, it seems possible that the continuum limit might be critical as in certain matrix models [161–165]. however, it is not yet clear if there is a sense in which these models correspond to a physical theory. the naive interpretation of the models is that they correspond to a formulation of 3d quantum gravity including a dynamical topology. vii. results on the semiclassical limit of eprl-fk models having introduced the relevant spin foam models in the previous sections, we now present the results of the large spin asymptotics of the spin foam amplitudes suggesting that on a fixed discretization the semiclassical limit of the eprl-fk models is given by regge’s discrete formulation of general relativity [80, 166]. the semiclassical limit of spin foams is based on the study of the the large spin limit asymptotic behavior of coherent state spin foam amplitudes. the notion of large spin can be defined by the rescaling of quantum numbers and planck constant according to j → λj and ~ → ~/λ and taking λ >> 1. in this limit, the quantum geometry approximates the classical one when tested with suitable states (e.g., coherent states). however, the geometry remains discrete during this limiting process as the limit is taken on a fixed regulating cellular structure. that is why one usually makes a clear distinction between semiclassical limit and the continuum limit. in the semiclassical analysis presented here, one can only hope to make contact with discrete formulations of classical gravity. hence, the importance of regge calculus in the discussion of this section. the key technical ingredient in this analysis is the representation of spin foam amplitudes in terms of the coherent state basis introduced in subsection i. here we follow refs. 
[80, 166–169]. the idea of using coherent states and discrete effective actions for the study of the large spin asymptotics of spin foam amplitudes was put forward in refs. [170, 171]. the study of the large spin asymptotics has a long tradition in the context of quantum gravity, dating back to the study of ponzano-regge [26]. more directly related to our discussion, here are the early works [172,173]. the key idea is to use asymptotic stationary phase methods for the amplitudes written in terms of the discrete actions presented in the previous section. in this section, we review the results of the analysis of the large spin asymptotics of the eprl vertex amplitude for both the riemannian and lorentztian models. we follow the notation and terminology of ref. [80] and related papers. b. su(2) 15j-symbol asymptotics as su (2) bf theory is quite relevant for the construction of the eprl-fk models, the study of the large spin asymptotics of the su (2) vertex amplitude is a key ingredient in the analysis of [80]. the coherent state vertex amplitude is 15j(j, n) (104) = ∫ 5 ∏ a=1 dga ∏ 1≤a≤b≤5 〈nab|g −1 a gb|nba〉 2jab , which depends on 10 spins jab and 20 normals nab 6= nba. the previous amplitude can be expressed as 040004-27 papers in physics, vol. 4, art. 040004 (2012) / a. perez 15j(j, n) = ∫ 5 ∏ a=1 dga ∏ 1≤a≤b≤5 exp sj,n[g], (105) sj,n[g] = 5 ∑ a 1 [80]. the first term in the vertex asymptotics is in essence the expected one: it is the analog of the 6j symbol asymptotics in three-dimensional spin foams. due to their explicit dependence on the immirzi parameter, the last two terms are somewhat strange from the theoretical point of view of the continuum field. however, this seems to be a peculiarity of the riemannian theory alone, as shown by the results of ref. [166] for the lorentzian models. non-geometric configurations are exponentially suppressed 040004-28 papers in physics, vol. 4, art. 040004 (2012) / a. perez d. lorentzian eprl model to each solution, one can associate a second solution corresponding to a parity related 4-simplex and, consequently, the asymptotic formula has two terms. it is given, up to a global sign, by the expression aeprlv ∼ 1 λ12 [ n+ exp ( iλγ ∑ a> 1. if configurations are geometric (i.e., regge-like), one has two kinds of contributions to the amplitude asymptotics: those coming from degenerate and non-degenerate configurations. if one (by hand) restricts to the nondegenerate configurations, then one has w γ ∆⋆ (jf ) ∼ c λ(33ne−6nv−4nf ) × exp(iλse regge (∆⋆, jf )), (113) where ne, nv, and nf denote the number of edges, vertices, and faces in the two complex ∆⋆, respectively. there are recent works by m. han in which asymptotics of general simplicial geometry amplitudes are studied in the context of the eprl model [174, 175]. the problem of computing the two-point function and higher correlation functions in the context of spin foam has received a lot of attention recently. the framework for the definition of the correlation functions in the background independent setting has been generally discussed by rovelli in ref. [176], and correspods to a special application of a more general proposal investigated by oeckl [177–184]. it was then applied to the barrett-crane model in refs. [185–187], where it was discovered that certain components of the twopoint function could not yield the expected result compatible with regge gravity in the semiclassical limit. 
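as a tiny illustration of the large spin mechanism at work in these asymptotic statements (a toy script with arbitrary numbers, not a computation from the literature): under the rescaling j → λj, the elementary coherent-state overlap raised to the power 2λj is exponentially suppressed unless the two normals coincide, which is the basic reason why only configurations solving the stationary phase conditions survive as λ → ∞.

```python
# a toy illustration (my own numbers) of the large spin suppression: for two
# normals differing by an angle, |<n|n'>|^{2 λ j} decays exponentially in λ.
import numpy as np

def spin_half_state(theta, phi):
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

j, angle = 1.0, 0.3                                  # spin and the mismatch angle
ov = abs(np.vdot(spin_half_state(0, 0), spin_half_state(angle, 0)))
for lam in [1, 10, 100, 1000]:
    print(lam, ov ** (2 * lam * j))                  # decays exponentially in λ
```

we now come back to the barrett-crane two-point function result recalled above.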
this was used as the main motivation for the weakening of the imposition of the plebanski constraints, leading to the new models. soon thereafter, it was argued that the difficulties of the barrett-crane model where indeed absent in the eprl model [188]. the two-point function for the eprl model was calculated in ref. [189] and it was shown to produce a result in agreement with that of regge calculus [190,191], in the limit γ → 0. the fact that, for the new model, the double scaling limit γ → 0 and j → ∞ with γj= constant defines the appropriate regime where the fluctuation behave as in regge gravity (in the leading order) has been further clarified in ref. [192]. this indicates that the quantum fluctuations in the new models are more general than simply metric fluc040004-29 papers in physics, vol. 4, art. 040004 (2012) / a. perez tuations. the fact that the new models are not metric at all scales should not be surprising as we know that the plebanski constraints that produce metric general relativity out of bf theory have been implemented only semiclassically (in the large spin limit). at the deep planckian regime, fluctuations are more general than metric. however, it is not clear at this stage why this is controlled by the immirzi parameter. all the previous calculations involve a complex with a single four-simplex. the first computation involving more than one simplex was performed in refs. [187, 193], for the case of the barrett-crane model. certain peculiar properties were found and it is not clear at this stage whether these issues remain in the eprl model. higher order correlation functions have been computed in ref. [194], the results are in agreement with regge gravity in the γ → 0 limit. viii. acknowledgements i would like to thank the help from many people in the field that have helped me in various ways. i am grateful to eugenio bianchi, carlo rovelli and simone speziale for the many discussions on aspects and details of the recent literature. many detailed calculations that contributed to the presentation of the new models in this review were done in collaboration with mercedes velázquez to whom i would like to express my gratitude. i would also like to thank you ding, florian conrady, laurent freidel, muxin han, merced montesinos for their help and valuable interaction. [1] c rovelli, quantum gravity, cambridge university press, cambridge (uk) (2004), pag. 480. [2] t thiemann, modern canonical quantum general relativity, cambridge university press, cambridge (uk) (2007), pag. 819. [3] a ashtekar, j lewandowski, background independent quantum gravity: a status report, class. quant. grav. 21, r53 (2004). [4] j engle, r pereira, c rovelli, the loopquantum-gravity vertex-amplitude, phys. rev. lett. 99, 161301 (2007). [5] j engle, e livine, r pereira, c rovelli, lqg vertex with finite immirzi parameter, nucl. phys. b 799, 136 (2008). [6] l freidel, k krasnov, a new spin foam model for 4d gravity, class. quant. grav. 25, 125018 (2008). [7] c rovelli, zakopane lectures on loop gravity, arxiv:1102.3660 (2011). [8] a perez, the spin foam approach to quantum gravity, liv. rev. rel. (in press). [9] j c baez, an introduction to spin foam models of quantum gravity and bf theory, lect. notes phys. 543, 25 (2000). [10] r oeckl, discrete gauge theory: from lattices to tqft, imperial college press, london (uk) (2005), pag. 202. [11] r oeckl, h pfeiffer, the dual of pure nonabelian lattice gauge theory as a spin foam model, nucl. phys. b 598, 400 (2001). 
[12] f girelli, r oeckl, a perez, spin foam diagrammatics and topological invariance, class. quant. grav. 19, 1093 (2002). [13] d yetter l crane, a categorical construction of 4-d topological quantum field theories, in: quantum topology, eds. l kaufmann, r baadhio, pag. 120, world scientific, singapore (1993). [14] d n yetter, l crane, l kauffman, state-sum invariants of 4-manifolds, j. knot theor. ramif. 6, 177 (1997). [15] j c baez, a perez, quantization of strings and branes coupled to bf theory, adv. theor. math. phys. 11, 3 (2007). [16] w j fairbairn, a perez, extended matter coupled to bf theory, phys. rev. d, 78, 024013 (2008). [17] m montesinos, a perez, two-dimensional topological field theories coupled to fourdimensional bf theory, phys. rev. d 77, 104020 (2008). [18] g ’t hooft, a locally finite model for gravity, found. phys. 38, 733 (2008). 040004-30 papers in physics, vol. 4, art. 040004 (2012) / a. perez [19] l freidel, j kowalski-glikman, a starodubtsev, particles as wilson lines of gravitational field, phys. rev. d 74, 084002 (2006). [20] e r livine, a perez, c rovelli, 2d manifoldindependent spinfoam theory, class. quant. grav. 20, 4425 (2003). [21] r jackiw, liouville field theory: a twodimensional model for gravity? in: quantum theory of gravity, eds. s m christensen, b s dewitt, pag. 403, adam hilger ltd., bristol (1984). [22] c teitelboim, the hamiltonian structure of two-dimensional space-time and its relation with the conformal anomaly, in: quantum theory of gravity, eds. s m christensen, b s dewitt, pag. 403, adam hilger ltd., bristol (1984). [23] c p constantinidis, o piguet, a perez, quantization of the jackiw-teitelboim model, phys. rev. d 79, 084007 (2009). [24] d oriti, c rovelli, s speziale, spinfoam 2d quantum gravity and discrete bundles, class. quant. grav. 22, 85 (2005). [25] s carlip, quantum gravity in 2+1 dimensions, cambridge university press, cambridge (uk) (1998), pag. 276. [26] t regge, g ponzano, semiclassical limit of racah coeficients, in: spectroscopy and group theoretical methods in physics, eds. f block et al., north-holland, amsterdam (1968). [27] j w barrett, i naish-guzman, the ponzanoregge model, class. quant. grav. 26, 155014 (2009). [28] k noui, a perez, three dimensional loop quantum gravity: physical scalar product and spin foam models, class. quant. grav. 22, 1739 (2005). [29] l freidel, d louapre, diffeomorphisms and spin foam models, nucl. phys. b 662, 279 (2003). [30] v bonzom, m smerlak, bubble divergences from cellular cohomology, lett. math. phys. 93, 295 (2010). [31] v bonzom, m smerlak, bubble divergences from twisted cohomology, arxiv:1008.1476 (2010). [32] v bonzom, m smerlak, bubble divergences: sorting out topology from cell structure, ann. henri poincare 13, 185 (2012). [33] o y viro, v g turaev, statesum invariants of 3-manifolds and quantum 6j-symbols, topology 31, 865 (1992). [34] j w barrett, j m garcia-islas, j f martins, observables in the turaev-viro and craneyetter models, j. math. phys. 48, 093508 (2007). [35] k noui, a perez, observability and geometry in three dimensional quantum gravity, in: quantum theory and symmetries, eds. p c argyres et al., pag. 641, world scientific, singapore (2004). [36] k noui, a perez, three dimensional loop quantum gravity: coupling to point particles, class. quant. grav. 22, 4489 (2005). [37] l freidel, d louapre, ponzano-regge model revisited. i: gauge fixing, observables and interacting spinning particles, class. quant. grav. 21, 5685 (2004). 
[38] l freidel, e r livine, ponzano-regge model revisited. iii: feynman diagrams and effective field theory, class. quant. grav. 23, 2021 (2006).
[39] l freidel, e r livine, effective 3d quantum gravity and non-commutative quantum field theory, phys. rev. lett. 96, 221301 (2006).
[40] w j fairbairn, fermions in three-dimensional spinfoam quantum gravity, gen. rel. grav. 39, 427 (2007).
[41] r j dowdall, w j fairbairn, observables in 3d spinfoam quantum gravity with fermions, gen. rel. grav. 43, 1263 (2011).
[42] s speziale, coupling gauge theory to spinfoam 3d quantum gravity, class. quant. grav. 24, 5139 (2007).
[43] w j fairbairn, e r livine, 3d spinfoam quantum gravity: matter as a phase of the group field theory, class. quant. grav. 24, 5277 (2007).
[44] e r livine, r oeckl, three-dimensional quantum supergravity and supersymmetric spin foam models, adv. theor. math. phys. 7, 951 (2004).
[45] v baccetti, e r livine, j p ryan, the particle interpretation of n = 1 supersymmetric spin foams, class. quant. grav. 27, 225022 (2010).
[46] v bonzom, e r livine, yet another recursion relation for the 6j-symbol, arxiv:1103.3415 (2011).
[47] m dupuis, e r livine, the 6j-symbol: recursion, correlations and asymptotics, class. quant. grav. 27, 135003 (2010).
[48] v bonzom, e r livine, s speziale, recurrence relations for spin foam vertices, class. quant. grav. 27, 125002 (2010).
[49] e r livine, s speziale, a new spinfoam vertex for quantum gravity, phys. rev. d 76, 084028 (2007).
[50] t thiemann, coherent states on graphs, prepared for 9th marcel grossmann meeting on recent developments in theoretical and experimental general relativity, gravitation and relativistic field theories (mg 9), rome (italy), 2-9 july (2000).
[51] t thiemann, gauge field theory coherent states (gcs). i: general properties, class. quant. grav. 18, 2025 (2001).
[52] h sahlmann, t thiemann, o winkler, coherent states for canonical quantum general relativity and the infinite tensor product extension, nucl. phys. b 606, 401 (2001).
[53] t thiemann, o winkler, gauge field theory coherent states (gcs) 2. peakedness properties, class. quant. grav. 18, 2561 (2001).
[54] t thiemann, o winkler, gauge field theory coherent states (gcs) 3. ehrenfest theorems, class. quant. grav. 18, 4629 (2001).
[55] t thiemann, o winkler, gauge field theory coherent states (gcs) 4. infinite tensor product and thermodynamical limit, class. quant. grav. 18, 4997 (2001).
[56] t thiemann, complexifier coherent states for quantum general relativity, class. quant. grav. 23, 2063 (2006).
[57] b bahr, t thiemann, gauge-invariant coherent states for loop quantum gravity. i. abelian gauge groups, class. quant. grav. 26, 045011 (2009).
[58] b bahr, t thiemann, gauge-invariant coherent states for loop quantum gravity. ii. non-abelian gauge groups, class. quant. grav. 26, 045012 (2009).
[59] c flori, t thiemann, semiclassical analysis of the loop quantum gravity volume operator. i. flux coherent states, arxiv:0812.1537 (2008).
[60] e bianchi, e magliaro, c perini, coherent spin-networks, phys. rev. d 82, 024012 (2010).
[61] f conrady, l freidel, quantum geometry from phase space reduction, j. math. phys. 50, 123510 (2009).
[62] e buffenoir, p roche, harmonic analysis on the quantum lorentz group, commun. math. phys. 207, 499 (1999).
[63] w ruhl, the lorentz group and harmonic analysis, w. a. benjamin inc., new york (1970).
[64] i m gelfand, generalized functions, academic press, new york (1966), vol. 5.
[65] i m gelfand, r a minlos, z ya shapiro, representations of the rotation and lorentz groups and their applications, pergamon press, oxford (1963).
[66] j w barrett, l crane, relativistic spin networks and quantum gravity, j. math. phys. 39, 3296 (1998).
[67] y ding, c rovelli, the volume operator in covariant quantum gravity, class. quant. grav. 27, 165003 (2010).
[68] y ding, m han, c rovelli, generalized spinfoams, phys. rev. d 83, 124020 (2011).
[69] s alexandrov, the new vertices and canonical quantization, phys. rev. d 82, 024024 (2010).
[70] c rovelli, s speziale, lorentz covariance of loop quantum gravity, phys. rev. d 83, 104029 (2011).
[71] w m wieland, twistorial phase space for complex ashtekar variables, class. quant. grav. 29, 045007 (2012).
[72] m dupuis, l freidel, e r livine, s speziale, holomorphic lorentzian simplicity constraints, arxiv:1107.5274 (2011).
[73] e r livine, s speziale, j tambornino, twistor networks and covariant twisted geometries, phys. rev. d 85, 064002 (2012).
[74] j c baez, spin foam models, class. quant. grav. 15, 1827 (1998).
[75] m bojowald, a perez, spin foam quantization and anomalies, gen. rel. grav. 42, 877 (2010).
[76] b bahr, f hellmann, w kaminski, m kisielowski, j lewandowski, operator spin foam models, class. quant. grav. 28, 105003 (2011).
[77] w kaminski, m kisielowski, j lewandowski, the eprl intertwiners and corrected partition function, class. quant. grav. 27, 165020 (2010).
[78] e alesci, e bianchi, e magliaro, c perini, asymptotics of lqg fusion coefficients, class. quant. grav. 27, 095016 (2010).
[79] e r livine, s speziale, consistently solving the simplicity constraints for spinfoam quantum gravity, europhys. lett. 81, 50004 (2008).
[80] j w barrett, r j dowdall, w j fairbairn, h gomes, f hellmann, asymptotic analysis of the eprl four-simplex amplitude, j. math. phys. 50, 112504 (2009).
[81] s alexandrov, simplicity and closure constraints in spin foam models of gravity, phys. rev. d 78, 044033 (2008).
[82] s alexandrov, spin foam model from canonical quantization, phys. rev. d 77, 024009 (2008).
[83] v bonzom, spin foam models for quantum gravity from lattice path integrals, phys. rev. d 80, 064028 (2009).
[84] v bonzom, from lattice bf gauge theory to area-angle regge calculus, class. quant. grav. 26, 155020 (2009).
[85] v bonzom, e r livine, a lagrangian approach to the barrett-crane spin foam model, phys. rev. d 79, 064034 (2009).
[86] m han, t thiemann, commuting simplicity and closure constraints for 4d spin foam models, arxiv:1010.5444 (2010).
[87] a baratin, c flori, t thiemann, the holst spin foam model via cubulations, arxiv:0812.4055 (2008).
[88] m dupuis, e r livine, revisiting the simplicity constraints and coherent intertwiners, class. quant. grav. 28, 085001 (2011).
[89] l freidel, e r livine, u(n) coherent states for loop quantum gravity, j. math. phys. 52, 052502 (2011).
[90] l freidel, e r livine, the fine structure of su(2) intertwiners from u(n) representations, j. math. phys. 51, 082502 (2010).
[91] e f borja, l freidel, i garay, e r livine, u(n) tools for loop quantum gravity: the return of the spinor, class. quant. grav. 28, 055005 (2011).
[92] e r livine, j tambornino, spinor representation for loop quantum gravity, j. math. phys. 53, 012503 (2012).
[93] b dittrich, j p ryan, simplicity in simplicial phase space, phys. rev. d 82, 064026 (2010).
[94] j engle, r pereira, regularization and finiteness of the lorentzian lqg vertices, phys. rev. d 79, 084034 (2009).
[95] l liu, m montesinos, a perez, a topological limit of gravity admitting an su(2) connection formulation, phys. rev. d 81, 064033 (2010).
[96] y ding, c rovelli, physical boundary hilbert space and volume operator in the lorentzian new spin-foam theory, class. quant. grav. 27, 205003 (2010).
[97] w kaminski, j lewandowski, t pawlowski, quantum constraints, dirac observables and evolution: group averaging versus schroedinger picture in lqc, class. quant. grav. 26, 245016 (2009).
[98] w kaminski, m kisielowski, j lewandowski, spin-foams for all loop quantum gravity, class. quant. grav. 27, 095006 (2010).
[99] l freidel, e r livine, spin networks for noncompact groups, j. math. phys. 44, 1322 (2003).
[100] s alexandrov, e r livine, su(2) loop quantum gravity seen from covariant theory, phys. rev. d 67, 044009 (2003).
[101] e r livine, projected spin networks for lorentz connection: linking spin foams and loop gravity, class. quant. grav. 19, 5525 (2002).
[102] s alexandrov, e buffenoir, p roche, plebanski theory and covariant canonical formulation, class. quant. grav. 24, 2809 (2007).
[103] s alexandrov, reality conditions for ashtekar gravity from lorentz-covariant formulation, class. quant. grav. 23, 1837 (2006).
[104] s alexandrov, hilbert space structure of covariant loop quantum gravity, phys. rev. d 66, 024028 (2002).
[105] s alexandrov, choice of connection in loop quantum gravity, phys. rev. d 65, 024011 (2002).
[106] s alexandrov, so(4,c)-covariant ashtekar-barbero gravity and the immirzi parameter, class. quant. grav. 17, 4255 (2000).
[107] s alexandrov, i grigentch, d vassilevich, su(2)-invariant reduction of the 3+1 dimensional ashtekar's gravity, class. quant. grav. 15, 573 (1998).
[108] m dupuis, e r livine, lifting su(2) spin networks to projected spin networks, phys. rev. d 82, 064044 (2010).
[109] b bahr, on knottings in the physical hilbert space of lqg as given by the eprl model, class. quant. grav. 28, 045002 (2011).
[110] e buffenoir, m henneaux, k noui, ph roche, hamiltonian analysis of plebanski theory, class. quant. grav. 21, 5203 (2004).
[111] j engle, m han, t thiemann, canonical path integral measures for holst and plebanski gravity. i. reduced phase space derivation, class. quant. grav. 27, 245014 (2010).
[112] m han, canonical path-integral measures for holst and plebanski gravity. ii. gauge invariance and physical inner product, class. quant. grav. 27, 245015 (2010).
[113] e bianchi, d regoli, c rovelli, face amplitude of spinfoam quantum gravity, class. quant. grav. 27, 185009 (2010).
[114] k noui, a perez, three dimensional loop quantum gravity: physical scalar product and spin foam models, class. quant. grav. 22, 1739 (2005).
[115] v bonzom, l freidel, the hamiltonian constraint in 3d riemannian loop quantum gravity, class. quant. grav. 28, 195006 (2011).
[116] v bonzom, a taste of hamiltonian constraint in spin foam models, arxiv:1101.1615 (2011).
[117] b dittrich, s speziale, area-angle variables for general relativity, new j. phys. 10, 083006 (2008).
[118] e alesci, k noui, f sardelli, spin-foam models and the physical scalar product, phys. rev. d 78, 104009 (2008).
[119] e r livine, d oriti, j p ryan, effective hamiltonian constraint from group field theory, class. quant. grav. 28, 245010 (2011).
[120] m han, t thiemann, on the relation between operator constraint, master constraint, reduced phase space, and path integral quantisation, class. quant. grav. 27, 225019 (2010).
[121] m han, t thiemann, on the relation between rigging inner product and master constraint direct integral decomposition, j. math. phys. 51, 092501 (2010).
[122] m han, a path-integral for the master constraint of loop quantum gravity, class. quant. grav. 27, 215009 (2010).
[123] b dittrich, p a hohn, from covariant to canonical formulations of discrete gravity, class. quant. grav. 27, 155001 (2010).
[124] f conrady, j hnybida, unitary irreducible representations of sl(2,c) in discrete and continuous su(1,1) bases, j. math. phys. 52, 012501 (2011).
[125] f conrady, spin foams with timelike surfaces, class. quant. grav. 27, 155014 (2010).
[126] f conrady, j hnybida, a spin foam model for general lorentzian 4-geometries, class. quant. grav. 27, 185011 (2010).
[127] a perez, c rovelli, 3+1 spinfoam model of quantum gravity with spacelike and timelike components, phys. rev. d 64, 064002 (2001).
[128] d oriti, h pfeiffer, a spin foam model for pure gauge theory coupled to quantum gravity, phys. rev. d 66, 124010 (2002).
[129] m han, c rovelli, spinfoam fermions: pct symmetry, dirac determinant, and correlation functions, arxiv:1101.3264 (2011).
[130] e bianchi et al., spinfoam fermions, arxiv:1012.4719 (2010).
[131] s alexander, a marciano, r a tacchi, towards a spin-foam unification of gravity, yang-mills interactions and matter fields, arxiv:1105.3480 (2011).
[132] j w barrett, l crane, a lorentzian signature model for quantum general relativity, class. quant. grav. 17, 3101 (2000).
[133] k noui, p roche, cosmological deformation of lorentzian spin foam models, class. quant. grav. 20, 3175 (2003).
[134] y ding, m han, on the asymptotics of quantum group spinfoam model, arxiv:1103.1597 (2011).
[135] m han, 4-dimensional spin-foam model with quantum lorentz group, j. math. phys. 52, 072501 (2011).
[136] w j fairbairn, c meusburger, quantum deformation of two four-dimensional spin foam models, j. math. phys. 53, 022501 (2012).
[137] m han, cosmological constant in lqg vertex amplitude, arxiv:1105.2212 (2011).
[138] e bianchi, t krajewski, c rovelli, f vidotto, cosmological constant in spinfoam cosmology, phys. rev. d 83, 104015 (2011).
[139] f vidotto, spinfoam cosmology: quantum cosmology from the full theory, arxiv:1011.4705 (2010).
[140] a henderson, c rovelli, f vidotto, e wilson-ewing, local spinfoam expansion in loop quantum cosmology, class. quant. grav. 28, 025003 (2011).
[141] e bianchi, c rovelli, f vidotto, towards spinfoam cosmology, phys. rev. d 82, 084035 (2010).
[142] c rovelli, f vidotto, on the spinfoam expansion in cosmology, class. quant. grav. 27, 145005 (2010).
[143] c rovelli, f vidotto, stepping out of homogeneity in loop quantum cosmology, class. quant. grav. 25, 225024 (2008).
[144] m bojowald, loop quantum cosmology, liv. rev. rel. 8, 11 (2005).
[145] a ashtekar, m campiglia, a henderson, path integrals and the wkb approximation in loop quantum cosmology, phys. rev. d 82, 124043 (2010).
[146] a ashtekar, m campiglia, a henderson, casting loop quantum cosmology in the spin foam paradigm, class. quant. grav. 27, 135020 (2010).
[147] a ashtekar, m campiglia, a henderson, loop quantum cosmology and spin foams, phys. lett. b 681, 347 (2009).
[148] m campiglia, a henderson, w nelson, vertex expansion for the bianchi i model, phys. rev. d 82, 064036 (2010).
[149] k krasnov, renormalizable non-metric quantum gravity?, arxiv:hep-th/0611182 (2006).
[150] k krasnov, on deformations of ashtekar's constraint algebra, phys. rev. lett. 100, 081102 (2008).
[151] k krasnov, plebanski gravity without the simplicity constraints, class. quant. grav. 26, 055002 (2009).
[152] k krasnov, gravity as bf theory plus potential, int. j. mod. phys. a 24, 2776 (2009).
[153] k krasnov, metric lagrangians with two propagating degrees of freedom, europhys. lett. 89, 30002 (2010).
[154] s speziale, bi-metric theory of gravity from the non-chiral plebanski action, phys. rev. d 82, 064003 (2010).
[155] m p reisenberger, c rovelli, spacetime as a feynman diagram: the connection formulation, class. quant. grav. 18, 121 (2001).
[156] m p reisenberger, c rovelli, spin foams as feynman diagrams, in: 2001, a relativistic spacetime odyssey, eds. i ciufolini, d dominici, l lusanna, pag. 431, world scientific, singapore (2003).
[157] j magnen, k noui, v rivasseau, m smerlak, scaling behaviour of three-dimensional group field theory, class. quant. grav. 26, 185012 (2009).
[158] l freidel, d louapre, non-perturbative summation over 3d discrete topologies, phys. rev. d 68, 104004 (2003).
[159] r gurau, colored group field theory, commun. math. phys. 304, 69 (2011).
[160] r gurau, the 1/n expansion of colored tensor models, ann. henri poincare 12, 829 (2011).
[161] r gurau, a generalization of the virasoro algebra to arbitrary dimensions, nucl. phys. b 852, 592 (2011).
[162] v bonzom, r gurau, a riello, v rivasseau, critical behavior of colored tensor models in the large n limit, nucl. phys. b 853, 174 (2011).
[163] r gurau, the complete 1/n expansion of colored tensor models in arbitrary dimension, ann. henri poincare 13, 399 (2011).
[164] r gurau, v rivasseau, the 1/n expansion of colored tensor models in arbitrary dimension, europhys. lett. 95, 50004 (2011).
[165] j p ryan, tensor models and embedded riemann surfaces, phys. rev. d 85, 024010 (2012).
[166] j w barrett, r j dowdall, w j fairbairn, f hellmann, r pereira, lorentzian spin foam amplitudes: graphical calculus and asymptotics, class. quant. grav. 27, 165009 (2010).
[167] j w barrett, r j dowdall, w j fairbairn, h gomes, f hellmann, a summary of the asymptotic analysis for the eprl amplitude, in: aip conf. proc. 1196, pag. 36 (2009).
[168] j w barrett, w j fairbairn, f hellmann, quantum gravity asymptotics from the su(2) 15j symbol, int. j. mod. phys. a 25, 2897 (2010).
[169] j w barrett et al., asymptotics of 4d spin foam models, gen. relat. gravit. 43, 2421 (2011).
[170] f conrady, l freidel, on the semiclassical limit of 4d spin foam models, phys. rev. d 78, 104023 (2008).
[171] f conrady, l freidel, path integral representation of spin foam models of 4d gravity, class. quant. grav. 25, 245010 (2008).
[172] j w barrett, ch m steele, asymptotics of relativistic spin networks, class. quant. grav. 20, 1341 (2003).
[173] j w barrett, r m williams, the asymptotics of an amplitude for the 4-simplex, adv. theor. math. phys. 3, 209 (1999).
[174] m han, m zhang, asymptotics of spinfoam amplitude on simplicial manifold: euclidean theory, arxiv:1109.0500 (2011).
[175] m han, m zhang, asymptotics of spinfoam amplitude on simplicial manifold: lorentzian theory, arxiv:1109.0499 (2011).
[176] c rovelli, graviton propagator from background-independent quantum gravity, phys. rev. lett. 97, 151301 (2006).
[177] r oeckl, affine holomorphic quantization, arxiv:1104.5527 (2011).
[178] r oeckl, observables in the general boundary formulation, in: quantum field theory and gravity, eds. f finster et al., pag. 137, birkhäuser, basel (2012).
[179] r oeckl, holomorphic quantization of linear field theory in the general boundary formulation, arxiv:1009.5615 (2010).
[180] d colosi, r oeckl, on unitary evolution in quantum field theory in curved spacetime, open nucl. part. phys. j. 4, 13 (2011).
[181] d colosi, r oeckl, states and amplitudes for finite regions in a two-dimensional euclidean quantum field theory, j. geom. phys. 59, 764 (2009).
[182] d colosi, r oeckl, spatially asymptotic s-matrix from general boundary formulation, phys. rev. d 78, 025020 (2008).
[183] d colosi, r oeckl, s-matrix at spatial infinity, phys. lett. b 665, 310 (2008).
[184] r oeckl, probabilities in the general boundary formulation, j. phys. conf. ser. 67, 012049 (2007).
[185] e alesci, c rovelli, the complete lqg propagator. ii. asymptotic behavior of the vertex, phys. rev. d 77, 044024 (2008).
[186] e alesci, c rovelli, the complete lqg propagator. i. difficulties with the barrett-crane vertex, phys. rev. d 76, 104012 (2007).
[187] e bianchi, l modesto, c rovelli, s speziale, graviton propagator in loop quantum gravity, class. quant. grav. 23, 6989 (2006).
[188] e alesci, e bianchi, c rovelli, lqg propagator: iii. the new vertex, class. quant. grav. 26, 215001 (2009).
[189] e bianchi, e magliaro, c perini, lqg propagator from the new spin foams, nucl. phys. b 822, 245 (2009).
[190] e bianchi, a satz, semiclassical regime of regge calculus and spin foams, nucl. phys. b 808, 546 (2009).
[191] e magliaro, c perini, comparing lqg with the linearized theory, int. j. mod. phys. a 23, 1200 (2008).
[192] e magliaro, c perini, regge gravity from spinfoams, arxiv:1105.0216 (2011).
[193] d mamone, c rovelli, second-order amplitudes in loop quantum gravity, class. quant. grav. 26, 245013 (2009).
[194] c rovelli, m zhang, euclidean three-point function in loop and perturbative gravity, class. quant. grav. 28, 175010 (2011).