Mathematical Problems of Computer Science 46, 117–125, 2016. The Parallel Simulation Method for d-dimensional Abelian Sandpile Automata Hayk E. Nahapetyan, Suren S. Poghosyan, Vahagn S. Poghosyan and Yuri H. Shoukourian Institute for Informatics and Automation Problems of NAS RA e-mail: povahagn@gmail.com, shouk@sci.am Abstract In this paper, the star-packing problem introduced in [1] for a square lattice is generalized for d-dimensional lattice Ld, d ∈ N. The problem is to pack the lattice Ld with star graphs S2d. Using the solution of this problem, a parallel algorithm for the simulation of d-dimensional cellular automata is developed. As an example of cellular automata, the relaxation process of unstable states of Abelian sandpile model is considered. Appropriate software packages have been developed using OpenMP and CUDA technologies. The parallel simulation results, carried out for 3-dimensional lattices of different sizes, are presented. Keywords: Abelian sandpile model, Dense packing problem, Parallel algorithm, Cellular automata. 1. Introduction Cellular automata (CA) are discrete models studied in computability theory, mathematics, physics, complexity science, theoretical biology and microstructure modeling. Modern mul- ticore computers with shared memory and multiprocessor clusters with distributed memory make it possible to efficiently parallelize simulation algorithms for CA. In the work [2], a cluster-based parallel algorithm and a package to simulate two-dimensional CA is introduced. The concept of self-organized criticality was first introduced by Bak, Tang and Wiesenfeld in 1987 [4], and gave rise to growing interest in the study of self-organizing systems. Bak et al. argued that in many natural phenomena, the dissipative dynamics of the system is such that it drives the system to a critical state, thereby leading to ubiquitous power law behaviors. This mechanism has been invoked to understand the power law distributions observed in turbulent fluids, earthquakes, distribution of visible matter in the universe, solar flares and surface roughening of growing interfaces. The Sandpile models, being a class of cellular automata, are among the simplest theo- retical models which show self-organized criticality. A special subclass of interest consists of so called Abelian sandpile models (ASM). The Abelian property means that the final stable state of the CA is independent of the order in which the updates of cells are carried out. This property plays a key role during the numerical, as well as analytical studies of the ASM [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]. In numerous works, in the absence of analytical relations for 117 118 The Parallel Simulation Method for d-dimensional Abelian Sandpile Automata various physical characterizers, different methods of statistical analysis have been applied for calculation or verification of hypothetical and analytical expressions for sandpile models. The luck of such systems, also the demand for accurate calculations of physical observables would dramatically increase the acceptable sizes of lattice as well as the time constraints for calculations. An approach of efficient parallel simulation of sandpile model on multicore computers with shared memory, which is based on the idea that there are few active cells (unstable vertices) at each moment of time, is given in [3]. In general, when the initial unsta- ble configuration contains many unstable vertices, this approach does not give good results for big lattices. In what follows we present a more efficient parallel algorithm for the sim- ulation of d-dimensional CA implemented on systems with shared memory, which is based on the star-packing of the d-dimensional lattices. A solution to the star-packing problem is given in [1] for the 2-dimensional square lattice. In this paper we generalize the solution to the case of general d dimensions. Below a cellular automaton for implementing the Abelian Sandpile model on a d-dimensional lattice is considered aiming at the development of parallel simulation. 2. The Star-Packing Problem Consider a d-dimensional lattice Ld of linear size n with periodic boundary conditions, which is a graph with the vertex set V = V (Ld) = {v = (x1, x2, . . . , xd) : xi = 0, 1, 2, . . . , n − 1; i = 1, 2, . . . , d} (1) and the set of undirected edges E = E(Ld) defined by the following rule. Each vertex v = (x1, x2, . . . , xd) ∈ V is connected with 2d vertices given by adjacency list Adj(v) = {(x1, x2, . . . , (xk ± 1) mod n, . . . , xd) : k = 1, 2, . . . , d}. (2) The number of vertices is |V | = nd, and the number of edges is |E| = d nd. Two different vertices v1 and v2 are called independent, if they are not adjacent and they do not have any common adjacent vertex: v1 /∈ Adj(v2), v2 /∈ Adj(v1) and Adj(v1) ∩ Adj(v2) = ∅. (3) The star-packing problem is to pack the lattice Ld with non-overlapping star graphs S2d. Consider a star packing with a set of central vertices V ′ ⊂ V of stars. The following two rules define the set V ′: 1. Full coverage condition: ( ∪ v∈V ′ Adj(v) )∪ V ′ = V. 2. Non-overlapping condition (independence of stars): Any two different vertices v1, v2 ∈ V ′ are independent. It is obvious, that a star packing exists if |V | is dividable by 2d + 1. Given a vertex v = (x1, x2, . . . , xd) ∈ V and a vector r⃗ = (p1, p2, . . . , pd) ∈ Z. Define the sum v + r⃗ to be a vertex in V with coordinates v + r⃗ ≡ ((x1 + p1) mod n, (x2 + p2) mod n, . . . , (xd + pd) mod n) ∈ V. (4) H. Nahapetyan, Su. Poghosyan, V. Poghosyan and Yu. Shoukourian 119 Equivalently, the sum of a vector r⃗ and an arbitrary subset V ′ ⊆ V to be the following subset of V V ′ + r⃗ ≡ {v + r⃗ : v ∈ V ′} ⊆ V. (5) Note that, given a star packing defined by a set V ′, the set V ′ + r⃗ also defines a star packing, where r⃗ ∈ Z. Theorem: Consider a d-dimensional lattice Ld of linear size n with periodic boundary conditions. Assume that n is dividable by 2d + 1. Then a star packing of Ld exists, and the set of central vertices of stars is V ′ = {v = (x1, x2, . . . , xd) : (x1 + 2x2 + 3x3 + . . . + d xd) mod (2d + 1) = 0, v ∈ V }. (6) Proof: Given a vector r⃗ = (p1, p2, . . . , pd) ∈ Z, define a function Q(r⃗) = p1 + 2p2 + 3p3 + . . . + d pd. (7) Let us introduce the d-dimensional basis vectors e⃗k = (0, 0, . . . , 0︸ ︷︷ ︸ k-1 , 1, 0, 0, . . . , 0︸ ︷︷ ︸ d-k ), k = 1, 2, 3, . . . , d, (8) and the null vector e⃗0 = (0, 0 . . . , 0). First, we prove the full coverage condition. Given an arbitrary vertex v ∈ V \ V ′, it is necessary to prove that there exists a vertex u ∈ Adj(v), which is a central vertex of some star, i.e., u ∈ V ′. In other words, ∀r⃗ = (p1, p2, . . . , pd) ∈ Z, 0 ≤ pi ≤ n − 1, (9) one of the following 3 conditions holds 1. Q(r⃗) mod (2d + 1) = 0. 2. ∃k = 1, 2, . . . , d; Q(r⃗ − e⃗k) mod (2d + 1) = 0. 3. ∃k = 1, 2, . . . , d; Q(r⃗ + e⃗k) mod (2d + 1) = 0. Assume that Q(r⃗) has the following form: Q(r⃗) = (2d + 1)s + t, 0 ≤ t < 2d + 1, s ≥ 0. (10) Let the value of k be k =   0, if t = 0; ⇒ Q(r⃗) mod (2d + 1) = 0 t, if t < d; ⇒ Q(r⃗ − e⃗k) mod (2d + 1) = 0 2d + 1 − t, if t > d; ⇒ Q(r⃗ + e⃗k) mod (2d + 1) = 0 , (11) which proves the coverage condition. Now we prove the non-overlapping condition. It states, ∀ r⃗ = (p1, p2, . . . , pd) ∈ Z, 0 ≤ pi ≤ n − 1, i = 1, 2, . . . , d, which satisfies the condition Q(r⃗) mod (2d + 1) = 0, (12) and ∀∆⃗r = (∆p1, ∆p2, . . . , ∆pd) with 0 < |∆p1| + |∆p2| + . . . + |∆pd| ≤ 2, (13) 120 The Parallel Simulation Method for d-dimensional Abelian Sandpile Automata we have Q(r⃗ + ∆⃗r) mod (2d + 1) ̸= 0. (14) Since the Q function is linear, we have to show that Q(∆⃗r) mod (2d + 1) ̸= 0. Then, it is sufficient to prove the inequality 0 < |Q(∆⃗r)| < 2d + 1. Note that there are 3 possible cases, which satisfy the condition (13): 1. ∃k0 = 1, 2, . . . , d, for which ∆pk0 = ±2 and ∆ps = 0, ∀s ̸= k0. Then 0 < |Q(∆⃗r)| = 2k0 < 2d + 1. (15) 2. ∃k0 = 1, 2, . . . , d, for which ∆pk0 = ±1 and ∆ps = 0, ∀s ̸= k0. Then 0 < |Q(∆⃗r)| = k0 < 2d + 1. (16) 3. ∃k1, k2 = 1, 2, . . . , d; k1 ̸= k2, for which ∆pk1 = ±1, ∆pk2 = ±1 and ∆ps = 0, ∀s ̸= k1, k2. Then 0 < |Q(∆⃗r)| = |k1∆pk1 + k2∆pk2| ≤ k1 + k2 < 2d + 1. (17) 2 Fig.1 and Fig.2 illustrate the star packing of the 2 and 3-dimensional lattices, respectively. Fig. 1. A star-packing of a 2-dimensional lattice. H. Nahapetyan, Su. Poghosyan, V. Poghosyan and Yu. Shoukourian 121 Fig. 2. A star-packing of a 3-dimensional lattice. The layer z=4 is depicted separately. 3. Abelian Sandpile Model Consider an undirected graph G = (V, E) with vertex set V = {v1, v2, . . . , vN} and edge set E. A random variable hi, which takes integer values, is attached to each vertex vi ∈ V , representing the height of the sand at that vertex. hmaxi denotes the maximum allowed height of vertex vi of the graph G. For d-dimensional lattice we take h max i = 2d−1. CT denotes the collection of heights hi, which defines a configuration of the system at a given discrete time T . A configuration is called stable, if all heights satisfy hi ≤ hmaxi . Vertex vi is called closed, if hmaxi = deg (vi) − 1, where deg (vi) indicates the number of adjacent vertices of vi. Vertex vi is called open, if h max i ≥ deg (vi). The dynamics of the system is defined by the following rules. Consider a stable configuration CT at a given time T . We add a grain of sand at a random vertex vi ∈ V by setting hi to hi + 1 (we assume that the vertex is chosen randomly with a uniform distribution on the set V ). This new configuration, if stable, defines CT+1. If hi > h max i , vi becomes unstable and topples losing h max i +1 grains of sand, while all neighbors of vi receive one grain. Note that if the vertex is open, the system loses grains. During the toppling of the closed vertices, the number of grains is conserved. Note also that toppling of a vertex may cause some of its neighboring vertices to become unstable. In this case those vertices also topple according to the same toppling rule. Once all unstable vertices have been toppled, a new stable configuration CT+1 is obtained. If the finite connected graph G has at least one open vertex, then all vertices become stable after finite number of topplings. Moreover, the new stable configuration is independent of the toppling order. Therefore, the dynamics is well defined. Let âi be an operator, which acts on sandpile configurations and adds a grain at vertex i. It can easily be shown that âiâj = âjâi. This is the reason why the sandpile model is called Abelian. 122 The Parallel Simulation Method for d-dimensional Abelian Sandpile Automata 4. Simulation Results In this section we present the simulation results of the implemented software with built-in parallelization algorithms. The Abelian sandpile model on the finite d-dimensional lattice of linear size n is considered. The number of nodes is N = nd. The value of maximal height at all vertices vi ∈ V is hmaxi = 2d − 1, i = 1, 2, . . . , N. To perform parallel simulation, the concurrently toppled unstable vertices should meet the requirement of being independent. Partitioning the set of vertices into the sets of independent vertices is carried out by star- packing discussed above. Let Twait be the time needed for a cell to finish its job, and H denotes the initial number of grains at each node. In order to obtain a parallel software environment, the OpenMP and CUDA technologies have been used, which was implemented on the CPU i7 2670QM by Intel and Geforce GT. Simulation results include 2 types of Cellular Automata simulating Sandpile model. Difference is that for the second system an additional job is introduced during each toppling, which takes Twait constant time to finish. Four types of implementations have been analyzed and compared: 1. Standard − topplings are implemented sequentially on CPU. There is a loop around all nodes, and once an unstable node is found, it becomes stable. 2. No OMP − topplings are implemented sequentially on CPU. By a parallelization algo- rithm, we partition the set of nodes into 7 sections for a 3-dimensional lattice. There are two types of loops, one type over all groups, and the second type over all vertices for each group. 3. OMP − topplings are implemented parallel for each group on CPU, also there is one loop over all groups. 4. GPU − topplings are implemented in parallel for each group on GPU, also there is one loop over all groups. The simulations have been done on the same CPU and GPU architecture. In Figs. 3-7 the simulation results are presented. 140 210 280 n 5 10 15 20 25 30 Computation time (ms) Standard no-OMP OMP GPU Fig. 3. Sandpile on a 3-dimensional lattice. Grains are added randomly with uniform distribution. Total amount of added grains is 10 × n3. H. Nahapetyan, Su. Poghosyan, V. Poghosyan and Yu. Shoukourian 123 n=7,H=6 n=7,H=12 n=14,H=6 n=14,H=12 100 200 300 400 Computation time (ms) Standard no-OMP OMP GPU Fig. 4. Sandpile on a 3-dimensional lattice with initial height H=6, H=12. Twait = 5ms. n=7,H=6 n=7,H=12 n=14,H=6 n=14,H=12 200 400 600 800 Computation time (ms) Standard no-OMP OMP GPU Fig. 5. Sandpile on a 3-dimensional lattice with initial height H=6, H=12. Twait = 10ms. n=21,H=6 n=21,H=12 n=42,H=6 n=42,H=12 10000 20000 30000 40000 50000 Computation time (ms) Standard no-OMP OMP GPU Fig. 6. Sandpile on a 3-dimensional lattice with start height H=6, H=12. Twait = 3ms. 5. Conclusion In this paper, a parallel algorithm for simulating d-dimensional CA has been described. The algorithm is based on the idea of star-packing of d-dimensional lattices. Delmas and Perennes in their paper [15] found a star packing for 3-dimensional lattice of linear size n = 7i with integer i. We found an explicit star packing of d-dimensional lattice, which meets weaker restriction on n, namely, n mod (2d + 1) = 0. As the results show, the algorithm developed gains about 8× speedup with openMP, and more than 50× on CUDA (depending on GPU architecture and cellular automata, it can be 100× and more faster than non-paralyzed 124 The Parallel Simulation Method for d-dimensional Abelian Sandpile Automata performance). Our next aim is to use algorithms for computing different observables of the sandpile model on d-dimensional lattices, meanwhile analytical calculations have been obtained for 2-dimension. References [1] V. S. Poghosyan, S. S. Poghosyan and H. E. Nahapetyan, “The investigation of models of self-organized systems by parallel programming methods based on the example of an abelian sandpile model”, Proc. CSIT Conference 2013, Yerevan Armenia, Sept. 23-27, pp. 260–262, 2013. [2] E. Davtyan, H. Karapetyan and K. Shahbazyan, “Software tool for cluster-based mod- eling of 2d cellular automata”, Proc. CSIT Conferance, Yerevan Armenia, 28 Sept. - 2 Oct., pp. 404–407, 2013. [3] S. Frehmel, “The sandpile model: parallelization of efficient algorithms for systems with shared memory”, ACRI, Lecture Notes in Computer Science 6350, pp. 35–45, 2010. [4] P. Bak, C. Tang and K. Wiesenfeld, “Self-organized criticality: An explanation of the 1/f noise”, Phys. Rev. Lett., vol.59, no. 4, pp. 381–384, 1987. [5] D. Dhar, “Self-organized critical state of sandpile automaton models”, Phys. Rev. Lett., vol. 64, no. 14, pp. 1613–1616, 1990. [6] D. Dhar, “Theoretical studies of self-organized criticality”, Physica A, vol. 369, no. 1, pp. 29–70, 2006. [7] P. Grassberger and S. S. Manna, “Some more sandpiles”, J. Phys. France, vol. 51, pp. 1077–1098, 1990. [8] V. S. Poghosyan, S. Y. Grigorev, V. B. Priezzhev and P. Ruelle, “Pair correlations in the sandpile model: A check of logarithmic conformal field theory”, Phys. Lett. B, vol. 659, pp. 768–772, 2008. [9] Su. S. Poghosyan, V. S. Poghosyan, V. B. Priezzhev and P. Ruelle, “Numerical study of correspondence between the dissipative and fixed-energy Abelian sandpile models”, Phys.Rev. E, 84, 066119, 2011. [10] A. Fey, L. Levine, and D.B. Wilson, “Driving Sandpiles to Criticality and Beyond”, Phys. Rev. Lett., 104, 145703, 2010. [11] A. Fey, L. Levine, and D. B. Wilson, “Approach to criticality in sandpiles”, Phys. Rev. E 82, 031121 (2010). [12] V. S. Poghosyan, S. Y. Grigorev, V. B. Priezzhev and P. Ruelle, “Logarithmic two-point correlators in the Abelian sandpile model”, J. Stat. Mech., vol. 2010, no. 07, P07025, 2010. [13] V. S. Poghosyan and V. B. Priezzhev, “The problem of predecessors on spanning trees”, Acta Polytechnica, vol. 51, no. 1, pp. 59–62, 2011. [14] S. N. Majumdar and D. Dhar, “Height correlations in the Abelian sandpile model”, J. Phys. A: Math. Gen., vol. 24, no. 7, L357–L362, 1991. [15] O.Delmas, S. Perennes, “Circuit-switched gossiping in 3-dimensional torus networks”, Proc. Euro-Par, Parallel-Processing, pp. 370–373, 1996. Submitted 10.06.2016, accepted 25.10.2016. H. Nahapetyan, S. Poghosyan, V. Poghosyan and Yu. Shoukourian 1 2 5 ²í³½³ÏáõÛïÇ d-ã³÷³ÝÇ ²µ»ÉÛ³Ý ³íïáÙ³ïÇ ½áõ·³Ñ»é ÙṻɳíáñÙ³Ý Ù»Ãá¹ Ð. ܳѳå»ïÛ³Ý, ê. äáÕáëÛ³Ý, ì. äáÕáëÛ³Ý ¨ Úáõ. ÞáõùáõñÛ³Ý ²Ù÷á÷áõÙ ²Ûë Ñá¹í³ÍáõÙ µ»ñí³Í ¿ ù³é³ÏáõëÇ ó³Ýó»ñÇ Ñ³Ù³ñ [1]-áõÙ Ý»ñÙáõÍí³Í ³ëïÕ³ÛÇÝ Í³ÍÏáõÛÃÇ ËݹñÇ ÁݹѳÝñ³óáõÙÁ L d, d 2 N d -ã³÷³ÝÇ ó³ÝóÇ Ñ³Ù³ñ: ÊݹÇñÁ ϳ۳ÝáõÙ ¿ S2d ·ñ³ýáí L d ó³ÝóÇ Í³ÍÏÙ³Ý Ù»ç: îíÛ³É ËݹñÇ ÉáõÍÙ³Ý ÑÇÙ³Ý íñ³ Ùß³Ïí»É ¿ d-ã³÷³ÝÇ µçç³ÛÇÝ ³íïáÙ³ïÇ ÙṻɳíáñÙ³Ý ½áõ·³Ñ»é³óí³Í ³É·áñÇÃÙ: àñå»ë µçç³ÛÇÝ ³íïáÙ³ïÇ ûñÇÝ³Ï ¿ ¹Çï³ñÏí»É ²µ»ÉÛ³Ý ³í³½³ÏáõÛïÇ ³ÝϳÛáõÝ íÇ׳ÏÝ»ñÇ é»É³ùë³óÇáÝ åñáó»ëÁ: гٳå³ï³ëË³Ý Íñ³·ñ³ÛÇÝ ÷³Ã»ÃÝ»ñÁ ݳ˳·Íí»É »Ý OpenMP ¨ CUDA ï»ËÝáÉá·Ç³Ý»ñÇ ÏÇñ³éٳٵ: ´»ñí³Í »Ý ï³ñµ»ñ ã³÷»ñÇ »é³ã³÷ ó³Ýó»ñÇ ½áõ·³Ñ»é³óí³Í ÙṻɳíáñÙ³Ý ³ñ¹ÛáõÝùÝ»ñÁ: Ìåòîä ïàðàëëåëüíîé ñèìóëÿöèè d-ìåðíûõ àâòîìàòîâ Àáåëåâîé ïåñî÷íîé ãîðêè À. Íàãàïåòÿí, Ñ. Ïîãîñÿí, Â. Ïîãîñÿí è Þ. Øóêóðÿí Àííîòàöèÿ  ýòîé ñòàòüå ïðèâåäåíî îáîáùåíèå ïðîáëåìû çâåçäíîãî ïîêðûòèÿ, âûäâèíóòîé â [1] äëÿ êâàäðàòíûõ ðåøåòîê. Îáîáùåíèå äîñòèãíóòî äëÿ d-ìåðíîé ðåøåòêè L d, d 2 N. Ïðîáëåìà ñîñòîèò â ïîêðûòèè L d ðåøåòêè çâåçäíûì ãðàôîì S2d. Íà îñíîâàíèè ðåøåíèÿ äàííîé ïðîáëåìû, ðàçðàáîòàíà ïàðàëëåëèçîâàííàÿ ïðîãðàììà ñèìóëÿöèè d-ìåðíîãî êëåòî÷íîãî àâòîìàòà.  êà÷åñòâå ïðèìåðà êëåòî÷íîãî àâòîìàòà ðàññìîòðåí ïðîöåññ ðåëàêñàöèè íåñòàáèëüíûõ ñîñòîÿíèé Àáåëåâîé ìîäåëè ïåñî÷íîé ãîðêè. Ñîîòâåòñòâóþùèå ïðîãðàììíûå ïàêåòû ðàçðàáîòàíû ñ èñïîëüçîâàíèåì òåõíîëîãèé OpenMP è CUDA. Ïðåäñòàâëåíû ðåçóëüòàòû ïàðàëëåëüíîé ñèìóëÿöèè, ïðîâåäåííîé äëÿ 3-ìåðíîé ðåøåòêè ðàçëè÷íûõ ðàçìåðîâ. MPCS.pdf (p.1-8) Vahagn_abstract.pdf (p.9)