ISDS Annual Conference Proceedings 2012. This is an Open Access article distributed under the terms of the Creative Commons Attribution-Noncommercial 3.0 Unported License (http://creativecommons.org/licenses/by-nc/3.0/), permitting all non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ISDS 2012 Conference Abstracts

Using the Flow of People in Cluster Detection and Inference

Sabino J. Ferreira*1, Francisco S. Oliveira1, Ricardo Tavares2 and Flavio R. Moura2

1Federal University of Minas Gerais, Belo Horizonte, Brazil; 2Federal University of Ouro Preto, Ouro Preto, Brazil

Objective
We present a new approach to the circular scan method [1] that uses the flow of people to detect and infer clusters of regions with a high incidence of some event randomly distributed on a map. We use a real database of homicide cases in Minas Gerais state, in southeastern Brazil, to compare our proposed method with the original circular scan method, both on simulated clusters and on the real data.

Introduction
The traditional SaTScan algorithm [1],[2] uses the Euclidean distance between the centroids of the regions in a map to assemble connected sets of regions (connected in the sense that two connected regions share a physical border). Depending on the value of its logarithm of the likelihood ratio (LLR), a connected set of regions can be classified as a statistically significant detected cluster. For events such as contagious diseases or homicides, we propose using the flow of people between two regions to build a set of regions (zone) with a high incidence of cases of the event. In this sense, the greater the flow of people between two regions, the closer they are. In a cluster formed according to this criterion of proximity based on the flow of people, the regions are not necessarily connected to each other.
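For reference, one common form of the LLR can be written as below. It assumes the Poisson model of the spatial scan statistic [1]; the abstract does not state which probability model was used, so this is an illustration and not necessarily the authors' exact formulation. The symbols are introduced here for illustration only: c_z and n_z are the observed cases and the population at risk in zone z, C and N are the corresponding totals for the whole map, and \mu_z is the number of cases expected in z under the null hypothesis of constant risk.

```latex
\[
\mathrm{LLR}(z) =
\begin{cases}
c_z \ln\dfrac{c_z}{\mu_z} + (C - c_z)\,\ln\dfrac{C - c_z}{C - \mu_z}, & c_z > \mu_z,\\[2ex]
0, & \text{otherwise,}
\end{cases}
\qquad
\mu_z = C\,\frac{n_z}{N}.
\]
```

Under this formulation, the zone with the largest LLR is the most likely cluster, and its significance is usually assessed by Monte Carlo replication under the null hypothesis.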
Methods
We consider a study map with a number of observed cases and a population at risk for each region. The original circular scan algorithm randomly chooses one region as the first zone and calculates its LLR. In the next step, a new zone is created that includes the first region and the region closest to it according to the Euclidean distance between their centroids, and the LLR of the new zone is calculated. This process is repeated until the zone population exceeds a certain percentage of the total population of the map. Our spatial flow scan algorithm works in the same manner, except that the degree of proximity of two regions is given by the flow of people between them: the higher the flow between two regions, the closer they are considered to be. Instead of adding regions in order of increasing distance to create each new zone, our algorithm adds them in order of decreasing flow of people. In this way we can obtain a candidate zone/cluster composed of regions that are not necessarily connected.

Results
Minas Gerais state is located in Brazil's southeastern region and is composed of 853 municipalities (regions), with an estimated population of 19,150,344 in 2005. All data were obtained from the Brazilian Ministry of Health (WWW.DATASUS.GOV.BR) and the Brazilian Institute of Geography and Statistics (WWW.IBGE.GOV.BR). In the period from 2003 to 2008, 20,912 homicides were recorded, at a rate of 22 cases per 100,000. To measure the flow of people between the cities, we obtained data on round-trip bus journeys between all 853 Minas Gerais municipalities from the state department of highways (www.der.mg.gov.br). Because a large number of pairs of cities have zero bus trips between them, we use a gravity model [3] to estimate the flow of people. We use 30% of the total population as the upper limit for a zone population.
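The flow-ordered zone construction described in Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `poisson_llr` and `flow_scan`, the Poisson form of the LLR, the use of every region in turn as the starting region, and the toy data are all introduced here. The flow matrix is taken as given (for example, gravity-model estimates). The toy example uses a 50% population cap only so that its six equal-sized regions can form multi-region zones; the study itself uses 30%.

```python
import math


def poisson_llr(cases_in, pop_in, total_cases, total_pop):
    """Poisson-model LLR of one zone (assumed form); 0 if the zone has no excess."""
    expected = total_cases * pop_in / total_pop
    if cases_in <= expected:
        return 0.0
    llr = cases_in * math.log(cases_in / expected)
    cases_out = total_cases - cases_in
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / (total_cases - expected))
    return llr


def flow_scan(cases, pops, flow, max_pop_frac=0.30):
    """Return (best_llr, best_zone) over all zones grown by decreasing flow.

    flow[i][j] is the (estimated) flow of people between regions i and j.
    """
    n = len(cases)
    total_cases, total_pop = sum(cases), sum(pops)
    best_llr, best_zone = 0.0, []
    for seed in range(n):  # assumption: every region is tried as the first zone
        # Remaining regions ordered by decreasing flow to the seed region.
        others = sorted((j for j in range(n) if j != seed),
                        key=lambda j: -flow[seed][j])
        zone, zone_cases, zone_pop = [], 0, 0
        for j in [seed] + others:
            # Stop growing once the population cap would be exceeded.
            if zone_pop + pops[j] > max_pop_frac * total_pop:
                break
            zone.append(j)
            zone_cases += cases[j]
            zone_pop += pops[j]
            llr = poisson_llr(zone_cases, zone_pop, total_cases, total_pop)
            if llr > best_llr:
                best_llr, best_zone = llr, list(zone)
    return best_llr, best_zone


# Toy data (hypothetical): regions 0 and 3 have excess cases and a strong
# mutual flow, although nothing requires them to share a border.
cases = [30, 5, 5, 28, 5, 5]
pops = [1000] * 6
flow = [
    [0, 2, 1, 9, 1, 1],
    [2, 0, 3, 1, 1, 1],
    [1, 3, 0, 2, 1, 1],
    [9, 1, 2, 0, 3, 1],
    [1, 1, 1, 3, 0, 2],
    [1, 1, 1, 1, 2, 0],
]
best_llr, best_zone = flow_scan(cases, pops, flow, max_pop_frac=0.5)
print(best_zone, round(best_llr, 2))
```

Replacing the flow-based ordering with an ordering by increasing centroid distance would give the corresponding circular scan, so the two methods differ only in how candidate zones are enumerated.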
With the real homicide data, the original circular scan found a significant cluster containing the city of Belo Horizonte, which is the Minas Gerais state capital, and its large urban area: Belo Horizonte and 22 more cities, with a total population of about 3.5 million people. Our adapted spatial scan algorithm found a similar cluster, also including the capital Belo Horizonte, but with two fewer small cities.

Conclusions
In simulation studies where the real cluster is known, we observe that our spatial flow scan algorithm performs similarly to the circular scan in terms of detection power, and slightly worse in terms of positive predictive value (PPV) and sensitivity when the real cluster is regular. However, our algorithm performs clearly better in terms of sensitivity and PPV when the real cluster is irregular and/or non-connected.

Keywords
Spatial scan statistics; flow of people; spatial flow scan algorithm; gravity models

Acknowledgments
SJF acknowledges the support by Fapemig, MG, Brazil.

References
[1] Kulldorff M. A Spatial Scan Statistic. Comm. Statist. Theory Meth., 1997, 26(6), 1481-1496.
[2] Kulldorff M. SaTScan: Software for the spatial, temporal and space-time scan statistics. [www.satscan.org].
[3] Signorino G.; Pasetto R.; Gatto E.; Mucciardi M.; La Rocca M.; Muso P. Gravity models to classify commuting vs. resident workers. An application to the analysis of residential risk in a contaminated area. Int. J. of Health Geographics, 2011, 10:11, pp. 1-10.

*Sabino J. Ferreira
E-mail: sabjfn@gmail.com

Online Journal of Public Health Informatics * ISSN 1947-2579 * http://ojphi.org * 5(1):e12, 2013