key: cord-017590-w5copp1z
authors: Fresnadillo, María J.; García, Enrique; García, José E.; Martín, Ángel; Rodríguez, Gerardo
title: A SIS epidemiological model based on cellular automata on graphs
date: 2009
journal: Distributed Computing, Artificial Intelligence, Bioinformatics, Soft Computing, and Ambient Assisted Living
doi: 10.1007/978-3-642-02481-8_160
sha: doc_id: 17590 cord_uid: w5copp1z

The main goal of this work is to introduce a new SIS epidemic model based on a particular type of finite state machine called cellular automata on graphs. The state of each cell stands for the fraction of susceptible and infected individuals of the cell at a particular time step, and the evolution of these classes is given in terms of a local transition function.

Public health issues are of great importance in our society, particularly viral spread through populated areas. An epidemic is a disease that spreads rapidly and extensively by infection, affecting many individuals in an area at the same time. In this regard, the most recent worrying epidemic was the severe acute respiratory syndrome (SARS) outbreak in Asia. Infectious disease accounts for 29 of 96 major causes of human morbidity and mortality listed by the World Health Organization and the World Bank, and for 25% of global deaths (over 14 million deaths annually). Consequently, since the publication of the first modern mathematical epidemic models in the early years of the twentieth century (see [6, 9]), several mathematical models to study the dynamics of epidemics have appeared in the literature. Traditionally, mathematical models are based on differential equations. Nevertheless, this approach has some drawbacks: it does not take into account spatial factors such as population density, it neglects the local character of the spreading process, it does not include variable susceptibility of individuals, etc.
As a consequence, this can lead to very unrealistic results, such as endemic patterns relying on very small densities of individuals, the so-called "atto-foxes" or "nano-hawks" (see [8]). Other mathematical models are based on a particular type of discrete dynamical system called cellular automata (see, for example, [2, 5, 7, 10, 11]). These simple models of computation eliminate the aforementioned shortcomings and are especially suitable for computer simulations. Roughly speaking, cellular automata (CA for short) are a special type of finite state machine capable of simulating physical, biological or environmental complex phenomena. Consequently, several models based on such mathematical objects have been proposed to simulate growth processes, reaction-diffusion systems, self-reproduction models, epidemic models, forest fire spreading, image processing algorithms, cryptographic protocols, etc. (see, for example, [12, 13]). Specifically, a two-dimensional CA is formed by a two-dimensional array of identical objects called cells, which can be arranged in a rectangular, triangular or hexagonal lattice (called the cellular space). These cells are endowed with a state that changes in discrete time steps according to a specific rule. As the CA evolves, the update function (whose variables are the states of the neighbouring cells) determines how local interactions influence the global behaviour of the system. Usually, mathematical models to study epidemic spreading are divided into three types: SIS models, SIR models and SEIR models, depending on the classes into which the population can be classified. The model introduced in this paper deals with SIS epidemic diseases (for example, the group of those responsible for the common cold); that is, the population is divided into susceptible individuals (S) and infected individuals (I).
The susceptible individuals are those capable of contracting the disease, whereas the infected individuals are those capable of spreading it. In a SIS model, infected individuals return to the susceptible class on recovery because the disease confers no immunity against reinfection. Moreover, some assumptions are common to all models: (1) the disease is transmitted by contact between an infected individual and a susceptible individual; (2) there is no latent period for the disease, hence the disease is transmitted instantaneously upon contact; (3) all susceptible individuals are equally susceptible and all infected individuals are equally infectious; (4) the population under consideration is fixed in size, which means that no births or migration occur, and no deaths are taken into account. The main goal of this work is to introduce a new SIS model to simulate the spread of a general epidemic based on cellular automata on graphs. Specifically, in the proposed model, the state of each cell stands for the fraction of susceptible and infected individuals of the cell at a particular time step. The local transition function involves the states of the neighbouring cells and other parameters such as the virulence of the epidemic, the rate of recovery of infected individuals, etc. Moreover, as mentioned above, the standard paradigm for cellular automata states that the topology of the cellular space is given in terms of a regular rectangular or hexagonal lattice. Nevertheless, in this paper we consider a more efficient topology to model an epidemic disease, given by an undirected graph whose nodes stand for the cells of the cellular automaton. There are several CA-based algorithms to simulate a SIS epidemic model (see, for example, [1, 3, 4]). The standard paradigm of these models states that each cell stands for a single individual.
Unfortunately, there are few models considering more than one individual in each cell (see, for example, [5]). We think that this new paradigm is more accurate than the other one for obtaining more realistic simulations. The main advantages of the model presented in this paper over the model introduced in [5] are the use of a graph topology and a more realistic transition function involving new parameters, such as the fraction of susceptible individuals that moves from one cell to another. The rest of the paper is organized as follows: in Section 2 the basic theory of cellular automata on graphs is provided; the proposed model is introduced in Section 3; the analysis of the model is shown in Section 4; and, finally, the conclusions and future work are presented in Section 5.

A graph is a pair G = (V, E), where V = {v_1, ..., v_n} is an ordered non-empty finite set of elements called nodes (or vertices), and E is a finite family of pairs of elements of V called edges. Two nodes of the graph, v_i, v_j ∈ V, are said to be adjacent (or neighbours) if there exists an edge in E of the form (v_i, v_j). We consider undirected simple graphs; that is, no two edges of G have the same ends, and no loops exist, i.e. edges whose start and end are located at the same node. The neighbourhood of a node v ∈ V, N_v, is the set of all nodes of G which are adjacent to v, that is: N_v = {u ∈ V : (u, v) ∈ E}. The degree of a node v, d_v, is the number of its neighbours. A cellular automaton on an undirected graph G = (V, E) is a 4-tuple A = (V, S, N, f), where: the set V defines the cellular space of the CA, such that each node stands for a cell of the cellular automaton; S is the finite set of states that can be assumed by the nodes at each time step, where the state of the node v at time step t is denoted by s_v^t ∈ S and these states change according to the local transition function f; and N is the neighbourhood function which assigns to each node its neighbourhood, that is: N(v) = N_v. Note that the neighbourhoods of the nodes are, in general, different from one another.
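The 4-tuple A = (V, S, N, f) just described can be sketched in code. The class below is an illustrative assumption, not the paper's implementation: the toy graph, node names and the example "majority" rule are ours, but the structure (cellular space from the node set, neighbourhood function, synchronous local transition) follows the definition above.

```python
class GraphCA:
    """Minimal cellular automaton on an undirected graph, A = (V, S, N, f)."""

    def __init__(self, edges, initial_states, transition):
        # Build the neighbourhood function N from the edge list E.
        self.neighbors = {}
        for u, v in edges:
            self.neighbors.setdefault(u, set()).add(v)
            self.neighbors.setdefault(v, set()).add(u)
        self.states = dict(initial_states)   # current state s_v^t of each node
        self.transition = transition         # local transition function f

    def step(self):
        # All nodes update synchronously from the states at time t.
        self.states = {
            v: self.transition(self.states[v],
                               [self.states[w] for w in self.neighbors[v]])
            for v in self.neighbors
        }

# Example: a "majority" rule on the small path graph a - b - c.
def majority(own, neigh):
    votes = neigh + [own]
    return int(sum(votes) * 2 > len(votes))

ca = GraphCA([("a", "b"), ("b", "c")], {"a": 1, "b": 0, "c": 1}, majority)
ca.step()
```

Note that, as in the definition, the neighbourhoods need not have the same size, which is exactly what distinguishes this setting from a regular lattice CA.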
The local transition function f computes the state of every node at time step t + 1 from the states of its neighbours at the previous time step t. In the mathematical epidemiological model introduced in this paper, the population is divided into two classes: those who are susceptible to the disease and those who are infected with it. Moreover, the population is located at city centres, which stand for the nodes of a graph G. If there is some type of transport connection (by car, train, airplane, etc.) between two of these cities, the associated nodes are connected by an edge. The following assumptions are also made: 1. The population of each node remains constant over time; that is, no births or deaths are taken into account (it is a SIS model without vital dynamics). Moreover, the population distribution is inhomogeneous: let P_u be the number of individuals of the node u ∈ V, and set P = max{P_u : u ∈ V}. 2. The transmission of the disease (that is, the passing of the disease from an infected individual to a susceptible individual) is through direct physical contact: touching an infected person, including sexual contact. 3. The population (susceptible and infected people) is able to move from its node to another one and return to the origin node at every time step. Since the model introduced in this work is a SIS model, the state of the node u ∈ V at time step t is s_u^t = (S_u^t, I_u^t) ∈ Q × Q = S, where S_u^t ∈ [0, 1] stands for the fraction of susceptible individuals of the node u at time t, and I_u^t ∈ [0, 1] stands for the fraction of infected individuals of the node u at time t. Consequently, the transition function of the CA involves a suitable discretization function D. The ground where the epidemic is spreading is modelled as a weighted graph where each node stands for a city or a town, and the arc between two nodes represents the connection between the corresponding cities.
In this sense, the connection factor between the nodes u and v is the weight associated to the arc (u, v) ∈ E, denoted by w_uv. It depends on the transportation capacity of the public and non-public transport, and is defined in terms of h_uv, the total amount of population which moves from u to v during a time step. The evolution of the number of infected individuals of the node u ∈ V is as follows: the infected individuals of u at a time step are given by the sum of: 1. The infected individuals at the previous time step which have not recovered from the disease; here we have to take into account the recovery rate r ∈ [0, 1]. 2. The susceptible individuals which have been infected during the time step. These newly sick individuals of u can be infected either by the infected individuals of u or by the infected individuals of the neighbouring nodes of u which have moved to u during the time step. In the first case, only the rate of transmission, p ∈ [0, 1], is involved, whereas in the second case we have to consider the connection factors between the nodes, and the population and movement factor of each node. Moreover, we also consider the susceptible individuals of u that moved to a neighbouring node during the time step and were infected there by its infected individuals; in this case η_u ∈ [0, 1] yields the fraction of susceptible individuals moved from u to its neighbouring nodes. These terms together yield the mean-field equation for infected individuals. On the other hand, the susceptible individuals of each node are given by the difference between the susceptible individuals of the node at the previous time step and the susceptible individuals which have been infected, as mentioned above. Note that, as a simple calculation shows, the resulting values lie in [0, 1]; a discretization function D : [0, 1] → Q must then be used in order to get a finite state set. In our case, the discretization function used is D(x) = [100x]/100, where [m] stands for the nearest integer to m.
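The discretization function D, and the shape of the update it discretizes, can be sketched as follows. The paper's full mean-field equation (with connection factors w_uv and movement fractions η_u) is not reproduced in this extract, so `sis_step` below is a deliberately simplified single-node assumption that keeps only the two ingredients named above: new infections driven by the transmission rate p and recoveries driven by the recovery rate r.

```python
def D(x):
    # Discretization into Q = {0, 0.01, ..., 1}: D(x) = [100x]/100,
    # where [m] is the nearest integer to m.
    return round(x * 100) / 100

def sis_step(s, i, p, r):
    # Schematic single-node SIS update (assumption: isolated node, no
    # movement or connection-factor terms from the full model):
    #   infections  p * s * i   (susceptibles infected by local contact)
    #   recoveries  r * i       (infected returning to the S class)
    i_next = D(i - r * i + p * s * i)
    s_next = D(1 - i_next)      # the population fraction is conserved
    return s_next, i_next

print(sis_step(0.9, 0.1, 0.2, 0.5))   # -> (0.93, 0.07)
```

Because D rounds to the nearest hundredth, the state set Q stays finite, which is exactly what makes the construction a finite state machine.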
As a consequence, Q = {0, 0.01, ..., 1}, and the system of equations above governs the evolution of the two classes of population. One of the most important questions in a mathematical epidemiological model is the possibility of eradication of the disease: for every such model, it is very important to determine under what circumstances the epidemic occurs. Taking into account the intrinsic characteristics of our model, we demand two conditions: (1) the epidemic disease must spread among the nodes of the graph; and (2) the infected population must grow. The initial conditions of the study are the following: at time step t = 0, we consider only one node, say u ∈ V, with infected individuals. First of all, we show the necessary condition for epidemic spreading from the node u to a neighbour v ∈ N_u at the next time step t = 1. Since the unique node with infected population at time t = 0 is u, then, taking (1) into account, a condition on the parameters follows. This condition must hold for every neighbour node of u, so the following result holds. Theorem: the epidemic disease spreads from node u to its neighbour nodes if the corresponding condition holds. Now we study what conditions must hold for the infected population of a node u to grow. We have to distinguish two cases: (1) no infected individuals arrive at u from neighbouring nodes; (2) such infected individuals do arrive. In the first case the requirement is I_u^{t+1} > I_u^t, which yields one growth condition; in the second case, the same inequality I_u^{t+1} > I_u^t yields a different condition. In the following example, for the sake of simplicity, we suppose that the epidemic is spreading over n = 10 cities, v_1, ..., v_10, forming a complete graph K_10, with an initial configuration in which there is only one node at time t = 0 with infected population.
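This homogeneous, fully connected example can be simulated schematically. The update rule below is a simplified stand-in for the paper's full transition function (an assumption, since the exact mean-field equation is not reproduced in this extract), but it uses the parameter values quoted in the example: p = 0.25, r = 0.8, η = 0.2, with one seeded node and discretization into Q.

```python
def simulate(n=10, p=0.25, r=0.8, eta=0.2, steps=50):
    # Complete graph K_n, identical nodes, one node seeded with infection.
    i = [0.0] * n
    i[0] = 0.5
    for _ in range(steps):
        nxt = []
        for u in range(n):
            # Mean infected fraction among the n - 1 neighbours of u.
            neigh = sum(i[v] for v in range(n) if v != u) / (n - 1)
            # Survivors of recovery + new infections (local and via the
            # eta-fraction of susceptibles exposed to neighbours).
            new_i = i[u] * (1 - r) + (1 - i[u]) * p * (i[u] + eta * neigh)
            nxt.append(round(new_i * 100) / 100)   # discretize into Q
        i = nxt
    return i
```

By symmetry, all nine non-seeded nodes follow identical trajectories, mirroring the homogeneous-symmetric character of the example.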
Moreover, the parameters used are p = 0.25, r = 0.8, and η_{v_i} = 0.2 for 1 ≤ i ≤ 10. Let us also suppose that the population of each node is the same, P_{v_i} = 100 for 1 ≤ i ≤ 10, and that the transport capacity between any two nodes is the same, w_{v_i v_j} = 1 for 1 ≤ i, j ≤ 10. Note that this example deals with a homogeneous-symmetric case. In Figure 1 the evolution of the total number of infected and susceptible individuals is shown. If we set p = 0.15 instead of p = 0.25, the numbers of infected and susceptible individuals also remain constant over time, but in this case the number of susceptible individuals is greater than the number of infected individuals.

Fig. 1. Evolution of the total number of infected and susceptible individuals.

In this work a new mathematical model to simulate the spreading of an epidemic has been introduced. It is based on the use of cellular automata on graphs endowed with a suitable local transition function. The state of each cell is considered to be the fraction of its population which is infected at each time step. The analysis of the model proposed in this paper seems to be in agreement with the results obtained for other mathematical models not based on discrete event systems, such as ODEs or PDEs. Future work is aimed at designing a more complete CA-based epidemic model involving additional effects such as population movement, virus mutation, etc. Furthermore, it is also interesting to consider non-constant connection factors, and the effect of migration between the cells must also be considered.
References:
- On some applications of cellular automata
- A simple cellular automaton model for influenza A viral infections
- Critical behaviour of a probabilistic automata network SIS model for the spread of an infectious disease in a population of moving individuals
- Cellular automata and epidemiological models with spatial dependence
- A model based on cellular automata to simulate epidemic diseases
- Contributions to the mathematical theory of epidemics, part I
- A cellular automata model for citrus variegated chlorosis
- The dependence of epidemic and population velocities on basic parameters
- The prevention of malaria
- Extending the SIR epidemic model
- A cellular automaton model for the effects of population movement and vaccination on epidemic propagation
- Cellular automata machines: a new environment for modeling
- A new kind of science

Acknowledgments: This work has been partially supported by Consejería de Sanidad, Junta de Castilla y León (Spain).

key: cord-016196-ub4mgqxb
authors: Wang, Cheng; Zhang, Qing; Gan, Jianping
title: Study on efficient complex network model
date: 2012-11-20
journal: Proceedings of the 2nd International Conference on Green Communications and Networks 2012 (GCN 2012): Volume 5
doi: 10.1007/978-3-642-35398-7_20
sha: doc_id: 16196 cord_uid: ub4mgqxb

This paper systematically summarizes the relevant research on complex networks in terms of statistical properties, structural models, and dynamical behavior. Moreover, it emphatically introduces the application of complex networks in the economic system.

… transportation networks, and so on, are of the same kind [2]. The research approach of complex networks emphasizes the structure of the system and analyzes the system from that structure. The difference is that the topological structure of the abstracted real networks differs from the networks discussed before and has numerous nodes; as a result, we call them complex networks [3].
In recent years, a large number of articles have been published in leading journals such as Science, Nature, PRL, and PNAS, which indirectly reflects that complex networks have become a new research hotspot. Research on complex networks can be summarized under three closely related headings: measuring the statistical properties of empirical networks; understanding why networks have the statistical properties they have by building corresponding network models; and forecasting the behavior of the network system based on the structure and formation rules of the network. Describing the world in terms of networks started in 1736, when the mathematician Euler solved the Königsberg seven-bridges problem. What distinguishes complex network research is that the massive number of nodes and the properties they have must first be viewed from the standpoint of statistics. Differences in these properties mean different internal structures of the network; moreover, different internal structures bring about differences in the system's function. Therefore, the first step of research on complex networks is the description and understanding of the statistical properties, sketched as follows. In network research, we generally define the distance between two nodes as the number of edges on the shortest path connecting them; the diameter of the network as the maximum distance between any two nodes; and the average path length of the network as the average of the distances over all pairs of nodes, which represents the degree of separation of the nodes, namely the "size" of the network. An important discovery in complex network research is that the average path length of most large-scale real networks is much smaller than one might imagine, which we call the "small-world effect".
This viewpoint comes from Milgram's famous small-world experiment. The experiment required participants to forward a letter through their acquaintances so as to make it reach a designated recipient, in order to figure out the distribution of path lengths in the social network; the result showed that the average number of intermediaries was just six. The experiment is also the origin of the popular notion of "six degrees of separation". The aggregation extent of the nodes in the network is represented by the clustering (convergence) factor C, i.e., how clustered the network is. For example, in social networks, your friend's friend may be your friend, or two of your friends may themselves be friends. The computation is as follows: assume node i connects to k_i other nodes through k_i edges. If those k_i nodes were all connected to each other, there would be k_i(k_i − 1)/2 edges among them; if the k_i nodes actually have E_i edges among them, then the ratio of E_i to k_i(k_i − 1)/2 is the clustering factor of node i. The clustering factor of the network is the average of the clustering factors of all nodes. Obviously, only in a fully connected network does the clustering factor equal 1; in most other networks it is less than 1. However, it turns out that nodes in most large-scale real-world networks tend to cluster together: although the clustering factor C is far less than 1, it is far greater than N^(−1). The degree k_i of node i, in graph-theoretic terms, is the total number of edges incident to node i; the average of the degrees over all nodes is called the average degree of the network, denoted ⟨k⟩. The degree distribution of the network is represented by a distribution function P(k), which gives the probability that a randomly chosen node has exactly k edges; it also equals the number of nodes with degree k divided by the total number of nodes in the network.
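The clustering factor defined above, C_i = E_i / (k_i(k_i − 1)/2) averaged over all nodes, can be computed directly from an adjacency structure. The example graph below (a triangle plus a pendant node) is an illustrative assumption.

```python
def clustering(adj):
    """Average clustering factor of a graph given as {node: set_of_neighbours}."""
    total = 0.0
    for i, neigh in adj.items():
        k = len(neigh)
        if k < 2:
            continue  # C_i = 0 when node i has fewer than two neighbours
        # E_i: number of edges actually present among the neighbours of i.
        links = sum(1 for a in neigh for b in neigh if a < b and b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)

# Example: triangle 0-1-2 plus pendant node 3 attached to node 1.
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
print(clustering(adj))   # (1 + 1/3 + 1 + 0) / 4 = 7/12
```

In a complete graph every C_i is 1, so the network clustering factor is 1, matching the remark above that only fully connected networks reach C = 1.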
The statistical properties described above are the foundation of complex network research; with further research, other important statistical properties of real-world networks are generally discovered, such as network resilience, betweenness, and the correlations between degree and clustering factor. The simplest network model is the regular network, characterized by every node having the same number of neighbours — for example, the 1-D chain, the 2-D lattice, the complete graph, and so on. Paul Erdős and Alfréd Rényi introduced a completely random network model in the late 1950s: in a graph of N nodes, any two nodes are connected with probability p. Its average degree is ⟨k⟩ = p(N − 1) ≈ pN; the average path length is L ∼ ln N / ln⟨k⟩; the clustering factor is C = p; and when N is very large, the distribution of node degrees is approximately a Poisson distribution. The random network model was a significant achievement in network research, but it can hardly describe the actual properties of the real world, and many new models have been proposed by others. As experiments show, most real-world networks have the small-world property (a small shortest path length) and aggregation (a large clustering factor). However, the regular network has aggregation but a large average shortest path length, while the random graph has the opposite properties: the small-world property but a small clustering factor. So regular networks and random networks cannot reflect the properties of the real world, which shows that real-world networks are neither completely regular nor completely random. Watts and Strogatz proposed, in 1998, a network which exhibits both the small-world property and high aggregation — a great breakthrough in complex network research.
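The Erdős–Rényi G(N, p) model described above is easy to generate and check against its predicted average degree ⟨k⟩ = p(N − 1). The parameter values and fixed seed below are illustrative assumptions.

```python
import random

def er_graph(n, p, seed=0):
    """Erdos-Renyi G(n, p): each of the n(n-1)/2 possible edges is present
    independently with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

g = er_graph(1000, 0.01)
avg_deg = sum(len(nb) for nb in g.values()) / len(g)
# avg_deg should be close to the predicted <k> = p(N - 1) = 9.99
```

For large N the empirical degree histogram of such a graph approaches the Poisson distribution with mean pN, as stated above.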
They rewired each edge of a regular lattice to a randomly chosen node with probability p, through which they built a network between the regular network and the random network (the WS network for short). It has a small average path length and a large clustering factor, while the regular network and the random network are the special cases p = 0 and p = 1 of the WS network. After the WS model was put forward, many scholars made further modifications based on it; the NW small-world model proposed by Newman and Watts has the most extensive use. The difference between the NW model and the WS model is that the NW model adds connections between randomly chosen pairs of nodes instead of cutting off the original edges of the regular network. The advantage of the NW model is that it simplifies theoretical analysis, since the WS model may produce isolated nodes, which the NW model cannot. In fact, when p is small and N is large, the theoretical results of the two models coincide; we now call them collectively the small-world model. Although the small-world model describes the small-world property and high aggregation of the real world well, theoretical analysis reveals that its degree distribution is still of exponential form. As empirical results show, it is more accurate to describe most large-scale real-world networks by a power law, namely P(k) ∼ k^(−γ). Compared with an exponential distribution, a power law has no peak: most nodes have few connections, while a few nodes have a great many, and there is no characteristic scale as in random networks — so Barabási and others call networks whose degree distribution has this power-law character scale-free networks.
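The NW construction described above — keep the ring lattice and add shortcuts rather than rewire — can be sketched as follows. The function below is a schematic assumption (parameter values, seed and the exact shortcut-sampling scheme are ours); it keeps the NW property that original edges are never removed, so no node can become isolated.

```python
import random

def nw_small_world(n, k, p, seed=0):
    """NW-style small world: ring lattice (each node linked to its k nearest
    neighbours) plus random shortcuts added with probability p per
    half-edge, never removing the lattice edges."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):                        # ring lattice
        for j in range(1, k // 2 + 1):
            w = (v + j) % n
            adj[v].add(w)
            adj[w].add(v)
    for v in range(n):                        # candidate shortcuts
        for _ in range(k // 2):
            if rng.random() < p:
                w = rng.randrange(n)
                if w != v:
                    adj[v].add(w)
                    adj[w].add(v)
    return adj

g = nw_small_world(100, 4, 0.1)
```

With p = 0 this reduces to the pure ring lattice (large path length, high clustering); even a small p adds enough shortcuts to collapse the average path length, which is the small-world mechanism described above.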
To explain the formation of scale-free networks, Barabási and Albert proposed the famous BA model. They argued that earlier network models failed to consider two important properties of the real world — growth and preferential attachment: the former means that new nodes constantly enter the network; the latter means that, on arrival, new nodes prefer to connect to nodes with large degree. They not only performed simulation analysis of the BA model's generating algorithm, but also gave an analytic solution using the mean-field method of statistical physics. The result: after enough evolution time, the degree distribution of the BA network no longer changes with time — it is steadily a power law with exponent 3. The BA model is another great breakthrough in complex network research, demonstrating our deepening understanding of the objective network world. After that, many scholars made improvements to the model, such as nonlinear preferential attachment, accelerated growth, local rewiring events, aging, adaptability and competition, and so on. Note that most, but not all, real-world networks are scale-free, for the degree distributions of some real-world networks are truncated forms of the power law. Scholars have also proposed other network models — such as the local-world evolution model, weighted evolving network models, and deterministic network models — to describe the structure of real-world networks, besides the small-world model and the scale-free network. The study of network structure is important, but the ultimate purpose is to understand and explain, on the basis of these networks, the modus operandi of the system, and then to forecast and control the behavior of the network system.
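The two BA ingredients — growth and degree-proportional attachment — can be sketched with the standard "repeated nodes" trick: keeping a list with one entry per unit of degree makes a uniform draw from it a degree-proportional draw. The parameters and seed below are illustrative assumptions, not values from the text.

```python
import random

def ba_graph(n, m, seed=0):
    """Barabasi-Albert growth: each new node attaches (up to) m edges to
    existing nodes chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    targets = list(range(m))      # the m seed nodes receive the first edges
    repeated = []                 # one entry per endpoint => degree-weighted
    for v in range(m, n):         # growth: nodes arrive one at a time
        for w in set(targets):    # dedupe repeated draws of the same target
            adj[v].add(w)
            adj[w].add(v)
            repeated += [v, w]
        # Preferential attachment: uniform choice from the repeated-node
        # list is a degree-proportional choice among existing nodes.
        targets = [rng.choice(repeated) for _ in range(m)]
    return adj

g = ba_graph(500, 2)
```

Early nodes accumulate degree and are therefore chosen ever more often — the rich-get-richer mechanism behind the stationary power-law degree distribution with exponent 3 mentioned above.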
Such network-based system dynamics is generally called dynamical behavior; it involves many topics, such as network transport, synchronization, phase transitions, web search, and network navigation. The research above is strongly theoretical; research on network behaviors with strong applications has increasingly aroused interest — for example, the spread of computer viruses on computer networks, the spread of communicable diseases among crowds, the spread of rumours in society, and so on. All of these are propagation behaviors that obey certain rules and spread on certain networks. Traditional network propagation models were always built on regular networks; with the further development of complex network research, we have to revisit the issue. Here we emphatically introduce the applied research. One of the foremost purposes of research on network propagation behavior is to understand the transmission mechanism of disease. Represent each unit that can be infected as a node; if one unit can infect another, or vice versa, through some kind of contact, we regard the two nodes as connected. In this way we obtain the topological structure of the propagation network, on which a propagation model can in turn be built to study the propagation behavior. Obviously, the keys to studying network propagation models are the formulation of the propagation rules and the choice of the network topology. However, simply treating the disease contact network as a regular, uniformly connected network does not conform to reality.
Moore studied disease propagation in small-world networks, discovering that the epidemic threshold in a small-world network is much smaller than in a regular network: for the same transmission rate and the same elapsed time, the propagation scope of a disease in the small-world network is significantly greater than in the regular network. That is to say, compared with a regular network, disease spreads more easily in the small world. Pastor-Satorras and others studied propagation behavior in scale-free networks, and the result is striking: while there is always a positive epidemic threshold in both regular and small-world networks, the threshold in scale-free networks proves to be 0. Since, as many experiments show, real-world networks are both small-world and scale-free, this conclusion is quite discouraging. Fortunately, most viruses, whether biological or computational, have low infectiousness (k = 1), doing little harm. However, once the intensity of a disease or virus reaches a certain degree, we have to pay sufficient attention: the measures to control it cannot rely entirely on improved medical conditions; we have to quarantine nodes and shut down the relevant connections in order to cut off the avenues of infection, thereby changing the topological structure of the propagation network. In fact, it was in just this way that the fight against SARS was won in the summer of 2003 in our country. Studying the transmission mechanism of disease is not the whole question; the ultimate goal is to master how to control disease propagation efficiently. In practical applications, however, it is hard to determine the number of nodes — namely, the number of units that may come into contact with other nodes during the infectious period.
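The vanishing threshold on scale-free networks described above has a compact mean-field expression: for SIS dynamics on an uncorrelated network, Pastor-Satorras and Vespignani's result gives the epidemic threshold as λ_c = ⟨k⟩/⟨k²⟩, so heavy-tailed degree distributions inflate ⟨k²⟩ and drive the threshold toward 0. The degree sequences below are illustrative assumptions.

```python
def epidemic_threshold(degrees):
    """Mean-field SIS epidemic threshold lambda_c = <k> / <k^2>
    for an uncorrelated network with the given degree sequence."""
    k1 = sum(degrees) / len(degrees)
    k2 = sum(d * d for d in degrees) / len(degrees)
    return k1 / k2

# Homogeneous network (every node has degree 4): threshold = 4/16 = 0.25.
print(epidemic_threshold([4] * 100))
# Swapping in a handful of hubs drives the threshold far below 0.25,
# illustrating why scale-free networks have effectively no threshold.
print(epidemic_threshold([4] * 95 + [200] * 5))
```

This is also why the quarantine strategy mentioned above works: removing hubs and their connections cuts ⟨k²⟩ sharply and restores a usable positive threshold.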
For example, in research on STD spread, researchers obtain information about patients and high-risk groups only through questionnaire surveys and oral questioning, and the replies have little reliability. For that reason, quite a few immunization strategies have been put forward by scholars on this basis, such as acquaintance immunization, natural exposure, and vaccination. Analyzing disease spread is not the only purpose of researching network propagation behavior; many other phenomena can be analyzed through it. For example, we can apply it to research on propagation behavior in social networks. The basic ideas are as follows: first, abstract the topological structure of the social network using complex network theory; then analyze the transmission mechanism according to certain propagation rules; finally, analyze how the propagation can be influenced in various ways. Actually, this kind of work has already started — for example, on the spread of knowledge, the spread of new products through networks, and bank financial risk. These problems are related but different: for the former, the purpose of the research is to promote the spread; for the latter, to prevent it.

References:
- Systems Science. Shanghai Scientific and Technological Educational Publishing House
- Pearson Education
- Statistical Mechanics of Complex Networks
- The Structure and Function of Complex Networks

key: cord-102935-cx3elpb8
authors: Hassani-Pak, Keywan; Singh, Ajit; Brandizi, Marco; Hearnshaw, Joseph; Amberkar, Sandeep; Phillips, Andrew L.; Doonan, John H.; Rawlings, Chris
title: KnetMiner: a comprehensive approach for supporting evidence-based gene discovery and complex trait analysis across species
date: 2020-04-24
journal: bioRxiv
doi: 10.1101/2020.04.02.017004
sha: doc_id: 102935 cord_uid: 102935-cx3elpb8

Generating new ideas and scientific hypotheses is often the result of extensive literature and database reviews, overlaid with scientists' own novel data and a creative process of making connections that were not made before. We have developed a comprehensive approach to guide this technically challenging data integration task and to make knowledge discovery and hypothesis generation easier for plant and crop researchers. KnetMiner can digest large volumes of scientific literature and biological research to find and visualise links between the genetic and biological properties of complex traits and diseases. Here we report the main design principles behind KnetMiner and provide use cases for mining public datasets to identify unknown links between traits such as grain colour and pre-harvest sprouting in Triticum aestivum, as well as an evidence-based approach to identify candidate genes under an Arabidopsis thaliana petal-size QTL. We have developed KnetMiner knowledge graphs and applications for a range of species including plants, crops and pathogens. KnetMiner is the first open-source gene discovery platform that can leverage genome-scale knowledge graphs, generate evidence-based biological networks, and be deployed for any species with a sequenced genome. KnetMiner is available at http://knetminer.org.

… which is prone to information being overlooked and subjective biases being introduced.
even when the task of gathering information is complete, it is demanding to assemble a coherent view of how each piece of evidence might come together to "tell a story" about the biology that can explain how multiple genes might be implicated in a complex trait or disease. new tools are needed to provide scientists with a more fine-grained and connected view of the scientific literature and databases, rather than the conventional information retrieval tools currently at their disposal.

scientists are not alone with these challenges. search systems form a core part of the duties of many professions. studies have highlighted the need for search systems that give confidence to the professional searcher, and therefore trust, explainability and accountability remain significant challenges.

knetminer provides search term suggestions and real-time query feedback. from a search, a user is presented with the following views: gene view is a ranked list of candidate genes along with a summary of related evidence types. map view is a chromosome-based display of qtl, gwas peaks and genes related to the search terms. evidence view is a ranked list of query-related evidence terms and enrichment scores along with linked genes. by selecting one or multiple elements in these three views, the user can get to the network view to explore a gene-centric or evidence-centric knowledge network related to their query and the subsequent selection.

grain colour in wheat was shown early on to be under genetic control (nilsson-ehle, 1914), and the red pigmentation of wheat grain is controlled by r genes on the long arms of chromosomes 3a, 3b and 3d (sears, 1944). the extracted network (figure 3a) is displayed in the network view, which provides interactive features to hide or add specific evidence types from the network. nodes are displayed in a defined set of shapes, colors and sizes to distinguish different types of evidence.
a shadow effect on nodes indicates that more information is available but has been hidden. the auto-generated network, however, is not yet telling a story that is specific to our traits of interest, and is limited to evidence that is phenotypic in nature.

to further refine and extend the search for evidence that links tt2 to grain color and phs, we can provide additional keywords relevant to the traits of interest. seed germination and dormancy are the underlying developmental processes that activate or prevent pre-harvest sprouting in many grains and other seeds. the colour of the grain is known to be determined through accumulation of proanthocyanidin, an intermediate in the flavonoid pathway, found in the seed coat. these terms and phrases can be combined using boolean operators (and, or, not) and used in conjunction with a list of genes. thus, we search for traescs3d02g468400 (tt2) and the keywords: "seed germination" or "seed dormancy" or color or flavonoid or proanthocyanidin. this time, knetminer filters the extracted tt2 knowledge network (823 nodes) down to a smaller subgraph of 68 nodes and 87 relations, in which every path from tt2 to another node corresponds to a line of evidence to phenotype or molecular characteristics based on our keywords of interest (figure 3b).

overall, the exploratory link analysis has generated a potential link between grain color and phs due to the tt2-mft interaction, and suggested a new hypothesis relating two traits (phs and root hair density) that were not part of the initial investigation and previously thought to be unrelated. furthermore, it raises the possibility that tt2 mutants might lead to increased root hairs and to higher nutrient and water absorption, and therefore cause early germination of the grain. more data and experiments will be needed to address this hypothesis and close the knowledge gap.
biologists would generally agree that such graph patterns are informative when studying the function of a gene. searching a kg for such patterns is akin to searching for relevant sentences containing evidence that supports a particular point of view within a book. such evidence paths can be short, e.g. gene a was knocked out and phenotype x was observed; or the evidence path can be longer, e.g. gene a in species x has an ortholog in species y, which was shown to regulate the expression of a disease-related gene (with a link to the paper). in the first example, the relationship between gene and disease is directly evident and experimentally proven, while in the second example the relationship is indirect and less certain, but still biologically meaningful. there are many evidence types that should be considered for evaluating the relevance of a gene to a trait. in a kg context, a gene is considered to be, for example, related to 'early flowering' if any of its biologically plausible graph patterns contain nodes related to 'early flowering'. in this context, the word 'related' does not necessarily mean that the gene in question will have an effect on 'flowering'.

the resulting gcss can contain more information than can reasonably be shown to a user, let alone when combining gcss for tens to hundreds of genes. there is therefore a need to filter and visualise the subset of information in the gcss that is most interesting to a specific user. however, the interestingness of information is subjective and will depend on the biological question or the hypothesis that needs to be tested. a scientist with an interest in disease biology is likely to be interested in links to publications, pathways and annotations related to diseases, while someone studying the biological process of grain filling is likely more interested in links to physiological or anatomical traits. to reduce information overload and visualise the most interesting pieces of information, we have devised two strategies.
1) in the case of a combined gene and keyword search, we use the keywords as a filter to show only paths in the gcs that connect genes with keyword-related nodes, i.e. nodes that contain the given keywords in one of their node properties. in the special case where too many publications remain even after keyword filtering, we select the most recent n publications (default n=20). nodes not matching the keyword are hidden but not removed from the gcs. 2) in the case of a simple gene query (without additional keywords), we initially show all paths between the gene and nodes of type phenotype/trait, i.e. any semantic motif that ends with a trait/phenotype, as this is considered the most important relationship by many knetminer users.

gene ranking. we have developed a simple and fast algorithm to rank genes and their gcs by their importance. we give every node in the kg a weight composed of three components, referred to as sdr, standing for specificity to the gene, distance to the gene and relevance to the search terms. specificity reflects how specific a node is to the gene in question. for example, a publication that is cited (linked) by hundreds of genes receives a smaller weight than a publication which is linked to only one or two genes. we define the specificity of a node x as a decreasing function of n, where n is the frequency with which the node occurs across all n gcss. distance assumes that information associated more closely to a gene can generally be considered more certain than information that is further away, e.g. inferred through homology and other interactions, which increases the uncertainty of annotation propagation. a short semantic motif is therefore given a stronger weight, whereas a long motif receives a weaker weight.
thus, we define the second weight as the inverse shortest-path distance between a gene g and a node x. both weights s and d are not influenced by the search terms and can therefore be pre-computed for every node in the kg. relevance reflects the relevance or importance of a node to user-provided search terms, using the well-established measures of inverse document frequency (idf) and term frequency (tf) (salton & yang, 1973). we define the knetscore of a gene as a sum of these node weights, where the sum considers only gcs nodes that contain the search terms. in the absence of search terms, we sum over all nodes of the gcs with r=1 for each node.

results are first presented in formats familiar to biologists, such as tables and chromosome views, allowing them to explore the data, make choices as to which gene to view, or refine the query if needed. these initial views help users to reach a certain level of confidence with the selection of potential candidate genes. however, they do not tell the biological story that links candidate genes to traits and diseases. in a second step, to enable the stories and their evidence to be investigated in full detail, the network view visualises highly complex information in a concise and connected format, helping facilitate biologically meaningful conclusions. consistent graphical symbols are used for representing evidence types throughout the different views, so that users develop a certain level of familiarity before being exposed to networks with complex interactions and rich content.

scientists spend a considerable amount of time searching for new clues and ideas by synthesizing many different sources of information and using their expertise to generate hypotheses. knetminer is a user-friendly platform for biological knowledge discovery and exploratory data mining.
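the sdr weighting and knetscore just described can be sketched in code. since the exact formulas are not reproduced in this text, the linear specificity decay, the inverse-distance weight, the tf-idf form and the corpus sizes below are assumptions for illustration only:

```python
import math

# illustrative corpus sizes (assumptions, not values from the paper)
N_GENES, N_DOCS = 100, 1000

def specificity(n_x):
    """Decreasing in the frequency n_x of node x across all N_GENES GCSs;
    the text only states the weight decreases with frequency, so the
    linear decay below is an assumed form."""
    return 1.0 - n_x / N_GENES

def distance_weight(d):
    """Inverse shortest-path distance between the gene and the node."""
    return 1.0 / d

def relevance(tf, df):
    """Classic TF-IDF relevance of a node to the search terms."""
    return tf * math.log(N_DOCS / df)

def knetscore(matching_nodes):
    """Sum of S * D * R over the GCS nodes containing the search terms;
    each node is a tuple (n_x, d, tf, df)."""
    return sum(specificity(n) * distance_weight(d) * relevance(tf, df)
               for n, d, tf, df in matching_nodes)
```

under these assumed forms, a node linked by many genes (high n_x) or far from the gene (large d) contributes little to the score, matching the qualitative behaviour described above.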
knetminer allows humans and machines to effectively connect the dots in life science data and literature, search the connected data in an innovative way, and then return the results in an accessible, explorable, yet concise format that can be easily interrogated to generate new insights.

references: discovering protein drug targets using; the monarch initiative: an integrative data and analytic platform connecting phenotypes to genotypes across species; a wheat homolog of mother of ft and tfl1 acts in the regulation of germination; zur kenntnis der mit der keimungsphysiologie des weizens in zusammenhang stehenden inneren faktoren; bioinformatics meets user-centred design: a perspective; meta-analysis of the heritability of human traits based on fifty years of twin studies; information retrieval in the workplace: a comparison of professional search practices; progress in biomedical knowledge discovery: a 25-year; on the specification of term values in automatic indexing; cytogenetic studies with polyploid species of wheat; knowledge graphs and knowledge networks: the story in brief; knetmaps: a biojs component to visualize biological knowledge networks; identification of loci governing eight agronomic traits using a gbs-gwas approach and validation by qtl mapping in soya bean; big data: astronomical or genomical?
sensitivity to "sunk costs" in mice, rats, and humans; shifting the limits in wheat research and breeding using a fully annotated reference genome (iwgsc whole-genome assembly principal investigators, whole-genome sequencing and assembly); trend analysis of knowledge graphs for crop pest and diseases; mother of ft and tfl1 regulates seed germination through a negative feedback loop modulating aba signaling in arabidopsis; use of graph database for the integration; allelic variation and transcriptional isoforms of wheat tamyc1 gene regulating anthocyanin synthesis in pericarp.

the authors declare that they have no competing interests.

key: cord-155475-is3su3ga authors: kalogeratos, argyris; mannelli, stefano sarao title: winning the competition: enhancing counter-contagion in sis-like epidemic processes date: 2020-06-24 journal: nan doi: nan sha: doc_id: 155475 cord_uid: is3su3ga

in this paper we consider the epidemic competition between two generic diffusion processes, where each competing side is represented by a different state of a stochastic process. for this setting, we present the generalized largest reduction in infectious edges (glrie) dynamic resource allocation strategy to advantage the preferred state against the other. motivated by social epidemics, we apply this method to a generic continuous-time sis-like diffusion model where we allow for: i) arbitrary node transition rate functions that describe the dynamics of propagation depending on the network state, and ii) competition between the healthy (positive) and infected (negative) states, which are both diffusive at the same time, yet mutually exclusive on each node. finally, we use simulations to compare the proposed glrie empirically against competitive approaches from the literature.

in recent years, the growing amount of available data on networks has led to a revolution in the application of diffusion processes.
the enrichment of analysis by means of detailed information regarding specific populations has yielded a plethora of realistic and accurate models. through diffusion models it is possible to study disparate branches of knowledge: economics (competition among products [1], viral marketing campaigns [2]), epidemiology (disease spreading, vaccination and immunization problems), computer science (computer viruses, information flow), the social sciences (social behavior [3]) and medicine (obesity diffusion [4], smoking cessation [5], alcohol consumption [6]) are just some instances. a large number of social behaviors can be modeled as states propagating over networks [3, 4, 5, 7]. following the availability of diffusion models, many intervention strategies have been developed, aiming to answer questions like: what are the most dangerous computers in a network? how can customer awareness of a product be maximized? on which individuals is it better to focus in order to win a poll? few studies have proposed strategies to advantage one state over another (as in marketing campaigns) or to mitigate the diffusion of an undesirable state (as in epidemiology). most of them are static strategies based on the network structure (e.g. [8]), while others are dynamic strategies that use all the information about the current state of the system to suggest the best elements to treat. among them, the largest reduction in infectious edges (lrie) [9] proves to be the optimal greedy algorithm for resource allocation under a limited resource budget in the n-intertwined susceptible-infected-susceptible (sis) epidemic model. this model is a two-state continuous-time markov process over a network, in which a node changes state according to a transition rate that is linear in its neighbors' states.
however, sis models have been deemed too simple to describe the complexity of real-world phenomena such as the simultaneous presence of two distinct viruses spreading on the same network. in particular, two cases can be considered: in the first, an individual can be infected simultaneously by both diseases (e.g. as in the si1i2s model [1]); in the second, mutual exclusivity holds and only one infection is allowed per individual at a given time (e.g. the si1|2s model [10]). other attempts have changed the dynamical equations themselves (as in the sisa model [3, 7]). in this study, we propose the generalized largest reduction in infectious edges (glrie) strategy, which is adapted to the diffusion competition of recurrent epidemics, as well as to non-linearity and saturation in the node transition rate functions. this strategy includes the lrie strategy [9] and, as such, provides an optimal greedy approach for this more sophisticated network diffusion setting. glrie computes a node score using only local information about the state of nearby nodes. although in the present formulation the method can be applied to any two-state recurrent markov process, and is easily generalizable to more states, in this work we focus on social behaviors that can be 'healthy' or 'unhealthy', the latter with negative effects on the social environment. given a limited amount of resources, we would like to target the few key individuals so as to minimize the negative effects. apart from the mentioned habits affecting one's personal health (e.g. unhealthy diet, smoking, etc.), the recent covid-19 pandemic highlighted yet another interesting 'unhealthy' misbehavior: the disrespect of confinement under a city lock-down, or of social distancing guidelines in general.
indeed, this kind of misbehavior is a determinant factor for the reproduction rate (the infamous r_t) of an epidemic over time, and can readily be addressed by making more controls in key areas, or by using mobility and contact information at the individual level.

a graph g = (v, e) is a set of nodes v, with n = |v|, endowed with a set of edges e ⊂ v × v. it can be intuitively represented by its adjacency matrix a ∈ {0, 1}^{n×n}, where the element a_ij is 1 if (i, j) ∈ e, and 0 otherwise. without loss of generality, we refer to undirected graphs without self-loops, i.e. a = a^t and a_ii = 0, ∀i = 1, ..., n. the neighborhood of node i is the set of all nodes connected to it by a direct edge, denoted by n_i = {j ∈ v : (j, i) ∈ e}. the size of n_i equals the node degree, i.e. |n_i| = d_i = Σ_j a_ji. we also denote the indicator function by 1{·}.

the standard continuous-time homogeneous sis model describes the spread of a disease over a graph, where each node i represents an individual that can be in either the susceptible or the infected state: x_i(t) = 0 or 1, respectively. the system at time t is hence globally represented by the node state vector x(t) ∈ {0, 1}^n. the state of a specific node i evolves according to the following stochastic transition rates: x_i: 0 → 1 at rate β Σ_j a_ij x_j(t), and x_i: 1 → 0 at rate δ + ρ r_i(t), (ii.1) where the parameters β, δ are the transition rates encoding respectively the infection aggressiveness and the self-recovery capability of nodes. the epidemic control is realized by the resource allocation vector r(t) ∈ {0, 1}^n, whose coordinate r_i(t) = 1 iff we heal node i at time t, and 0 otherwise. finally, ρ is the increase in recovery rate when a node receives a resource unit (thought of as treatment).

a generic two-state recurrent model.
in this paper we study the dynamic epidemic suppression problem by first introducing the following generic two-state markovian process: node i becomes infected at rate i_i(x(t)) and recovers at rate h_i(x(t)), (ii.2) where i_i and h_i are two node-specific memoryless functions, respectively the infection rate function and the recovery rate function for node i. the rate functions depend on the current overall network state x(t) and implicitly on the network structure (we omit this dependency in our notation). a markovian poisson process can be recovered using the rate functions of eq. (ii.2).

in the dynamic resource allocation (dra) problem [9, 11], the objective is to administer a budget of b treatment resources, each of strength ρ, in order to suppress an undesired state's diffusion. the treatments cannot be stored and their efficiency is limited to a certain value. in [9], a greedy dynamic score-based strategy called largest reduction of infectious edges (lrie) is developed in order to address the dra problem. specifically, each node is associated with a score quantifying how critical it is for further spreading the infection in the standard sis model, eq. (ii.1). other score-based solutions have been proposed, e.g. based on fixed priority planning [11], or static ones based on spectral analysis [8] (see details in sec. iv).

the proposed generalized largest reduction in infectious edges (glrie) strategy each time identifies and targets the most critical nodes in order to reduce the disease as quickly as possible. the idea generalizes the one introduced in [9], extending it to a wider range of models. let n_i(t) := Σ_i x_i(t) be the number of infected nodes at time t. in a markovian setting, given the state of the population x at time t, the best intervention with respect to the resource allocation vector r would minimize a cost function based on the expected number of infected nodes, where γ can be chosen so as to give emphasis on short-term effects [12, 13].
expanding the cost function in series with respect to u yields three terms; their detailed evaluation can be found in the supplementary material, and here we present only the final results. to simplify our notation, we denote the updated transition rates of node j, if node i is considered healthy, by i_j^{-i} and h_j^{-i} for the negative and the positive diffusion respectively, and we define accordingly the differences in these rates, ∆i_j^{-i} and ∆h_j^{-i}. in the third equation, since our purpose is to minimize eq. (iii.1) with respect to r_i, we let the terms that are independent of any r_i be absorbed in the function ξ(t). the terms of the expansion provide information about the way in which healing a node affects the cost function: the first order does not provide any new information, the second order suggests something as trivial as healing only infected nodes, while the third order quantifies the contribution of healing a specific node to reducing the cost function. based on eq. (iii.4) we derive a score, eq. (iii.5), for each infected node i, with the following interpretation. we can identify two main parts: the node's own transition rate h_i + i_i, and the effect of its recovery on its neighbors. on the one hand, if a node could easily get reinfected (high i_i value) or is going to be healed rapidly anyway, by either self-recovery or the positive diffusion (high h_i value), then it is not a good candidate to invest resources on. on the other hand, if a possible node recovery would largely increase the healing rate of its infected neighbors (low ∆h_j^{-i} value), then the node is attributed a higher score; likewise, if a possible node recovery would largely decrease the infection rate of its infected neighbors (low ∆i_j^{-i} value), then the node also gets a higher score.

algorithm. at time t, the glrie strategy takes as input the network state x(t) and the budget of resources b. it independently computes the criticality score of eq.
(iii.5) for each node, ranks them, and finally marks with 1's in the resource allocation vector r(t) which nodes to target while respecting the budget, i.e. Σ_i r_i(t) = min(b, Σ_i x_i(t)). the computational cost of the algorithm is o(n² + n log n).

in this section we select competitors from the literature, define specific diffusion functions for the comparison, and present simulations on random and real networks. other strategies. as a naive baseline, we use random allocation (rand), which targets infected nodes at random. the second competitor is the largest reduction in spectral radius (lrsr) [8], which is based on spectral graph analysis generalized to arbitrary healing effects (ρ = ∞). lrsr selects nodes that maximize the eigen-drop of the largest eigenvalue of the adjacency matrix, known as the spectral radius. the next competitor is maxcut minimization (mcm) [11], which introduced the priority-planning approach. this strategy proceeds according to a precomputed node priority order, that is, a linear arrangement of the network with minimal maxcut, i.e. minimizing the maximum number of edges that need to be cut in order to split the ordering into two parts. the last but most direct competitor is the greedy dynamic lrie [9], which we generalize in this work.

diffusion function. generally, the state transition rate of a node can be assumed to be a function either of the absolute number of neighbors in the opposing state (standard for sis), or of the fraction of those nodes out of all neighbors. here we take the former option as an example, since the strategies presented in the literature consider that type, which makes for a fair comparison. future work could include additional experiments with the latter type. social behaviors have complex properties that are not covered by standard sis models, such as non-linearity and saturation in the node transition rates [7].
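the ranking-and-allocation step of the algorithm paragraph can be sketched as follows; the criticality score of eq. (iii.5) is abstracted as a caller-supplied function, since its exact expression is not reproduced in this text:

```python
def allocate_resources(x, score, budget):
    """Greedy score-based allocation: rank the currently infected nodes by
    their criticality score and give one resource unit to each of the top
    min(budget, #infected) nodes, so that sum_i r_i = min(B, sum_i x_i)."""
    infected = [i for i, s in x.items() if s == 1]
    k = min(budget, len(infected))
    chosen = set(sorted(infected, key=score, reverse=True)[:k])
    return {i: 1 if i in chosen else 0 for i in x}
```

calling this at every decision epoch with the current state x(t) reproduces the dynamic, budget-respecting behaviour described above.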
we employ sigmoid functions to model these properties, where n and d − n are the numbers of infected and healthy neighbors, the parameter s_i (resp. s_h) controls the saturation level, and ε_i (resp. ε_h) the slope at the origin. next, we present comparative experiments on erdős–rényi (er), preferential attachment (pr) and small-world (sw) random networks of 300 nodes each. first we gradually introduce non-linearity in the diffusion, and then we show the effect of also introducing competition in the diffusion.

from linear to non-linear spreading. we first consider only the negative diffusion (i.e. h = 0) and gradually increase the slope of i in an er random graph, moving gradually from linear functions (as in the standard sis model) to non-linear ones. fig. 1 shows the percentage of infected nodes over time, averaged over 1,000 simulations, with the 95% confidence interval under the hypothesis of a gaussian distribution. the results show that in the presence of non-linearity our strategy becomes much more efficient than the competitors.

introducing competition. next, in fig. 2 we present the effects of the positive diffusion, embedded in the function h, on er, pr and sw random networks. the last plot of each row shows the shape of the diffusion functions used in the simulations. the simulations show that, unlike glrie, the methods from the literature lack the modeling power to deal with this complex setting involving non-linearity and competition, and fail to suppress the infection.

we performed simulations on the gnutella peer-to-peer network, containing 8,846 nodes and 31,839 edges. two scenarios were used for the simulations, with and without positive diffusion, using a wide range of parameters. out of the many possible evaluation metrics for the quality of a strategy, e.g. expected extinction time (eet), final percentage of infection (fis), and area under the curve (auc), we choose the auc.
the auc has many advantages: it provides useful measurements even if the strategy did not remove the infection, which is a limitation of the eet metric; and it accounts for the total amount of infected nodes over the process, which in a socioeconomic context is more interesting than the fis metric. the empirical comparison between glrie and competitors such as lrie and mcm is summarized in fig. 3. in each heatmap, we fix the shape of the transition function h of the positive diffusion and vary only the parameters of the function i of the negative diffusion: its saturation level increases along the x-axis and its slope along the y-axis. on the top-left side of a heatmap, the epidemic parameters define a weak infection and any strategy performs well, while on the bottom-right side the infection becomes hard to remove completely for all strategies (given the amount of resources). moreover, at the left border, the low saturation level causes i to saturate already with just one neighbor of the opposing state. in the regime where i is almost linear and h = 0, glrie and lrie are equivalent and perform almost the same. the general remark on the results is that glrie appears to be the most versatile and best-performing strategy in this setting of competitive spreading.
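two ingredients of this experimental setup can be sketched in code: a saturating transition-rate function with tunable slope and saturation (the tanh form is an assumption; the paper's exact sigmoid is not reproduced here), and the trapezoidal auc of the infection curve:

```python
import math

def sigmoid_rate(n, slope, saturation):
    """Hypothetical saturating rate: grows with slope `slope` at the origin
    and tends to `saturation` as the number n of opposing neighbours grows."""
    return saturation * math.tanh(slope * n / saturation)

def auc(times, infected):
    """Trapezoidal area under the infection curve; well defined even when the
    strategy never extinguishes the epidemic (unlike the EET metric)."""
    return sum((t1 - t0) * (y0 + y1) / 2.0
               for (t0, t1), (y0, y1) in zip(zip(times, times[1:]),
                                             zip(infected, infected[1:])))
```

with a low `saturation`, the rate already saturates at one opposing neighbour, matching the left-border regime of the heatmaps.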
in this paper we discussed a general form of recurrent two-state continuous-time markov process that allows both non-linear node transition functions and competition between the two states. we then proposed the generalized lrie (glrie) strategy to suppress the diffusion of the undesired state. experiments showed that glrie is well adapted to the considered setting of competitive spreading, and makes better use of the available resources, compared to competitors from the literature, by targeting the most critical infected nodes. future work could generalize to more competing epidemic states and incorporate factors related to the network structure into the node scores.

[1] winner takes all: competing viruses or ideas on fair-play networks; [2] the dynamics of viral marketing; [3] emotions as infectious diseases in a large social network: the sisa model; [4] the spread of obesity in a large social network over 32 years; [5] the collective dynamics of smoking in a large social network; [6] the spread of alcohol consumption behavior in a large social network; [7] infectious disease modeling of social contagion in networks; [8] on the vulnerability of large graphs; [9] a greedy approach for dynamic control of diffusion processes in networks; [10] interacting viruses in networks: can both survive?; [11] suppressing epidemics in networks using priority planning; [12] optimal control of epidemics in metapopulations; [13] optimizing the control of disease infestations at the landscape scale.

key: cord-102394-vk4ag44m authors: zhang, hai-feng; xie, jia-rong; chen, han-shuang; liu, can; small, michael title: impact of asymptomatic infection on coupled disease-behavior dynamics in complex networks date: 2016-08-14 journal: nan doi: 10.1209/0295-5075/114/38004 sha: doc_id: 102394 cord_uid: vk4ag44m

studies of how to model the interplay between diseases and behavioral responses (so-called coupled disease-behavior interaction) have attracted increasing attention. owing to the lack of obvious clinical evidence of disease, or to incomplete information related to the disease, the risks of infection may not be perceived, which can lead to inappropriate behavioral responses. therefore, how to quantitatively analyze the impact of asymptomatic infection on the interplay between diseases and behavioral responses is of particular importance. in this letter, under the complex network framework, we study a coupled disease-behavior interaction model by dividing infectious individuals into two states: the u-state (without evident clinical symptoms, labelled u) and the i-state (with evident clinical symptoms, labelled i).
a susceptible individual can be infected by u- or i-nodes; however, since u-nodes cannot easily be observed, susceptible individuals take behavioral responses only when they contact i-nodes. this mechanism is considered in an improved susceptible-infected-susceptible (sis) model and an improved susceptible-infected-recovered (sir) model, respectively. then, one of the central quantities in spreading dynamics, the epidemic threshold, is obtained for each of the two models by two methods. the analytic results quantitatively describe the influence of different factors, such as asymptomatic infection, the awareness rate and the network structure, on the epidemic thresholds. moreover, because of the irreversible process of the sir model, the suppression effect in the improved sir model is weaker than in the improved sis model.

introduction. - many epidemic models have been proposed to enhance our understanding of infectious disease dynamics [1]; however, these mathematical models were often established with static parameters. in reality, an outbreak of an infectious disease can trigger behavioral responses towards the disease, which can further affect the epidemic dynamics. that is to say, the parameters in epidemic models should not be static but dynamic [2]. therefore, how to establish coupled disease-behavior interaction models to evaluate the interplay between disease dynamics and behavioral responses is becoming an active field [2] [3] [4] [5] [6]. there are several key challenges that should be answered in this field [2]: how to incorporate behavioural changes in models of infectious disease dynamics; how to inform measurement of relevant behaviour to parameterise such models; and how to determine the impact of behavioural changes on observed disease dynamics. along these lines, some researchers have already obtained meaningful results. for example, funk et al.
[7] have revealed that in a well-mixed population, awareness of an epidemic can lead to a lower prevalence, but cannot alter the epidemic threshold. kiss et al. have investigated the impact of information transmission on epidemic outbreaks, and found that the infection can be eradicated if the dissemination of information is fast enough [8]. perra et al. incorporated self-initiated social distancing into the classical sir model, and found a rich phase space with multiple epidemic peaks and tipping points [9]. these earlier epidemic models were established for well-mixed populations; however, the transmission of many infectious diseases requires direct or close contact between individuals. as a result, network-based epidemic models have been extensively investigated [10, 11]. in particular, studies of how to characterize the interplay between epidemic dynamics and behavioral dynamics within the network framework have recently attracted much attention [4, 12, 13]. more importantly, many new and interesting results are revealed when such models are considered on complex networks. for instance, refs. [12, 14, 15] have demonstrated that, under a voluntary vaccination mechanism, degree heterogeneity of the network can trigger a broad spectrum of individual vaccinating behavior, where hub nodes are most likely to choose to be vaccinated, since they are at greatest risk of infection. some authors have also shown that local-information-based behavioral responses can enhance the epidemic threshold and reduce the prevalence of an epidemic [16, 17], whereas global-information-based responses cannot alter the epidemic threshold but do affect the prevalence of an epidemic [13, 17]. since the manner of diffusion of awareness is quite different from the mechanism of epidemic spreading, coupled disease-behavior interaction models on multiplex networks have also been investigated [18] [19] [20].
in addition, individuals may change their connections (remove or rewire them) when facing the outbreak of an epidemic, so epidemic dynamics on adaptive networks were considered, and some interesting phenomena, such as assortative degree correlation of the evolving network, oscillations, hysteresis and first-order transitions, can be observed [21] [22] [23] . on one hand, for many infectious diseases, such as h1n1 influenza [24] , severe acute respiratory syndrome (sars) [25] , and human immunodeficiency virus (hiv) [26] , even once individuals have been infected, they may show no evident clinical symptoms, i.e., they are asymptomatic patients; on the other hand, it is difficult for individuals to obtain timely and accurate information related to the diseases. asymptomatic infection and incomplete information can affect the behavioral responses towards diseases. thus, to quantitatively analyze the effects of asymptomatic infection on the epidemic dynamics, in this letter we introduce a new compartment, the u-state (individuals without evident clinical symptoms, i.e., asymptomatic patients; note that the u-state is different from the e-state in the standard seir model, where the e-state is asymptomatic but not infectious, whereas the u-state is asymptomatic and infectious), into coupled disease-behavior interaction models. in the model, we assume that susceptible individuals can be infected by both asymptomatic and symptomatic patients. nevertheless, since individuals cannot perceive the risks from asymptomatic patients, they alter their behaviors only when they contact symptomatic patients. this mechanism is then incorporated into the improved sis model and the improved sir model, respectively. the epidemic thresholds for the two cases are analytically obtained and also verified by numerical simulations. the analytical results can quantitatively describe the impacts of different factors on the epidemic threshold.
in addition, we find that the parameters used in the models have a more significant impact on the sis model than on the sir model, owing to the reversible process of the sis model. our findings may partially answer the key challenges, mentioned in the first paragraph, proposed in [2] . descriptions of the model. -sauis model. in the classical sis process in complex networks, each node can be in one of two states: susceptible (s) or infected (i). the infection rate along each si link is β, and an infected node can go back to the s-state with a recovery rate µ. in our improved sis model (named the sauis model), the i-nodes are divided into two different states: the u-state (asymptomatic i-nodes) and the i-state (symptomatic i-nodes). all newly infected nodes first go to the u-state and then enter the i-state with rate δ. a larger value of δ means a faster transition from u-state to i-state. a new compartment, the awareness (a) state, is also introduced to describe the behavioral responses of the s-nodes. in detail, an s-node can be infected by a u-neighbor or an i-neighbor with infection rate β, and the s-node may go to the a-state when s/he contacts an i-neighbor, with awareness rate β f . an a-node can also be infected by a u-neighbor or an i-neighbor, but with a lower infection rate γβ, where 0 ≤ γ < 1 is a discount factor. i-nodes recover to the s-state with recovery rate µ. the transition diagram is depicted in the upper panel of fig. 1 . sauir model. similar to the sauis model, an improved sir model (named the sauir model) is used to mimic the coupled disease-behavior interaction. the main difference between the sauir model and the sauis model is that, in the sauir model, i-nodes recover to r-nodes with recovery rate µ and cannot be infected again. therefore, the sauir model is an irreversible process whereas the sauis model is a reversible one, which calls for different theoretical methods to deal with their epidemic thresholds and yields different results.
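the sauis transition rules above can be sketched as a discrete-time monte carlo simulation. the sketch below is illustrative only: the graph builder and function names are assumptions, and the rates β, β f , γ, δ, µ are used directly as per-time-step probabilities, which is a reasonable approximation only when they are small.

```python
import random

def er_graph(n, p, seed=0):
    """illustrative erdos-renyi adjacency-list builder."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def simulate_sauis(adj, beta, beta_f, gamma, delta, mu, steps, seed=0):
    """discrete-time sketch of the sauis dynamics on a given graph."""
    rng = random.Random(seed)
    state = ['S'] * len(adj)
    state[rng.randrange(len(adj))] = 'U'        # single asymptomatic seed
    for _ in range(steps):
        new = state[:]
        for v, s in enumerate(state):
            if s in ('S', 'A'):
                # a-nodes are infected at the discounted rate gamma * beta
                rate = beta if s == 'S' else gamma * beta
                infected = any(state[w] in ('U', 'I') and rng.random() < rate
                               for w in adj[v])
                if infected:
                    new[v] = 'U'                # newly infected nodes enter u-state
                elif s == 'S' and any(state[w] == 'I' for w in adj[v]) \
                        and rng.random() < beta_f:
                    new[v] = 'A'                # awareness triggered by an i-neighbor
            elif s == 'U' and rng.random() < delta:
                new[v] = 'I'                    # symptoms appear
            elif s == 'I' and rng.random() < mu:
                new[v] = 'S'                    # sis-type recovery
        state = new
    return state
```

prevalence curves are obtained by recording the fraction of u- and i-nodes at each step; replacing the last rule so that i-nodes move to an inert 'R' state gives the sauir variant.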
the transition diagram of the sauir model is depicted in the lower panel of fig. 1 . (fig. 1 caption: an s-node can be infected by a u- or i-neighbor with rate β and becomes a u-node. an s-node can also become aware of infection and go to the a-state with rate β f when contacting an i-neighbor. an a-node can be infected by a u- or i-neighbor with rate γβ and becomes a u-node. u-nodes become i-nodes with rate δ. in the sauis model, i-nodes recover to s-nodes with rate µ; in the sauir model, i-nodes recover to r-nodes with rate µ and are never infected again.) let s k (t), a k (t), u k (t) and i k (t) be the densities of s-nodes, a-nodes, u-nodes and i-nodes of degree k at time t, respectively. by applying the degree-based mean-field approach to the sauis model, the densities satisfy the following differential equations [27] :

ds k /dt = -βk s k (θ 1 + θ 2 ) - β f k s k θ 2 + µ i k , (1)
da k /dt = β f k s k θ 2 - γβk a k (θ 1 + θ 2 ), (2)
du k /dt = βk s k (θ 1 + θ 2 ) + γβk a k (θ 1 + θ 2 ) - δ u k , (3)
di k /dt = δ u k - µ i k , (4)

where θ 1 (t) and θ 2 (t) represent the probabilities that any given link points to a u-node and an i-node, respectively. in the absence of any degree correlations, θ 1 (t) = Σ k k p(k) u k (t)/⟨k⟩ and θ 2 (t) = Σ k k p(k) i k (t)/⟨k⟩, with p(k) being the degree distribution of the network and ⟨k⟩ = Σ k k p(k). the first term on the right side of eq. (1) accounts for the loss of s-nodes of degree k who are infected by u-neighbors and i-neighbors, the second term represents the s-nodes of degree k who become a-nodes when contacting i-neighbors, and the third term denotes the recovery of i-nodes of degree k. the meanings of the other terms in eqs. (2)-(4) can be explained in a similar way. at the steady state, obtained by imposing eqs. (1)-(4) to be zero, and using s k + a k + u k + i k ≡ 1, setting di k (t)/dt = 0 in eq. (4) yields i k = (δ/µ) u k . from eqs. (5)-(7), a relation between u k and θ 1 , θ 2 can be obtained, which induces a self-consistent eq. (9) for θ 2 . the value θ 2 = 0 is always a solution of eq. (9). in order for eq. (9) to have a non-zero solution, the threshold condition (10) must be satisfied, where ⟨k 2 ⟩ = Σ k k 2 p(k).
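the term-by-term description of the degree-based mean-field system can be integrated numerically; a minimal euler-stepping sketch follows (function names, the time step, and all parameter values are illustrative assumptions). by construction the four densities of each degree class always sum to one.

```python
def mean_field_step(s, a, u, i, pk, beta, beta_f, gamma, delta, mu, dt):
    """one euler step of the degree-based mean-field equations for the
    sauis model; s, a, u, i map degree k to the density of each state."""
    kavg = sum(k * p for k, p in pk.items())
    # probabilities that a randomly chosen link points to a u-node / i-node
    theta1 = sum(k * pk[k] * u[k] for k in pk) / kavg
    theta2 = sum(k * pk[k] * i[k] for k in pk) / kavg
    ns, na, nu, ni = {}, {}, {}, {}
    for k in pk:
        inf_s = beta * k * s[k] * (theta1 + theta2)          # s infected by u/i
        aware = beta_f * k * s[k] * theta2                   # s -> a via i-contact
        inf_a = gamma * beta * k * a[k] * (theta1 + theta2)  # a infected, discounted
        ns[k] = s[k] + dt * (mu * i[k] - inf_s - aware)
        na[k] = a[k] + dt * (aware - inf_a)
        nu[k] = u[k] + dt * (inf_s + inf_a - delta * u[k])
        ni[k] = i[k] + dt * (delta * u[k] - mu * i[k])
    return ns, na, nu, ni
```

iterating until the densities stop changing approximates the steady state discussed in the text.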
by defining a function f(β) from this threshold condition, the epidemic threshold β c is the point at which the curve of f(β) just crosses the horizontal line, when the other parameters are fixed (see the insets of fig. 2 and fig. 3 ). epidemic threshold of the sauir model. for our sauir model, on one hand, the epidemic threshold is very difficult or impossible to obtain by solving the mean-field based differential equations; on the other hand, in ref. [28] , we have demonstrated that the mean-field method may yield incorrect results when it is used in a modified sir model. instead, cavity theory is used to obtain the epidemic threshold for the sauir model [29] . in our model, during a sufficiently small time interval dt, the transition rates of an su edge becoming a uu edge and an si edge are β and δ, respectively. as a result, the probabilities of an su edge becoming a uu edge and an si edge are t 11 = β/(δ+β) and t 12 = δ/(δ+β), respectively. similarly, the transition rates of an si edge becoming an ai, ui, or sr edge are β f , β and µ, respectively; therefore, the corresponding probabilities of si → ai, si → ui and si → sr are t 21 = β f /(β+β f +µ), t 22 = β/(β+β f +µ) and t 23 = µ/(β+β f +µ). in addition, the probabilities of ai → ui and ai → ar are t 31 = γβ/(γβ+µ) and t 32 = µ/(γβ+µ), respectively. near the epidemic threshold the number of infected nodes is very small, so that, statistically, each node has at most one infected neighbor. under such a situation, only the following infection events can happen: 1) an s-node is infected by a u-node or an i-node; or 2) an s-node becomes an a-node and is then infected by the same i-node. the probability of an a-node being infected by a u-neighbor is negligible: for an a-node to exist, there must be an i-neighbor that made the s-node become an a-node, and since each node has at most one infectious neighbor, there can be no u-neighbor as well, making the probability of the au → uu event negligible.
therefore the total infection probability t can be composed from these edge probabilities. following the method proposed by newman et al. [29] , we first define an "externally infected neighbor" (ein) for any node: for a node i, a neighbor j is an ein if node j is infected by a neighbor other than i. let θ be the probability that a randomly selected neighbor j of any node i is an ein of i. near the epidemic threshold the number of infected nodes is very small, indicating that the value of θ is very small too. for a node i, the excess degree distribution of a neighbor j is q(k) = (k+1)p(k+1)/⟨k⟩. since the probability of each neighbor of j (excluding node i) being an ein is also θ, and each ein neighbor of j can infect j itself with probability t, the probability of j being infected through its eins is kt θ to leading order; then, by summing over all degree classes with q(k), the probability of node j being an ein of node i is determined by a self-consistent equation with g 1 (x) = Σ k q(k)x k . from eq. (12) we have the threshold condition t g ′ 1 (1) = 1, which gives the condition that should be satisfied for the outbreak of an epidemic. similarly, by defining a function f(β) from this condition, the epidemic threshold β c is determined by the intersection of the curve of f(β) with the horizontal line, when the other parameters are fixed. simulation results. -in this section, we perform an extensive set of monte carlo simulations to validate the theoretical predictions of the previous sections. here we implement the simulations on configuration networks generated by the uncorrelated configuration model (ucm) [30] (we also implemented the models on erdős-rényi networks, and found the same results as on the ucm networks [31] ). the network contains n = 10000 nodes and the degree distribution follows p(k) ∼ k −3 , with minimal and maximal degrees k min = 3 and k max = √n, respectively. in ref. [32] , the susceptibility measure is defined by ferreira et al.
to numerically predict the epidemic threshold for the sis model; it takes the form χ = n(⟨ρ 2 ⟩ − ⟨ρ⟩ 2 )/⟨ρ⟩, where ρ denotes the prevalence of the epidemic in one simulation realization. the peak of χ corresponds to the epidemic threshold. in ref. [33] , shu et al. proved that the susceptibility measure is not a good measure to determine the epidemic threshold of the sir model; they then defined the variability measure ∆ = √(⟨ρ 2 ⟩ − ⟨ρ⟩ 2 )/⟨ρ⟩ to numerically determine the epidemic threshold of the sir model. their results suggest that the variability measure can predict the epidemic threshold of the sir model well. in view of this, we use these two measures to numerically determine the epidemic thresholds of the sis model and the sir model, respectively. in our simulations, we have taken at least 1000 independent realizations to predict the epidemic threshold. without loss of generality, in this study we set the recovery rate µ = 1.0. for the sauis model, the final infected density i(∞) and the susceptibility measure χ are plotted as functions of β for different cases in fig. 2 . in general, by comparing the top panels with the bottom panels, one can see that χ predicts the epidemic threshold β c well. the theoretical values of β c obtained from eq. (10) for the different cases are given in the insets, which indicate that the theoretical results are in good agreement with the simulation results. since increasing the awareness rate β f can induce more s-nodes to become a-nodes, the effective infection rate is reduced. therefore, as shown in figs. 2(a) and (d), increasing β f can effectively reduce the final epidemic size and enhance the epidemic threshold. also, lowering the infection rate of a-nodes (smaller values of γ) significantly reduces the epidemic size and enhances the epidemic threshold (figs. 2(b) and (e)). in particular, the effects of u-nodes on i(∞) and β c are presented in the remaining panels of fig. 2 , where the epidemic is increasingly suppressed as δ is increased.
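assuming the standard forms of these two numerical threshold detectors, χ = n(⟨ρ²⟩ − ⟨ρ⟩²)/⟨ρ⟩ from ferreira et al. [32] and ∆ = √(⟨ρ²⟩ − ⟨ρ⟩²)/⟨ρ⟩ from shu et al. [33], both reduce to simple moment computations over the per-realization prevalences; the function names below are illustrative.

```python
import math

def susceptibility(rhos, n):
    """chi = n * (<rho^2> - <rho>^2) / <rho> over realization prevalences."""
    m1 = sum(rhos) / len(rhos)
    m2 = sum(r * r for r in rhos) / len(rhos)
    return n * (m2 - m1 * m1) / m1

def variability(rhos):
    """delta = sqrt(<rho^2> - <rho>^2) / <rho>."""
    m1 = sum(rhos) / len(rhos)
    m2 = sum(r * r for r in rhos) / len(rhos)
    return math.sqrt(max(m2 - m1 * m1, 0.0)) / m1

def threshold_estimate(beta_to_rhos, measure):
    """the beta value whose realizations maximize the chosen measure."""
    return max(beta_to_rhos, key=lambda b: measure(beta_to_rhos[b]))
```

scanning a grid of β values with many realizations each and locating the peak of χ (sis-type) or ∆ (sir-type) gives the numerical threshold compared against the theory in the insets.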
with the increase of δ, u-nodes cannot persist for a long time and quickly enter the i-state, so s-nodes have more chances to become a-nodes, because i-nodes are easily perceived by s-nodes. hence, we can understand why increasing the value of δ can effectively suppress the outbreak of epidemics. we can also conclude that the existence of asymptomatic patients or untimely information can hinder the behavioral responses of people, which weakens the suppression effect of behavioral responses on disease control. for the sauir model, the final infection density r(∞) and the variability measure ∆ versus the transmission rate β for different cases are summarized in fig. 3 . obviously, the peak of ∆ gives an accurate estimate of the epidemic threshold β c , and the theoretical values from eq. (14) (the insets) are in accordance with the numerical results. moreover, just like the results in fig. 2 , increasing the values of β f and δ, or reducing the value of γ, can reduce the final epidemic size and enhance the epidemic threshold. nevertheless, by comparing fig. 2 with fig. 3 , we find that the suppression of epidemics by behavioral responses in the sauir model is weaker than in the sauis model, especially regarding the impact of β f (figs. 3(a) and (d)) and γ (figs. 3(b) and (e)). in the sauir model, the epidemic process ends quickly owing to the irreversible nature of the model, which leaves individuals insufficient time to take behavioral responses. conclusions. -in this letter we have studied a coupled disease-behavior interaction model in complex networks by dividing infectious individuals into asymptomatic (u-state) and symptomatic (i-state) individuals. the epidemic thresholds for the improved sis model and sir model were then obtained using different theoretical methods. the analytic results for the epidemic thresholds show exactly how great an impact the u-state, the network structure, the awareness rate, and so forth, have on the epidemic dynamics.
in addition, because of the irreversible process of the sauir model, the suppression effect of behavioral responses on disease control is not as good as in the sauis model. our findings provide a typical example that emphasizes the importance of incorporating human behavioral responses into epidemic models, and also partially offer a theoretical tool to quantify the impacts of behavioral responses. * * * key: cord-328875-fgeudou6 authors: leung, alexander k. c.; davies, h. dele title: cervical lymphadenitis: etiology, diagnosis, and management date: 2009-04-18 journal: curr infect dis rep doi: 10.1007/s11908-009-0028-0 sha: doc_id: 328875 cord_uid: fgeudou6 cervical lymphadenopathy is a common problem in children. the condition most commonly represents a transient response to a benign local or generalized infection. acute bilateral cervical lymphadenitis is usually caused by a viral upper respiratory tract infection or streptococcal pharyngitis. acute unilateral cervical lymphadenitis is caused by streptococcal or staphylococcal infection in 40% to 80% of cases. common causes of subacute or chronic lymphadenitis include cat-scratch disease and mycobacterial infection. generalized lymphadenopathy is often caused by a viral infection, and less frequently by malignancies, collagen vascular diseases, and medications. laboratory tests are not necessary in most children with cervical lymphadenopathy. most cases of cervical lymphadenitis are self-limited and require no treatment. the treatment of acute bacterial cervical lymphadenitis without a known primary source should provide adequate coverage for both staphylococcus aureus and streptococcus pyogenes. enlarged cervical lymph nodes are common in children [ 1 ] . about 38% to 45% of otherwise normal children have palpable cervical lymph nodes [ 2 ] .
cervical lymphadenopathy is usually defined as cervical lymph nodal tissue measuring more than 1 cm in diameter [ 3 ] . cervical lymphadenopathy most commonly represents a transient reactive response to a benign local or generalized infection, but occasionally it might herald the presence of a more serious disorder (eg, malignancy). lymphadenitis specifically refers to lymphadenopathies that are caused by inflammatory processes [ 4•• ] . this article reviews the pathophysiology, etiology, differential diagnosis, clinical and laboratory evaluation, and management of children with cervical lymphadenitis. the superficial cervical lymph nodes lie on top of the sternomastoid muscle and include the anterior group, which lies along the anterior jugular vein, and the posterior group, which lies along the external jugular vein [ 4•• ] . the deep cervical lymph nodes lie deep to the sternomastoid muscle along the internal jugular vein and are divided into superior and inferior groups. the superior deep nodes lie below the angle of the mandible, whereas the inferior deep nodes lie at the base of the neck. the superficial cervical lymph nodes receive afferents from the mastoid, tissues of the neck, and the parotid (preauricular) and submaxillary nodes [ 4•• ] . the efferent drainage terminates in the superior deep cervical lymph nodes [ 4•• ] . the superior deep cervical nodes drain the palatine tonsils and the submental nodes. the lower deep cervical nodes drain the larynx, trachea, thyroid, and esophagus. offending organisms usually first infect the upper respiratory tract, anterior nares, oral cavity, or skin in the head and neck area before spreading to the cervical lymph nodes. the lymphatic system in the cervical area serves as a barrier to prevent further invasion and dissemination of these organisms.
the nodal enlargement occurs as a result of proliferation of cells intrinsic to the node (eg, lymphocytes, plasma cells, monocytes, and histiocytes) or by infiltration of cells extrinsic to the node (eg, neutrophils). because infections involving the head and neck areas are common in children, cervical lymphadenitis is common in this age group [ 5 ] . causes of cervical lymphadenopathy are listed in table 1 [ 1 ] . the most common cause is reactive hyperplasia resulting from an infectious process, typically a viral upper respiratory tract infection [ 6 , 7 ] . anaerobic bacteria can cause cervical lymphadenitis, usually in association with dental caries and periodontal disease. group b streptococci and haemophilus influenzae type b are less frequent causal organisms. diphtheria is a rare cause. bartonella henselae (cat-scratch disease), nontuberculosis mycobacteria (eg, mycobacterium avium-intracellulare, mycobacterium scrofulaceum ), and mycobacterium tuberculosis ("scrofula") are important causes of subacute or chronic cervical lymphadenopathy [ 8 ] . chronic posterior cervical lymphadenitis is the most common form of acquired toxoplasmosis and is the sole presenting symptom in 50% of cases [ 1 ] . more than 25% of malignant tumors in children occur in the head and neck, and the cervical lymph nodes are the most common site [ 1 ] . during the first 6 years of life, neuroblastoma and leukemia are the most common tumors associated with cervical lymphadenopathy, followed by rhabdomyosarcoma and non-hodgkin's lymphoma [ 1 ] . after 6 years of age, hodgkin's lymphoma is the most common tumor associated with cervical lymphadenopathy, followed by non-hodgkin's lymphoma and rhabdomyosarcoma. the presence of cervical lymphadenopathy is an important diagnostic feature for kawasaki disease.
the other features include fever lasting 5 days or more, bilateral bulbar conjunctival injection, inflammatory changes in the mucosa of the oropharynx, erythema or edema of the peripheral extremities, and polymorphous rash. generalized lymphadenopathy might be a feature of systemic-onset juvenile rheumatoid arthritis, systemic lupus erythematosus, or serum sickness. certain drugs, notably phenytoin, carbamazepine, hydralazine, and isoniazid, might cause generalized lymphadenopathy. cervical lymphadenopathy has been reported after immunization with diphtheria-pertussis-tetanus, poliomyelitis, or typhoid fever vaccine [ 1 ] . rosai-dorfman disease is a benign form of histiocytosis characterized by generalized proliferation of sinusoidal histiocytes. the disease usually manifests in the first decade of life with massive and painless cervical lymphadenopathy, often accompanied by fever, malaise, weight loss, neutrophilic leukocytosis, elevated erythrocyte sedimentation rate, and polyclonal hypergammaglobulinemia. kikuchi-fujimoto disease (histiocytic necrotizing lymphadenitis) is a benign cause of lymph node enlargement, usually in the posterior cervical triangle [ 9 ] . the condition primarily affects young females. fever, nausea, weight loss, night sweats, arthralgia, myalgia, or hepatosplenomegaly might be present. the etiology of kikuchi-fujimoto disease is unknown, but a viral cause has been implicated [ 9 ] . classical pathologic findings include patchy areas of necrosis in the cortical and paracortical areas of the enlarged lymph nodes and a histiocytic infiltrate [ 9 ] . the differential diagnosis of neck masses is different in children due to a higher incidence of infectious diseases and congenital anomalies and the relative rarity of malignancies in the pediatric age group. cervical masses in children might be mistaken for enlarged cervical lymph nodes.
in general, congenital lesions are painless and are present at birth or identified soon thereafter [ 10 ] . clinical features that may help distinguish the various conditions from cervical lymphadenopathy are as follows. the swelling of mumps parotitis crosses the angle of the jaw. on the other hand, cervical lymph nodes are usually below the mandible [ 1 ] . a thyroglossal cyst is a mass that can be distinguished by its midline location between the hyoid bone and suprasternal notch and the upward movement of the cyst when the child swallows or sticks out his or her tongue. a branchial cleft cyst is a smooth and fluctuant mass located along the lower anterior border of the sternomastoid muscle. a sternocleidomastoid tumor is a hard, spindle-shaped mass in the sternocleidomastoid muscle possibly resulting from perinatal hemorrhage into the muscle with subsequent healing by fibrosis [ 1 ] . the tumor can be moved from side to side but not upward or downward. torticollis is usually present. cervical ribs are orthopedic anomalies that are usually bilateral, hard, and immovable. diagnosis is established with a radiograph of the neck. a cystic hygroma is a multiloculated, endothelial-lined cyst that is diffuse, soft, and compressible, contains lymphatic fluid, and typically transilluminates brilliantly. a hemangioma is a congenital vascular anomaly that often is present at birth or appears shortly thereafter. the mass is usually red or bluish. a laryngocele is a soft, cystic, compressible mass that extends out of the larynx and through the thyrohyoid membrane and becomes larger with the valsalva maneuver. there might be associated stridor or hoarseness. a radiograph of the neck might show an air-fluid level in the mass. a dermoid cyst is a midline cyst that contains solid and cystic components. it seldom transilluminates as brilliantly as a cystic hygroma. a radiograph might show that it contains calcifications.
a detailed history and a thorough physical examination are essential in the evaluation of the child with cervical lymphadenopathy. age of the child. some organisms have a predilection for specific age groups. s. aureus and group b streptococci have a predilection for neonates; s. aureus , group b streptococci, and kawasaki disease for infants; viral agents, s. aureus , group a β -hemolytic streptococci, and atypical mycobacteria for children from 1 to 4 years of age; and anaerobic bacteria, toxoplasmosis, cat-scratch disease, and tuberculosis for children from 5 to 15 years of age. most children with cervical lymphadenitis are 1 to 4 years of age. the prevalence of various childhood neoplasms changes with age. in general, lymphadenopathy secondary to neoplasia increases in the adolescent age group [ 4•• ] . acute bilateral cervical lymphadenitis is usually caused by a viral upper respiratory tract infection or pharyngitis due to s. pyogenes [ 1 , 11 ] . acute unilateral cervical lymphadenitis is caused by s. pyogenes or s. aureus in 40% to 80% of cases [ 6 , 12 ] . the classical cervical lymphadenopathy in kawasaki disease is usually acute and unilateral. typically, acute suppurative lymphadenitis is caused by s. aureus or s. pyogenes [ 13 ] . subacute or chronic cervical lymphadenitis is often caused by b. henselae , toxoplasma gondii , ebv, cmv, nontuberculosis mycobacteria, and m. tuberculosis [ 1 , 11 ] . less common causes include syphilis, nocardia brasiliensis , and fungal infection. fever, sore throat, and cough suggest an upper respiratory tract infection. fever, night sweats, and weight loss suggest lymphoma or tuberculosis. recurrent cough and hemoptysis are indicative of tuberculosis. unexplained fever, fatigue, and arthralgia raise the possibility of collagen vascular disease or serum sickness. preceding tonsillitis suggests streptococcal infection. recent facial or neck abrasion or infection suggests staphylococcal infection.
periodontal disease might indicate infections caused by anaerobic organisms. a history of a cat scratch raises the possibility of b. henselae infection. a history of dog bite or scratch suggests specific causative agents such as pasteurella multocida and s. aureus . lymphadenopathy resulting from cmv, ebv, or hiv might follow a blood transfusion. the immunization status of the child should be determined. immunization-related lymphadenopathy might follow diphtheria-pertussis-tetanus, poliomyelitis, or typhoid fever vaccination. the response of cervical lymphadenopathy to specific antimicrobial therapies might help to confirm or exclude a diagnosis. lymphadenopathy might follow the use of medications such as phenytoin and isoniazid. exposure to a person with an upper respiratory tract infection, streptococcal pharyngitis, or tuberculosis suggests the corresponding disease. a history of recent travel should be sought. general malnutrition or poor growth suggests chronic disease such as tuberculosis, malignancy, or immunodeficiency. all accessible node-bearing areas should be examined to determine whether the lymphadenopathy is generalized. the nodes should be measured for future comparison [ 1 ] . fluctuation in size of the nodes suggests a reactive process, whereas relentless increase in size indicates a serious pathology [ 1 , 14 ] . tenderness, erythema, warmth, mobility, fluctuance, and consistency should be assessed. the location of involved lymph nodes often gives clues to the entry site of the organism and should prompt a detailed examination of that site. submandibular and submental lymphadenopathy is most often caused by an oral or dental infection, although this feature may also be seen in cat-scratch disease and non-hodgkin's lymphoma. acute posterior cervical lymphadenitis is classically seen in persons with rubella and infectious mononucleosis [ 1 , 11 ] .
supraclavicular or posterior cervical lymphadenopathy carries a much higher risk for malignancy than does anterior cervical lymphadenopathy. cervical lymphadenopathy associated with generalized lymphadenopathy is often caused by a viral infection. malignancies (eg, leukemia or lymphoma), collagen vascular diseases (eg, juvenile rheumatoid arthritis or systemic lupus erythematosus), and some medications are also associated with generalized lymphadenopathy. in lymphadenopathy resulting from a viral infection, the nodes are usually bilateral and soft and are not fixed to the underlying structure. when a bacterial pathogen is present, the nodes can be either unilateral or bilateral, are usually tender, might be fluctuant, and are not fixed. the presence of erythema and warmth suggests an acute pyogenic process, and fluctuance suggests abscess formation. a "cold" abscess is characteristic of infection caused by mycobacteria, fungi, or b. henselae . in patients with tuberculosis, the nodes might be matted or fluctuant, and the overlying skin might be erythematous but is typically not warm [ 8 ] . clinical features that help differentiate nontuberculosis mycobacterial cervical lymphadenitis from m. tuberculosis cervical lymphadenitis are summarized in table 2 [ 3 , 15 ] . approximately 50% of patients with lymphadenitis caused by nontuberculosis mycobacteria develop fluctuance of the lymph node and spontaneous drainage; sinus tract formation occurs in 10% of affected patients [ 4•• , 16•• ] . in lymphadenopathy resulting from malignancy, signs of acute inflammation are absent, and the lymph nodes are hard and often fixed to the underlying tissue. a thorough examination of the ears, eyes, nose, oral cavity, and throat is necessary. acute viral cervical lymphadenitis is variably associated with fever, rhinorrhea, conjunctivitis, pharyngitis, and sinus congestion [ 4•• ] .
a beefy red throat, exudate on the tonsils, petechiae on the hard palate, and a strawberry tongue suggest infection caused by s. pyogenes [ 1 ] . unilateral facial or submandibular swelling, erythema, tenderness, fever, and irritability in an infant suggest group b streptococcal infection [ 13 ] . diphtheria is associated with edema of the soft tissues of the neck, often described as a "bull-neck" appearance. the presence of gingivostomatitis suggests infection with hsv, whereas herpangina suggests infection with coxsackievirus [ 11 ] . rash and hepatosplenomegaly suggest ebv or cmv infection [ 4•• ] . the presence of pharyngitis, maculopapular rash, and splenomegaly suggests ebv infection [ 17 ] . conjunctivitis and koplik spots are characteristics of rubeola. the presence of pallor, petechiae, bruises, and sternal tenderness suggests leukemia. laboratory tests are not necessary in most children with cervical lymphadenopathy. a complete blood cell count might help to suggest a bacterial lymphadenitis, which is often accompanied by leukocytosis with a shift to the left and toxic granulations. atypical lymphocytosis is prominent in infectious mononucleosis [ 17 ] . pancytopenia, leukocytosis, or the presence of blast cells suggests leukemia. the erythrocyte sedimentation rate and c-reactive protein are usually significantly elevated in persons with bacterial lymphadenitis. blood culture should be obtained if the child appears toxic. a rapid streptococcal antigen test or a throat culture might be useful to confirm a streptococcal infection [ 18 ] . an electrocardiogram and echocardiogram are indicated if kawasaki disease is suspected. skin tests for tuberculosis should be performed in patients with subacute or chronic adenitis. chest radiography should be performed if the tuberculin skin test is positive or if an underlying chest pathology is suspected, especially in the child with chronic or generalized lymphadenopathy. serologic tests for b.
henselae , ebv, cmv, brucellosis, syphilis, and toxoplasmosis should be performed when indicated. if the serology is positive, the diagnosis can be established and excision biopsy can be avoided [ 19• ] . ultrasonography (us) is the most useful diagnostic imaging modality in the assessment of cervical lymph nodes. us may help to differentiate a solid mass from a cystic mass and to establish the presence and extent of suppuration or infiltration. high-resolution and color us can provide detailed information on the longitudinal and transverse diameter, morphology, texture, and vascularity of the lymph node [ 4•• , 14 ] . a long-to-short axis ratio greater than 2 suggests benignity, whereas a ratio less than 2 suggests malignancy [ 14 ] . in lymphadenitis caused by an inflammatory process, the intranodal vasculature is dilated, whereas in lymphadenopathy secondary to neoplastic infiltration, the intranodal vasculature is usually distorted. absence of an echogenic hilus and overall lymph node hyperechogenicity are suggestive of malignancy [ 20• ] . us can also be used to guide core-needle biopsy for diagnosing the cause of cervical lymphadenopathy in patients without known malignancy and may obviate unnecessary excisional biopsy [ 21• ] . advantages of us include cost-effectiveness, noninvasiveness, and absence of radiation hazard. a potential drawback is its lack of absolute specificity and sensitivity in ruling out neoplastic processes as the cause of lymphadenopathy [ 4•• ] . diffusion-weighted mri with apparent diffusion coefficient mapping can be helpful to differentiate malignant from benign lymph nodes and delineate the solid, viable part of the lymph node for biopsy [ 22 ] . the technique also allows detection of small lymphadenopathies. fine-needle aspiration and culture of a lymph node is a safe and reliable procedure to isolate the causative organism and to determine the appropriate antibiotic when bacterial infection is the cause [ 23 ] .
failure to improve or worsening of the patient's condition while on antibiotic treatment is an indication for fine-needle aspiration and culture [4••]. all aspirated material should be sent for gram and acid-fast stain and cultures for aerobic and anaerobic bacteria, mycobacteria, and fungi [4••, 24]. if the gram stain is positive, only bacterial cultures are mandatory. polymerase chain reaction testing is a fast and useful technique for the demonstration of mycobacterial dna fragments [15]. an excisional biopsy with microscopic examination of the lymph node might be necessary to establish the diagnosis if symptoms or signs of malignancy are present or if the lymphadenopathy persists or enlarges in spite of appropriate antibiotic therapy and the diagnosis remains in doubt [5]. the biopsy should be performed on the largest and firmest node that is palpable, and the node should be removed intact with the capsule [1, 10]. treatment of cervical lymphadenopathy depends on the underlying cause. most cases are self-limited and require no treatment other than observation. this applies especially to small, soft, and mobile lymph nodes associated with upper respiratory infections, which are often viral in origin. these children require follow-up in 2 to 4 weeks. the treatment of acute bacterial cervical lymphadenitis without a known primary infectious source should provide adequate coverage for both s. aureus and s. pyogenes, pending the results of the culture and sensitivity tests [5]. appropriate oral antibiotics include cloxacillin, cephalexin, cefprozil, or clindamycin [6]. children with cervical lymphadenopathy and periodontal or dental disease should be treated with clindamycin or a combination of amoxicillin and clavulanic acid, which provide coverage for anaerobic oral flora [6, 25]. referral to a pediatric dentist for treatment of the underlying periodontal or dental disease is warranted.
antimicrobial therapy may have to be modified once a causative agent is identified, depending on the clinical response to the existing treatment. because of its proven efficacy, safety, and narrow spectrum of antimicrobial activity, penicillin remains the drug of choice for adenitis caused by s. pyogenes, except in patients allergic to penicillin [7]. methicillin-resistant s. aureus is resistant to many kinds of antibiotics. currently, vancomycin is the drug of choice for complicated cases, although trimethoprim-sulfamethoxazole or clindamycin is often adequate for uncomplicated outpatient management [26]. in most patients, symptomatic improvement should be noted after 48 to 72 hours of therapy. fine-needle aspiration and culture should be considered if there is no clinical improvement or if the patient's condition deteriorates. if the lymph nodes become fluctuant, incision and drainage should be performed. failure of regression of lymphadenopathy after 4 to 6 weeks might be an indication for a diagnostic biopsy [12]. indications for early excision biopsy for histology include a lymph node in the supraclavicular area, a lymph node larger than 3 cm, lymph nodes in children with a history of malignancy, and clinical findings of fever, night sweats, weight loss, and hepatosplenomegaly [19•]. toxic or immunocompromised children and those who do not tolerate, will not take, or fail to respond to oral medication should be treated with intravenous nafcillin, cefazolin, or clindamycin [6]. oral analgesia with medication such as acetaminophen might help to relieve associated pain. the current recommendation for the treatment of isolated cervical tuberculosis lymphadenitis is 2 months of isoniazid, rifampin, and pyrazinamide, followed by 4 months of isoniazid and rifampin by directly observed therapy for drug-susceptible m. tuberculosis [27].
if possible drug resistance is a concern, ethambutol or an aminoglycoside should be added to the initial three-drug combination until drug susceptibilities are determined, and an infectious disease specialist should be consulted [27]. nontuberculous mycobacterial lymphadenitis is best treated with surgical excision of all visibly infected nodes [16••]. a recent randomized, controlled trial enrolled 100 children with nontuberculous cervical adenitis to receive surgical excision (n = 50) or antibiotic therapy with clarithromycin and rifabutin (n = 50) [16••]. based on intention-to-treat analysis, the surgical cure rate was 96% versus 66% in the medical arm after 6 months. furthermore, there were more adverse events in the medical arm. however, the major complication of surgery is permanent damage to the facial nerve, which occurred in about 2% of patients. transient facial nerve involvement occurred in another 12% [16••]. thus, careful consideration must be given to the location of the adenitis in the determination of node removal. when surgery is not feasible due to risk to the facial nerve, a two-drug antimycobacterial regimen that includes a macrolide should be considered [6, 16••, 28]. failure of medical therapy usually cannot be explained as a result of development of resistant organisms [16••]. cervical lymphadenopathy is a common and usually benign finding in children. in most cases, it is infectious in origin secondary to a viral upper respiratory tract infection. a good history and thorough physical examination are usually all that is necessary to establish a diagnosis. most children with cervical lymphadenopathy require no specific treatment, but do need follow-up in 2 to 4 weeks. the treatment of acute bacterial cervical lymphadenitis without a known primary infectious source should provide adequate coverage for both s. aureus and s. pyogenes.
papers of particular interest, published recently, have been highlighted as: • of importance; •• of major importance.
childhood cervical lymphadenopathy
palpable lymph nodes of the neck in swedish schoolchildren
lymphadenopathy, lymphadenitis and lymphangitis
acute, subacute, and chronic cervical lymphadenitis in children (this is an excellent article that addresses the current approaches to the diagnosis and management of cervical lymphadenitis in children)
cervical lymphadenopathy in children
cervical lymphadenopathy and adenitis
group a β-hemolytic streptococcal pharyngitis in children
mycobacterial cervical lymphadenitis in children: clinical and laboratory factors of importance for differential diagnosis
kikuchi's disease: an important cause of cervical lymphadenopathy
assessment of lymphadenopathy in children
cervical lymphadenitis, suppurative parotitis, thyroiditis, and infected cysts
cervical lymphadenopathy
management of common head and neck masses
cervical lymphadenopathy in children: incidence and diagnostic management
mycobacterial cervical lymphadenitis
surgical excision versus antibiotic treatment for nontuberculous mycobacterial cervicofacial lymphadenitis in children: a multicenter, randomized, controlled trial (this multicenter, randomized, controlled trial compared surgical excision versus antibiotic treatment for nontuberculous mycobacterial cervicofacial lymphadenitis in children)
infectious mononucleosis
rapid antigen detection testing in diagnosing group a β-hemolytic streptococcal pharyngitis
a child with cervical lymphadenopathy (this excellent article offers practical guidelines on the management of childhood cervical lymphadenopathy)
ultrasonography of abnormal neck lymph nodes
sonographically guided core needle biopsy of cervical lymphadenopathy in patients without known malignancy (this retrospective study showed a high yield and accuracy of sonographically guided core-needle biopsy for diagnosing the cause of cervical lymphadenopathy)
tawfik a: role of
diffusion-weighted mr imaging in cervical lymphadenopathy
fine needle aspiration in the evaluation of children with lymphadenopathy
cervical lymphadenopathy in children
microbiology of cervical lymphadenitis in adults
methicillin-resistant staphylococcus aureus: how best to treat now?
report of the committee on infectious diseases
lymphadenitis due to nontuberculous mycobacteria in children: presentation and response to therapy
this article was published in part by leung and robson [1] in the journal of pediatric health care, with permission from elsevier. it has been significantly updated for the current article. no potential conflicts of interest relevant to this article were reported.
key: cord-010739-28qfmj9x authors: sherborne, n.; blyuss, k. b.; kiss, i. z. title: bursting endemic bubbles in an adaptive network date: 2018-04-09 journal: nan doi: 10.1103/physreve.97.042306 sha: doc_id: 10739 cord_uid: 28qfmj9x
the spread of an infectious disease is known to change people's behavior, which in turn affects the spread of disease. adaptive network models that account for both epidemic and behavioral change have found oscillations, but in an extremely narrow region of the parameter space, which contrasts with intuition and available data. in this paper we propose a simple susceptible-infected-susceptible epidemic model on an adaptive network with time-delayed rewiring, and show that oscillatory solutions are now present in a wide region of the parameter space. altering the transmission or rewiring rates reveals the presence of an endemic bubble: an enclosed region of the parameter space where oscillations are observed.
the spread of an infectious disease changes the behavior of individuals, and this, in turn, affects the spread of the disease [1]. broadly speaking, responses to an epidemic fall into two categories: coordinated and uncoordinated.
coordinated responses include vaccination and quarantine schemes, travel restrictions, and information spread through mass media. uncoordinated responses cover individuals adapting their behavior based on their own perceived risk; this includes improved hygiene regimens and avoiding crowded places and public transport during outbreaks. surveys consistently identify such precautionary measures taken by individuals during epidemic outbreaks [2, 3]. fear of becoming infected during the 2003 sars epidemic in hong kong caused huge behavioral shifts; air travel into hong kong dropped by as much as 80% [4]. responses to a large study covering numerous european and asian regions revealed that, in the event of an influenza pandemic, 75% of people would avoid public transport, and 20-30% would try to stay indoors [5]. these behavioral shifts change the potential routes for transmission and can alter the size and time scale of an epidemic [6]. in the context of epidemic models on networks, perhaps the most widespread approach to couple epidemics and behavior is by using adaptive networks, where behavioral changes are captured by link rewiring based on the disease status of nodes [6, 7]. gross et al. [8, 9] considered a simple susceptible-infected-susceptible (sis) model with rewiring, in which susceptible nodes disconnect from infected neighbors at rate ω, and immediately reconnect to a randomly chosen susceptible node. this simple model led to bistability and to oscillatory solutions, albeit with oscillations limited to an extremely narrow region of the parameter space. this rewiring procedure has since been extended to consider scenarios where both the susceptible and infected nodes can rewire, and diseases with a latent period [10]. zhang et al. [11] presented a further alternative, where news about past prevalence influences whether nodes choose to disconnect edges.
the authors found an estimate of the critical delay that induces a hopf bifurcation, thus causing periodicity. tunc et al. [12] studied a network model with temporary deactivation of edges between susceptible and infected individuals. on a growing network, zhou et al. [13] showed that cutting links between susceptible and infected individuals can lead to epidemic reemergence, with long periods of low disease prevalence punctuated by large outbreaks. periodic cycles and disease reemergence are evident in real-world data. many diseases are subject to seasonal peaks, which have been studied extensively [14, 15]. often a sinusoidal or other form of time-varying transmission parameter is used to imitate seasonality, which can lead to multiennial peaks [16]. a number of models have identified other possible causes of periodicity in epidemic dynamics. to give one example, hethcote et al. [17] showed that in a well-mixed population temporary immunity, represented by a time delay in susceptible-infected-recovered-susceptible (sirs)- or susceptible-exposed-infected-recovered-susceptible (seirs)-type models, can result in the emergence of periodic solutions when the immunity period exceeds some critical value. one should note that seasonality alone cannot explain all cases of oscillations. in both the united kingdom and the united states, the 2009 h1n1 pandemic occurred in two distinct waves separated by a few months [18, 19]. other diseases have shown more long-term trends. incidence reports of mycoplasma pneumonia have found evidence of epidemic cycles in many different countries, with periodicity of three to five years [20, 21]. recently, it has been suggested that syphilis exhibits periodic cycling [22], although these findings have been subsequently questioned [23].
while it is difficult to pinpoint the specific causes of periodicity in the dynamics of these diseases, if syphilis epidemics are indeed cyclical, then changes in human behavior have been proposed as the likely explanation [24]. intuitively, and as shown by empirical observations, one would expect oscillations to appear in epidemic models where behavior is considered. if an individual is aware of the state of their neighbors and responds accordingly, then times of high prevalence will be associated with greater caution, curbing further spread. conversely, without advance warning, behavior will return to normal as prevalence wanes, enabling a second wave of the epidemic. despite this intuition, adaptive network models have so far not been able to show such robust oscillations over reasonable regions of the parameter space. to tackle this problem, we introduce a simple sis model on an adaptive network with n nodes. infected nodes transmit the disease to susceptible neighbors at rate β across links, and recover and become susceptible again at rate γ, independently of the network. susceptible nodes cut links that connect them to infected neighbors at rate ω and, after a fixed time delay of length τ, reconnect to susceptible nodes chosen uniformly at random from all such available nodes. the delay between cutting and reconnecting is crucial. it is unrealistic to expect that alternative contacts can be identified and established arbitrarily quickly. the delay represents both people's hesitance to make new contacts and also the potential lack of availability of such new contacts when an epidemic is spreading through a population [5]. to construct the mean-field model, we use the pairwise approximation method [25]. the number of nodes in the susceptible or infected state at time t is denoted by [s] and [i], respectively; [ss], [si], and [ii] denote the number of connected pairs of nodes in the respective states, with all pairs being doubly counted.
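before continuing with the mean-field construction, the stochastic rules above (transmission at rate β, recovery at rate γ, link cutting at rate ω, redrawing after a fixed delay τ) can be made concrete with a minimal discrete-time simulation sketch. this is not the authors' code; the network size, parameter values, and euler-style time discretization are illustrative assumptions.

```python
import random

def simulate_sis_delayed_rewiring(n=200, k_mean=10, beta=0.6, gamma=1.0,
                                  omega=1.0, tau=6.0, dt=0.01, t_max=10.0,
                                  seed=1):
    """Discrete-time sketch of SIS dynamics with time-delayed rewiring:
    transmission at rate beta across S-I links, recovery at rate gamma,
    link cutting by susceptible nodes at rate omega; each cut link is
    redrawn to a randomly chosen susceptible node after a fixed delay tau."""
    rng = random.Random(seed)
    p = k_mean / (n - 1)  # Erdos-Renyi density giving mean degree k_mean
    edges = {(i, j) for i in range(n) for j in range(i + 1, n)
             if rng.random() < p}
    infected = {rng.randrange(n)}        # one initial infective
    pending = []                         # (time to redraw, rewiring node)
    prevalence = []
    for step in range(round(t_max / dt)):
        t = step * dt
        new_inf, new_rec = set(), set()
        for (a, b) in list(edges):
            if (a in infected) == (b in infected):
                continue                 # not an S-I link
            s = a if b in infected else b
            if rng.random() < beta * dt:
                new_inf.add(s)           # transmission along the link
            elif rng.random() < omega * dt:
                edges.discard((a, b))    # susceptible end cuts the link...
                pending.append((t + tau, s))  # ...and redraws it at t + tau
        for v in infected:
            if rng.random() < gamma * dt:
                new_rec.add(v)
        infected |= new_inf
        infected -= new_rec
        still = []
        for (t_redraw, s) in pending:
            if t_redraw <= t:            # delay elapsed: reconnect to a
                pool = [v for v in range(n) if v not in infected and v != s]
                if pool:                 # randomly chosen susceptible node
                    v = rng.choice(pool)
                    edges.add((min(s, v), max(s, v)))
            else:
                still.append((t_redraw, s))
        pending = still
        prevalence.append(len(infected) / n)
    return prevalence

prev = simulate_sis_delayed_rewiring()
```

note that, as in the model, the rewiring node may itself become infected during the delay, in which case the redrawn edge is of i-s rather than s-s type; this is the complication the cohort argument below addresses.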
the explicit dependence on time is dropped for simplicity. for the moment closure approximation we use the assumption that once a node is fixed, typically a susceptible node, then the states of the neighbors are poisson distributed [26]. this leads to the closure [xsy] ≈ [xs][sy]/[s] to express the number of connected triples [8, 25]. the delay before an s-i edge is rewired to an s-s edge introduces a complication, as not all newly formed edges will be between two susceptible nodes. to see this, consider an example of a susceptible node with two or more infected neighbors. at some time t_1 it disconnects from one of these neighbors. then, in the interval (t_1, t_1 + τ) another infected neighbor transmits the disease to it. if it then remains infected until time t_1 + τ, the new edge will be of an i-s type rather than s-s. to deal with this issue we use a technique similar to that used by kiss et al. [27] for a pairwise model with an infectious period of fixed length. consider y_p(t) to be the cohort of susceptible nodes that have cut a link at time t − τ and are waiting to reconnect. the expected number of infected neighbors a susceptible node has is approximated by [si]/[s]. therefore, the rate at which nodes in the cohort become infected over the interval (t − τ, t) is ẏ_p(t) = −β([si](t)/[s](t)) y_p(t). the solution to this ordinary differential equation is y_p(t) = y_p(t − τ) exp(−β ∫_{t−τ}^{t} [si](u)/[s](u) du). a member of the cohort infected at some time u ∈ (t − τ, t) may recover before time t. to ensure that we only consider nodes which remain infected, we must include the probability that a node infected at time u remains infected until time t in the integral term of (2); this is the survival probability of the recovery process, and it is given by e^{−γ(t−u)}. combining these terms gives the rate at which new s-s edges are formed in (3). if the exponential term in (3), exp(−β ∫_{t−τ}^{t} [si](u)/[s](u) du), is denoted by x(t), the corresponding rate at which new i-s edges are formed follows, and with this in mind the mean-field model is given by system (4). fig. 1 (caption): comparison between the solution of (4) and numerical simulation.
three sets of results are shown: ω = 0 (top), ω = 1 (middle), and ω = 1.4 (bottom). other parameters are β = 0.6, γ = 1, τ = 6, and k = 10. simulation results are averaged across 100 iterations on random networks of 1000 nodes. all simulations begin by randomly selecting a node to infect at time t = 0. simulation runs which die out are discarded and performed again. when τ = 0, the dynamics of (4) are equivalent to the well-known model of gross et al. [8]. figure 1 shows a comparison between the solution of the new model (4) and numerical simulation. the agreement is excellent despite the simplicity of the model and the fact that the moment closures do not reflect the changing network structure. in particular, both the solution and simulation results exhibit similar oscillatory behavior for the same parameter values. these results validate the model and allow us to analyze its behavior. first, consider the basic reproductive ratio, r_0, defined as the expected number of secondary infections caused by a single typical infectious node in an otherwise wholly susceptible population. one can find r_0 for the delayed rewiring model (4) by linearizing around the disease-free equilibrium (dfe). performing this analysis shows that increasing the rewiring rate decreases the epidemic threshold r_0, but that the length of the delay, τ, has no effect on the threshold. however, as we will show later, it does affect the final outcome of the epidemic. system (4) also has an endemic steady state, but its value is determined by a transcendental equation which can only be solved numerically. using this result in the numerical linear stability analysis of (4) allows us to analyze the stability of the endemic equilibrium. as shown in fig. 2(a), changes to both τ and ω are capable of destabilizing the endemic equilibrium. regardless of the value of τ, eventually high values of the rewiring rate make the dfe stable again.
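since system (4) is a set of delay differential equations, both the endemic steady state and its stability have to be computed numerically. the basic numerical device, a forward-euler scheme that reads delayed values from a stored history buffer (the "method of steps"), can be sketched on a scalar test equation; this is a generic illustration, not the solver used by the authors.

```python
def euler_dde(rhs, history, tau, dt, t_max):
    """Integrate x'(t) = rhs(x(t), x(t - tau)) by forward Euler.
    `history` supplies x on [-tau, 0]; delayed values come from a buffer."""
    n_lag = round(tau / dt)
    # buffer[-1] is x(t); buffer[-1 - n_lag] is x(t - tau)
    buffer = [history(-tau + i * dt) for i in range(n_lag + 1)]
    for _ in range(round(t_max / dt)):
        x_now, x_lag = buffer[-1], buffer[-1 - n_lag]
        buffer.append(x_now + dt * rhs(x_now, x_lag))
    return buffer[n_lag:]  # x sampled on [0, t_max]

# test problem: x'(t) = -x(t - 1) with x(t) = 1 for t <= 0.
# the method of steps gives x(t) = 1 - t on [0, 1] and
# x(t) = 1 - t + (t - 1)**2 / 2 on [1, 2], so x(1) = 0 and x(2) = -0.5.
sol = euler_dde(lambda x, x_lag: -x_lag, lambda t: 1.0,
                tau=1.0, dt=0.001, t_max=2.0)
```

the same buffer idea extends componentwise to a vector-valued system such as (4), with the integral term approximated by a quadrature over the stored history.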
for most values of τ this coincides with the point where the endemic steady state becomes biologically infeasible (less than or equal to zero), leaving the dfe as the only plausible steady state for the system. however, for sufficiently small values of τ, the endemic steady state remains feasible, and there is a small region of bistability. qualitatively, this behavior is the same for any choice of the other parameters, as long as the endemic steady state remains biologically feasible, as illustrated for different values of β in fig. 3. fig. 3 (caption): other parameters are the same as in fig. 2(a); the endemic equilibrium is unstable in the red/yellow region, stable in the green/blue region, and biologically infeasible in the white region. this figure shows that increasing the disease transmission rate allows the endemic steady state to be feasible for a wider range of link-cutting rates ω, and it also lowers the critical time delay τ at which this steady state becomes unstable. figure 2(b) shows the endemic equilibrium, as well as the minima and maxima of oscillations for a range of β and ω values, with oscillations being observed in a significant part of the parameter space. one can clearly see the formation of an endemic bubble that has been discovered earlier in other epidemic models [28, 29]. interestingly, both ω and β appear to play similar roles in the formation of the endemic bubble, namely, they open it through a supercritical hopf bifurcation of the endemic equilibrium and then close it through a subcritical hopf bifurcation. increasing the length of the delay can only induce a supercritical hopf bifurcation, resulting in the emergence of stable oscillations, beyond which point larger values of τ only increase the amplitude of oscillations until it settles on some steady level, as shown in fig. 2(c).
one should note that the minima of oscillations get closer to zero for larger τ, suggesting that for large rewiring times there are periods of time with negligible disease prevalence, followed by major outbreaks, as illustrated in fig. 2(d). in the limit τ → ∞, disconnected edges are never redrawn and the epidemic dies out, partially due to the network becoming sparser. for the case without time delay, gross et al. [8] found bistability in a large region of the parameter space, and periodic oscillations in a much smaller region. by contrast, results shown in fig. 2 demonstrate a large region in the parameter space with oscillatory behavior. delay differential equations (ddes) are known to often produce oscillatory dynamics, and bubbles similar to those shown in fig. 2(b) have been reported in other biological and epidemic models [28, 29]. let us now discuss the origins of oscillatory behavior in our model. the delay between disconnecting an edge and drawing a new one means that the total number of edges, and thus also the mean degree, is not constant. whenever a susceptible node chooses to rewire, the total number of edges in the network decreases by two (since all edges are bidirectional) until time τ passes, and the edge is redrawn. the mean degree k(t) at any time t can be calculated directly from this argument, as the initial mean degree minus the per-node count of edges cut during (t − τ, t) that have not yet been redrawn. figure 2(d) shows that oscillations are driven by the dynamics of k(t). during the early stages of an outbreak with a high rewiring rate, k(t) falls rapidly, as susceptible nodes cut links in response to the propagation of the disease. if the value of τ is large enough, then after a certain time the number of edges in the network is small enough to effectively starve the disease of transmission routes, and prevalence falls. these edges are then redrawn at the same rate as they were cut τ time ago, and k(t) grows, which allows the disease to spread again.
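this bookkeeping argument can be illustrated with a toy deterministic sketch (an assumption-laden caricature, not the paper's system (4)): cut edges are held in a queue and returned exactly τ later, so the mean degree equals its initial value minus the per-node number of currently pending edges. all parameter values below are arbitrary.

```python
from collections import deque

def toy_edge_bookkeeping(beta=0.12, gamma=1.0, omega=0.06, tau=6.0,
                         k0=10.0, dt=0.01, t_max=60.0, i0=0.01):
    """Toy mean-field sketch: i is the infected fraction and k the mean
    degree. S-I contacts are cut at rate omega; every cut edge is restored
    exactly tau later, so k(t) = k0 - (edges currently pending, per node)."""
    n_lag = round(tau / dt)
    i, k = i0, k0
    pending = deque([0.0] * n_lag)   # per-node edges cut in each past step
    traj = []
    for _ in range(round(t_max / dt)):
        si = k * i * (1.0 - i)               # per-node density of S-I contacts
        cut = omega * si * dt                # edges cut during this step
        k += pending.popleft() - cut         # edges cut tau ago come back
        pending.append(cut)
        i = min(max(i + (beta * si - gamma * i) * dt, 0.0), 1.0)
        traj.append((i, k))
    return traj

traj = toy_edge_bookkeeping()
```

by construction k never exceeds k0 and dips by exactly the volume of edges awaiting redrawing; lengthening tau widens that window and deepens the dip, which is the mechanism described above.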
figure 2(d) illustrates this behavior both in simulation and in the mean-field model (4), showing how after the initial outbreak each new wave of infection is preceded by the recovery of network connectivity. the effect of oscillatory interactions between network connectivity and the propagating disease may be more pronounced in network simulations. gross et al. [8] found that adaptive rewiring without delay can lead to the formation of highly connected clusters of susceptible nodes that are vulnerable to disease once any one node becomes infected. since the model (4) does not account for changes in network structure, i.e., the closure is the same for all times and it does not depend on the average degree or degree distribution, this can potentially explain the small discrepancy between the solutions of the deterministic and simulation models observed in fig. 1. to get a better understanding of the interplay between network topology and dynamics, it is worth looking at how delayed rewiring alters the degree distribution. time snapshots of several large networks in fig. 4 show the evolution of the degree distribution at various key points of an epidemic in an oscillatory regime. the initial network topology (black solid lines) is quickly reorganized to a peaked distribution. the oscillations in prevalence cause slight but repeated changes in the degree distribution. unsurprisingly, when prevalence is at or near its peak, nodes with a lower degree are more common. when the prevalence falls, the distribution curves shift to the right, and the shape of the distribution flattens slightly. when the endemic steady state is stable, the degree distribution stabilizes to a peaked distribution between the two extremes of the oscillatory regime. a very important observation is that, irrespective of the initial network topology, due to rewiring different networks eventually settle on a very similar skewed degree distribution.
this implies that earlier conclusions derived for the specific closure (1) appropriate for erdős-rényi graphs are actually applicable to modeling long-term dynamics of different types of networks, for which the influence of the initial topology is low since a significant amount of rewiring has already taken place. the particular strength of this model lies in its ability to exhibit rich behavior from a simple system of ddes. time delay captures the fact that finding alternative contacts takes time, and also during an epidemic many people try to temporarily reduce the number of their contacts. such behavior can be modeled using this delayed rewiring process. previous work separated the processes of edge destruction and creation, and with edge creation occurring at a fixed rate the number of edges in the network was bounded only by the network size [24, 30]. in the model presented above, edge creation is reduced to replenishing global network connectivity towards its original level. therefore, this model is fundamentally different from earlier models, even when parameters are matched. during the initial growth phase it is the rate at which potential transmission is avoided by cutting a link, not the delay before drawing a new edge, that determines whether a major outbreak will occur. although the delay does not affect the basic reproductive ratio r_0, it does impact the outcome of the epidemic [see fig. 2(c)]. the result of introducing the delay is that oscillations occur in a large region of the parameter space. this happens due to the interplay between the spread of the disease and the behavioral changes in response to the epidemic. when the length of the delay is significant, the network becomes more sparse, healthy individuals are at lower risk of infection, and over time the prevalence falls. when the new edges are then formed, the disease is once again able to spread, and the cycle repeats.
understanding the nature and cause of oscillations may provide opportunities to eradicate the disease. for example, if public awareness campaigns can lead to an increase in the length of the delay, the prevalence of the disease will naturally fall close to zero, at which time a relatively minor intervention, such as quarantining those who remain infected, may be enough to eradicate the disease from the population entirely. currently, the model assumes that only susceptible nodes rewire. however, in reality, infected nodes are also likely to change their behavior. risau-gusman and zanette [10] considered a model of rewiring where infected nodes rewire with a given probability. it would be of great value to examine a similar situation under delayed rewiring, with the time delay representing the time for which infected nodes partially isolate themselves before rewiring, in accordance with advice given by public health authorities. this would alter the nature of the variable x(t) in the model. for example, if only infected nodes rewire, x(t) ≈ e^{−γτ}. preliminary tests of this rewiring scheme show behavior similar to the present model. numerical simulations have shown that a similar oscillatory behavior is observed for other initial network topologies, including scale-free networks. furthermore, since rewiring nodes choose their new neighbors uniformly at random from all available susceptible nodes, the initial network topology itself is transient, as shown in fig. 4, and, as a result, over time our model becomes more relevant. future work will look at how the degree distribution and oscillations are affected in the case when the network links are rewired not randomly but according to a preferential attachment or some fitness-based rule. this could result in some interesting new dynamics due to the competition between the increased probability of highly connected nodes receiving new links, and the increased probability of infection.
proc. natl. acad. sci.
usa
correlation equations and pair approximations for spatial ecologies
the authors are grateful to anonymous reviewers for their helpful comments and suggestions. n.s. acknowledges funding for his ph.d. studies from the engineering and physical sciences research council, grant no. ep/m506667/1, and the university of sussex.
key: cord-102588-vpu5w9wh authors: le, trang t.; moore, jason h. title: treeheatr: an r package for interpretable decision tree visualizations date: 2020-07-10 journal: biorxiv doi: 10.1101/2020.07.10.196352 sha: doc_id: 102588 cord_uid: vpu5w9wh
summary: treeheatr is an r package for creating interpretable decision tree visualizations with the data represented as a heatmap at the tree's leaf nodes. the integrated presentation of the tree structure along with an overview of the data efficiently illustrates how the tree nodes split up the feature space and how well the tree model performs. this visualization can also be examined in depth to uncover the correlation structure in the data and importance of each feature in predicting the outcome. implemented in an easily installed package with a detailed vignette, treeheatr can be a useful teaching tool to enhance students' understanding of a simple decision tree model before diving into more complex tree-based machine learning methods.
availability: the treeheatr package is freely available under the permissive mit license at https://trang1618.github.io/treeheatr and https://cran.r-project.org/package=treeheatr. it comes with a detailed vignette that is automatically built with github actions continuous integration.
contact: ttle@pennmedicine.upenn.edu
decision tree models comprise a set of machine learning algorithms widely used for predicting an outcome from a set of predictors or features. for specific problems, a single decision tree can provide predictions at desirable accuracy while remaining easy to understand and interpret (yan et al., 2020).
these models are also important building blocks of more complex tree-based structures such as random forests and gradient boosted trees. the simplicity of decision tree models allows for clear visualizations that can be incorporated with rich additional information such as the feature space. however, existing software frequently treats all nodes in a decision tree similarly, leaving limited options for improving information presentation at the leaf nodes. specifically, the r library rpart.plot displays at each node its characteristics including the number of observations falling in that node, the proportion of those observations in each class, and the node's majority vote. despite being potentially helpful, these statistics may not immediately convey important information about the tree such as its overall performance. the function vistree() from the r package visnetwork draws trees that are aesthetically pleasing but lack general information about the data and are difficult to interpret. the state-of-the-art python library dtreeviz produces decision trees with detailed histograms at inner nodes but still draws pie charts of the different classes at leaf nodes. ggparty is a flexible r package that allows the user to have full control of the representation of each node. however, this library fixes the leaf node widths, which limits its ability to show more collective visualizations. we have developed the treeheatr package to incorporate the functionality of ggparty but also utilize the leaf node space to display the data as a heatmap, a popular visualization that uncovers groups of samples and features in a dataset (wilkinson and friendly, 2009; galili et al., 2018). a heatmap also displays a useful general view of the dataset, e.g., how large it is or whether it contains any outliers. integrated with a decision tree, the samples in each leaf node are ordered based on an efficient seriation method.
after simple installation, the user can apply treeheatr on their classification or regression tree with a single function call: heat_tree(x, target_lab = 'outcome'). this one line of code will produce a decision tree-heatmap as a ggplot object that can be viewed in rstudio's viewer pane, saved to a graphic file, or embedded in an rmarkdown document. this example assumes a classification problem, but one can also apply treeheatr on a regression problem by setting task = 'regression'. this article is organized as follows. in section 2, we present an example treeheatr application by employing its functions on a real-world clinical dataset from a study of covid-19 patient outcome in wuhan, china (yan et al., 2020). in section 3, we describe in detail the important functions and corresponding arguments in treeheatr. we demonstrate the flexibility the user has in tweaking these arguments to enhance understanding of the tree-based models applied on their dataset. finally, we discuss general guidelines for creating effective decision tree-heatmap visualizations. this example visualizes the conditional inference tree model built to predict whether or not a patient survived covid-19 in wuhan, china (yan et al., 2020). the dataset contains blood samples of 351 patients admitted to tongji hospital between january 10 and february 18, 2020. three features were selected based on their importance scores from a multi-tree xgboost model: lactic dehydrogenase (ldh), lymphocyte levels and high-sensitivity c-reactive protein (hs_crp). detailed characteristics of the samples can be found in the original publication (yan et al., 2020). the following lines of code compute and visualize the conditional decision tree along with the heatmap containing features that are important for constructing this model ( fig.
1) : the heat_tree() function takes a party or partynode object representing the decision tree and other optional arguments such as the outcome label mapping. if instead of a tree object, x is a data.frame representing a dataset, heat_tree() automatically computes a conditional tree for visualization, given that an argument specifying the column name associated with the phenotype/outcome, target_lab, is provided. in the decision tree, the leaf nodes are labeled based on their majority votes and colored to correlate with the true outcome. on the right split of hs_crp (hs_crp ≤ 52.5 and hs_crp > 52.5), although individuals of both branches are all predicted to survive by majority voting, the leaf nodes have different purity, indicating different confidence levels the model has in classifying samples in the two nodes. these seemingly non-beneficial splits present an opportunity to teach machine learning novices the different measures of node impurity such as the gini index or cross-entropy (hastie et al., 2009 ). in the heatmap, each (very thin) column is a sample, and each row represents a feature or the outcome. for a specific feature, the color shows the relative value of a sample compared to the rest of the group on that feature; higher values are associated with lighter colors. within the heatmap, similar color patterns between ldh and hs_crp suggest a positive correlation between these two features, which is expected because they are both systemic inflammation markers. together, the tree and heatmap give us an approximation of the proportion of samples per leaf and the model's confidence in its classification of samples in each leaf. three main blocks of different lymphocyte levels in the heatmap illustrate its importance as a determining factor in predicting patient outcome. when this value is below 12.7 but larger than 5.5 (observations with dark green lymphocyte value), hs_crp helps further distinguish the group that survived from the other. 
here, if we focus on the hs_crp > 35.5 branch, we notice that the corresponding hs_crp colors range from light green to yellow (> 0.5), illustrating that the individuals in this branch have higher hs_crp than the median of the group. this connection is immediate with the two components visualized together but would not have been possible with the tree model alone. in summary, the tree and heatmap integration provides a comprehensive view of the data along with key characteristics of the decision tree. when the first argument x is a data.frame object representing the dataset instead of the decision tree, treeheatr automatically computes a conditional tree with default parameters for visualization. conditional decision trees (hothorn et al., 2006) are nonparametric models performing recursive binary partitioning with a well-defined theoretical background. conditional trees support unbiased selection among covariates and produce competitive prediction accuracy for many problems (hothorn et al., 2006). the default parameter setting often results in smaller trees that are less prone to overfitting. treeheatr utilizes the partykit r package to fit the conditional tree and the ggparty r package to compute its edge and node information. while ggparty assumes fixed leaf node widths, treeheatr employs a flexible node layout to accommodate the different numbers of samples shown in the heatmap at each leaf node. this new node layout structure supports various leaf node widths, prevents crossings of different tree branches, and generalizes as the trees grow in size. the layout weighs the x-coordinate of the parent node according to the levels of the child nodes in order to avoid branch crossing. this relative weight can be adjusted with the lev_fac parameter in heat_tree(). lev_fac = 1 sets the parent node's x-coordinate perfectly in the middle of those of its child nodes. the default lev_fac = 1.3 seems to provide an optimal node layout independent of the tree size.
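to illustrate the idea of level-weighted parent placement, here is a toy python sketch. this is only one plausible reading of the lev_fac description above, not treeheatr's actual formula: each child is weighted by lev_fac raised to its subtree depth, so lev_fac = 1 reduces to the plain midpoint of the child positions, while larger values pull the parent toward deeper branches.

```python
def parent_x(children, lev_fac=1.3):
    """hypothetical x-placement of a parent node.

    children: list of (x, depth) pairs, one per child subtree.
    with lev_fac = 1 every child gets equal weight, so the parent
    lands at the plain average of the child x-coordinates.
    """
    weights = [lev_fac ** depth for _, depth in children]
    return sum(w * x for w, (x, _) in zip(weights, children)) / sum(weights)
```

for two children at equal depth this is the exact midpoint regardless of lev_fac; unequal depths shift the parent toward the deeper (typically wider) branch, which is one way to reduce branch crossings.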
the user can define a customized layout for a specific set of nodes and combine that layout with the automatic layout for the remaining nodes. by default, heatmap samples (columns) are automatically reordered within each leaf node with a seriation method (hahsler et al., 2008) that uses all features and the outcome label, unless clust_target = false. treeheatr uses the daisy() function in the cluster r package with the gower metric (gower, 1971) to compute the dissimilarity matrix of a dataset that may have both continuous and nominal categorical feature types. heatmap features (rows) are ordered in a similar manner. we note that, while there is no definitive guideline for proper weighting of features of different types, the goal of the seriation step is to reduce the amount of stochasticity in the heatmap and not to make precise inferences about each grouping. in a visualization, it is difficult to strike the balance between enhancing understanding and overloading information. we believe showing a heatmap at the leaf node space provides additional information about the data in an elegant way that is not overwhelming and may even simplify the model's interpretation. we leave it to the user to decide what type of information to display at the inner nodes via different geom objects (e.g., geom_node_plot, geom_edge_label, etc.) in the ggparty package. for example, one may choose to show at these decision nodes the distribution of the features or their corresponding bonferroni-adjusted p values computed in the conditional tree algorithm (hothorn et al., 2006). striving for simplicity, treeheatr utilizes direct labeling to avoid unnecessary legends. for example, in classification, the leaf node labels have colors corresponding to the different classes, e.g., purple for deceased and green for survived in the covid-19 dataset (fig. 1).
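the gower metric used in the seriation step above handles mixed feature types by averaging per-feature dissimilarities: range-normalized absolute differences for continuous features and simple mismatch (0/1) for nominal ones. a minimal python sketch of that idea, not the cluster package's actual daisy() implementation (the function and argument names here are ours):

```python
def gower_distance(a, b, ranges, is_nominal):
    """average per-feature dissimilarity between two records a and b.

    ranges: max - min of each continuous feature over the whole dataset
            (entries for nominal features are ignored)
    is_nominal: flags marking which features are categorical
    """
    total = 0.0
    for x, y, r, nominal in zip(a, b, ranges, is_nominal):
        if nominal:
            total += 0.0 if x == y else 1.0  # simple matching for categories
        else:
            total += abs(x - y) / r          # range-normalized difference
    return total / len(a)
```

because every per-feature term lies in [0, 1], continuous and categorical features contribute on a comparable scale, which is what makes the metric suitable for mixed-type clinical data like the covid-19 example.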
as for feature values, by default, the color scale ranges from 0 to 1 and indicates the relative value of a sample compared to the rest of the group on each feature. linking the color values of a particular feature to the corresponding edge labels can reveal additional information that is not available with the decision tree alone. in addition to the main dataset, the user can supply to heat_tree() a validation dataset via the data_test argument. as a result, heat_tree() will train the conditional tree on the original training dataset, draw the decision tree-heatmap on the testing dataset, and, if desired, print next to the tree its performance on the test set according to specified metrics (e.g., balanced accuracy for classification or root mean squared error for regression problems). the integration of the heatmap nicely complements the current techniques of visualizing decision trees. node purity, a metric measuring the tree's performance, can be visualized from the distribution of true outcome labels at each leaf node in the first row. comparing these values with the leaf node labels gives a visual estimate of how accurate the tree predictions are. further, without explicitly choosing two features to show in a 2-d scatter plot, we can infer correlation structures among features in the heatmap. the additional seriation may also reveal sub-structures within a leaf node. in this paper, we presented a new type of integrated visualization of decision trees and heatmaps, which provides a comprehensive data overview as well as model interpretation. we demonstrated that this integration uncovers meaningful patterns among the predictive features and highlights the important elements of decision trees including feature splits and several leaf node characteristics such as prediction value, impurity and the number of leaf samples.
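for reference, the balanced accuracy mentioned above is simply the mean of the per-class recalls, so every class contributes equally regardless of its size. a small python sketch of the metric (our own helper, not part of treeheatr):

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """mean per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)
```

on an imbalanced outcome such as survival, this penalizes a model that simply predicts the majority class, which plain accuracy would not.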
its detailed vignette makes treeheatr a useful teaching tool to enhance students' understanding of this fundamental model before diving into more complex tree-based machine learning methods. treeheatr is scalable to large datasets. for example, heat_tree() runtime on the waveform dataset with 5000 observations and 40 features was approximately 80 seconds on a machine with a 2.2 ghz intel core i7 processor and 8 gb of ram. however, as with other visualization tools, the tree's interpretation becomes more difficult as the feature space expands. thus, for high-dimensional datasets, it is potentially beneficial to perform feature selection to reduce the number of features, or random sampling to reduce the number of observations, prior to plotting the tree. moreover, when the single tree does not perform well and the average node purity is low, it can be challenging to interpret the heatmap because a clear signal cannot emerge if the features have low predictability. future work on treeheatr includes enhancements such as support for left-to-right orientation and highlighting the tree branches that point to a specific sample. we will also investigate other data preprocessing and seriation options that might result in more robust models and informative visualizations.
references:
heatmaply: an r package for creating interactive cluster heatmaps for online publishing
a general coefficient of similarity and some of its properties
getting things in order: an introduction to the r package seriation
the elements of statistical learning: data mining, inference, and prediction, 2nd ed.
unbiased recursive partitioning: a conditional inference framework
partykit: a modular toolkit for recursive partytioning in r
ggplot2: elegant graphics for data analysis
the history of the cluster heat map
an interpretable mortality prediction model for covid-19 patients

the treeheatr package was made possible by leveraging integral r packages including ggplot2 (wickham, 2009), partykit (hothorn and zeileis, 2015), ggparty, heatmaply (galili et al., 2018) and many others. we would also like to thank daniel himmelstein for his helpful comments on the package's licensing and continuous integration configuration. finally, we thank two anonymous reviewers whose helpful feedback helped improve the package and clarify this manuscript. this work has been supported by the national institutes of health grant nos. lm010098 and ai116794.

key: cord-010727-fiukemh3 authors: holme, petter title: three faces of node importance in network epidemiology: exact results for small graphs date: 2017-12-05 journal: phys rev e doi: 10.1103/physreve.96.062305 sha: doc_id: 10727 cord_uid: fiukemh3

we investigate three aspects of the importance of nodes with respect to susceptible-infectious-removed (sir) disease dynamics: influence maximization (the expected outbreak size given a set of seed nodes), the effect of vaccination (how much deleting nodes would reduce the expected outbreak size), and sentinel surveillance (how early an outbreak could be detected with sensors at a set of nodes). we calculate the exact expressions of these quantities, as functions of the sir parameters, for all connected graphs of three to seven nodes.
we obtain the smallest graphs where the optimal node sets are not overlapping. we find that (i) node separation is more important than centrality for more than one active node, (ii) vaccination and influence maximization are the most different aspects of importance, and (iii) the three aspects are more similar when the infection rate is low. one of the central questions in theoretical epidemiology [1-3] is how to identify individuals that are important for an infection to spread [4, 5]. what "important" means depends on the particular scenario-what kind of disease spreads and what can be done about it. in the literature, three major aspects of importance have been discussed. first, influence maximization is aimed at identifying the nodes that, if they are sources of the outbreak, would maximize the expected outbreak size (the number of nodes infected at least once) [6, 7]. second, vaccination is aimed at finding the nodes that, if vaccinated (or, in practice, deleted from the network), would reduce the expected outbreak size the most [5]. third, sentinel surveillance is aimed at finding the nodes that are likely to get infected early [8, 9]. these three notions of importance do not necessarily yield the same answer as to which node is most important. in this work, we investigate how the ranking of important nodes for these three aspects differs and why (see fig. 1). in this paper, we evaluate the three aspects of importance with respect to the susceptible-infectious-removed (sir) disease-spreading model [1-3,10] on small connected graphs (all connected graphs from three up to seven nodes). the main reason we restrict ourselves to small graphs is that it allows us to use symbolic algebra, and thus exact calculations [11]. in this way we can discover, e.g., the smallest graph where the three aspects of importance disagree about which node is most important; cf. ref. [12].
we argue that graphs of seven nodes are still large enough to illustrate the effects of distance. nevertheless, large networks are important to study. a possible future extension of this work will be to address the relationship between the three importance measures for larger networks. in the related ref. [13], the difference between influence maximization and vaccination problems on (some rather large) empirical networks is studied. the authors compare the top results of heuristic algorithms to identify influential single nodes, whereas in this paper we will consider the influence of all nodes, and also all sets of two and three nodes. (the terminology of ref. [13] is a bit different from ours-they call important nodes for vaccination "blockers" and important nodes for influence maximization "spreaders".) (* holme@cns.pi.titech.ac.jp) we will proceed by discussing our setup in greater detail: our implementation of the sir model, how to analyze the three aspects of importance, the network centrality measures that we need for our analysis, and our results, including the smallest networks where different nodes come out as most important. in this section, we provide the background to our analysis. the basis of our analysis is graphs g(v,e) consisting of n nodes v and m links e. as mentioned earlier, there are three ways to think of importance in theoretical infectious disease epidemiology. influence maximization was first studied in computer science with viral marketing in mind [6, 7]. as was mentioned, a node is important for influence maximization if it is a seed of an infection that could cause a large outbreak. for epidemiological applications, therefore, it might be interesting if one could immunize people against a disease before an outbreak happens. we will simply measure the expected outbreak size (the expected number of nodes to catch the disease) with s as the set of source nodes, and we will rank the sets of nodes according to this quantity.
for vaccination, we will use the average outbreak size from one random seed node to estimate the importance of a node [1, 2, 14, 15]. one could rephrase it as a cost problem [16]. we assume the vaccinees are deleted from the network before the outbreak starts. the node whose deletion gives the smallest expected outbreak size is the one that is most important for the vaccination problem. sentinel surveillance assumes a response after the outbreak has already started (compared to influence maximization and vaccination, where the action affecting the nodes in question is assumed to take place before the outbreak happens). a node is important for sentinel surveillance if it gets infected early, so that the health authorities can activate their countermeasures. this is usually determined by the lead time-the expected difference between the time a sentinel node gets infected, or the outbreak dies out, and the infection time of any node in the graph [8]. we will instead measure the average discovery time τ(i) from the beginning of the infection until a node i gets infected or the outbreak dies [9]. the node with the smallest discovery time is then considered most important for sentinel surveillance.

fig. 1: illustration of the three different notions of importance we explore in this work. panel (a) shows an example of an sir outbreak in a seven-node network. panels (b)-(d) show how this outbreak bears on influence maximization (b), vaccination (c), and sentinel surveillance (d), respectively. the idea of influence maximization (b) is that a node is important if the outbreak originating at it is expected to be large. the idea of vaccination (c) is that a node is important if removing it would significantly reduce the average outbreak size. the idea of sentinel surveillance (d) is that a node is important if a sensor on it would detect the outbreak early. the shades of the nodes in (c) and (d) are proportional to their contribution. in a stochastic simulation, one would average the values over many runs and, for (c) and (d), many seeds of the outbreak. in this work, however, rather than running simulations, we calculate the exact expectation values of these quantities.

if the purpose of the surveillance is just to discover the outbreak-not to rid the population of the disease as early as possible-one could measure τ(i) conditioned on the outbreak reaching a sentinel before it dies out. we will briefly discuss such a conditioned version of τ below. for all of the three problems mentioned above, one can consider sets of nodes rather than individuals. there can be more than one source (for influence maximization), vaccinee, or sentinel. we will, in general, call these sets active nodes and denote their number as n. we will try to find the optimal sets of active nodes (and call them optimal nodes). note that this is not the same as ranking the nodes in order of importance and taking the n most important ones-such a "greedy" approach can in many cases fail [7, 15]. note that for vaccination and sentinel surveillance, we use one source node of the infection. this is the standard approach in infectious disease epidemiology simply because most outbreaks are thought to start with one person [3, 17]. we will use the constant infection- and recovery-rate version of the sir model [17]. in this formulation, if a susceptible node i is connected to an infectious node j, then i becomes infected at a rate β. infected nodes recover at a rate ν. without loss of generality, we can set ν = 1 (equivalently, this means we are measuring time in units of 1/ν). let c be a configuration (i.e., a specification of the state-s, i, or r-of every node), m_si the number of links between s and i nodes, and n_i the number of infected nodes.
then, the rate of events (either infections or recoveries) is βm_si + n_i, which gives the expected duration of c as t = 1/(βm_si + n_i). proceeding in the spirit of the gillespie algorithm, the probability of the next event being an infection event is βm_si·t, and the probability of a recovery event is n_i·t [2, 18]. exactly calculating the outbreak size and the time to discovery or extinction is, in principle, straightforward. consider the change from configuration c into c' by an infection event (changing node i from susceptible to infectious). this can happen in m_i ways, where m_i is the number of links between i and an infectious node. thus the probability for the transition from c to c' is βm_i·t. the probability that the next event will be a recovery event is simply t. to compute the probability of a chain of events, one simply multiplies these probabilities over all transitions. to compute the expected time for a chain of events, one sums the t for all configurations of the chain. we will illustrate the description above with an example; see fig. 2. the probability of the outbreak chain 7 is obtained by multiplying the probabilities of its transitions, and the expected duration of the infection chain is the sum of the expected durations of its configurations, which gives the contribution of chain 7 to τ. these contributions then need to be summed up for all chains, and averaged over all starting configurations. for the example in fig. 2, the resulting expressions for the expected outbreak size and τ are fractions of polynomials. for the largest networks we study (seven nodes), these polynomials can be of order up to 43 with up to 54-digit integer coefficients. calculating the expected outbreak size for the influence maximization or vaccination problems follows the same path as the τ calculation above. the difference is that instead of multiplying by the expected time of a chain, one multiplies by the number of recovered nodes in that branch. furthermore, there are no sentinels to stop outbreaks, so the trees of chains (like fig. 2) become larger.
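the enumeration described above can be sketched in python with exact rational arithmetic. this is a simplified illustration of the procedure, not the authors' sympy/flint code: from each configuration, each possible infection is taken with probability βm_i·t and each recovery with probability t, where t = 1/(βm_si + n_i), and the expected outbreak size is the probability-weighted sum over all chains of events.

```python
from fractions import Fraction

def expected_outbreak_size(edges, n, seed, beta):
    """exact expected outbreak size for constant-rate sir (recovery rate 1)
    on a small graph, by recursing over all chains of events."""
    beta = Fraction(beta)
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def expect(state):  # state[i] is 'S', 'I', or 'R'
        if 'I' not in state:
            # outbreak over: nodes marked 'R' were infected at some point
            return Fraction(state.count('R'))
        events = []  # (node, new state, rate)
        for i, s in enumerate(state):
            if s == 'S':
                m_i = sum(1 for j in adj[i] if state[j] == 'I')
                if m_i:  # infection of i at rate beta * m_i
                    events.append((i, 'I', beta * m_i))
            elif s == 'I':  # recovery of i at rate 1
                events.append((i, 'R', Fraction(1)))
        total = sum(rate for _, _, rate in events)  # 1/total = expected duration t
        value = Fraction(0)
        for i, new_state, rate in events:
            nxt = list(state)
            nxt[i] = new_state
            value += (rate / total) * expect(tuple(nxt))
        return value

    start = ['S'] * n
    start[seed] = 'I'
    return expect(tuple(start))
```

for a single link with β = 1, the seed is always infected and its neighbor is infected before the seed recovers with probability β/(β + 1) = 1/2, giving an expected outbreak size of 3/2. the recursion is exponential in the number of nodes, which mirrors why the paper stops at seven-node graphs.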
in practice, our approach to analyzing network epidemiological models is time-consuming. the major bottleneck is the polynomial algebra (to be precise, calculating the greatest common divisor needed to reduce the fractions of polynomials to their canonical form). because of this, we could not handle networks of more than seven nodes. the code was implemented in both python (with the sympy library [19]) and c with the flint library [20]. it also uses the subgraph isomorphism algorithm vf2 [21] as implemented in the igraph c library [22]. our code is available at http://github.com/pholme/exact-importance, which also includes code to calculate the conditioned τ (mentioned above but not investigated in the paper). to better understand how the network structure determines which nodes are most important, we measure the average values of static importance predictors. in general, there are many ways for a node to be central: is it a node often passed by things traveling over the network, or is it a node for which short paths exist to other nodes? different rationales give different measures. these are typically positively correlated, but they do not rank the nodes in the exact same way, and thus they can complement each other [23]. we focus on three measures: degree, closeness centrality, and vitality. degree centrality is simply the number of neighbors of a node. if a node has twice the neighbors of another, it has twice as many nodes to which to spread an infection. this makes it more important for influence maximization and vaccination. it also has twice as many nodes from which to get the infection, which contributes to its importance for vaccination and sentinel surveillance. on the other hand, degree is not a global quantity-it could happen that the neighbors of a high-degree node are so peripheral that a disease could easily die out there.
the simplest way of modifying the degree to become a global measure is to operationalize the idea that a node is central if it is the neighbor of many central nodes. with the simplest possible assumptions, this reasoning leads to eigenvector centrality, i.e., the centrality of node i can be estimated as the ith entry of the leading eigenvector of the adjacency matrix [10]. for the small graphs that we consider, however, the eigenvector centrality is so strongly correlated with degree (intuitively so, because "everything is local" in a very small graph) that it makes little sense to include it in the analysis. many centrality measures are based on the shortest paths. perhaps the simplest of these measures is closeness centrality-using the idea that a node is central if it is on average close to other nodes [10, 23]. this leads to measuring the centrality of i as the reciprocal of the summed distances to all other nodes in the network: c(i) = 1 / Σ_{j≠i} d(i,j), where d(i,j) is the shortest-path distance between i and j. the main problem, in general, with closeness centrality may be that it is ill-defined on disconnected graphs. in our work, however, we consider only connected graphs. we chose the third centrality measure-vitality-with the vaccination problem in mind. vitality is, in general, a class of measures that estimate node centrality based on its impact on the network if it is deleted [23]. in our work, we let vitality denote the particular metric v(i) = s(g) − s(g − i), where s(g) is the number of nodes in the largest component of g and g − i is g with node i deleted. this measure is thus in the interval [1, n − 1], and it increases with i's ability, if removed, to fragment the network. since vaccination is, in practice, like removing nodes from the network, we expect v to identify important nodes for β close to 1. for large graphs, we expect v to be very close to 1, so we only recommend it for small graphs such as the ones we used here.
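the two measures just defined are easy to state in code. a small python sketch for graphs given as adjacency dicts (our own helper functions, written for clarity rather than speed):

```python
from collections import deque

def bfs_dist(adj, src, removed=frozenset()):
    """shortest-path distances from src, ignoring removed nodes."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist and v not in removed:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, i):
    """reciprocal of the summed distances to all other nodes (connected graphs)."""
    dist = bfs_dist(adj, i)
    return 1.0 / sum(dist[j] for j in adj if j != i)

def largest_component(adj, removed=frozenset()):
    """size of the largest connected component after deleting `removed`."""
    best, seen = 0, set(removed)
    for start in adj:
        if start in seen:
            continue
        component = bfs_dist(adj, start, removed)
        seen.update(component)
        best = max(best, len(component))
    return best

def vitality(adj, i):
    """drop in largest-component size when node i is deleted."""
    return largest_component(adj) - largest_component(adj, removed={i})
```

on a star graph, for instance, the center has vitality n − 1 (deleting it shatters the graph into singletons) while every leaf has the minimum vitality of 1, matching the interval [1, n − 1] stated above.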
another popular centrality measure-betweenness centrality (roughly, how many shortest paths in the network pass through a node) [10]-is very strongly correlated with vitality for our set of small graphs, and it is thus omitted from the analysis. in our work, we systematically evaluate small distinct (nonisomorphic) connected graphs. we use all such graphs with 3 ≤ n ≤ 7. there are two such graphs with n = 3, six with n = 4, 20 with n = 5, 112 with n = 6, and 853 with n = 7. to generate these, we use the program geng [24]. in our analysis, we will focus on when and why the three cases of node importance rank nodes differently. we will start with some extreme examples, and continue with general properties of all small graphs. inspired by ref. [12], we will start with a special example (fig. 3). this is the smallest graph where the most important single node (n = 1) is different for influence maximization, vaccination, and sentinel surveillance. for β ∈ ((1 + √5)/2, (3 + √17)/4) ≈ (1.62, 1.78), node 6 is the most important node for influence maximization, 5 is most important for vaccination, and 1 is most important for sentinel surveillance. for small β values, 6 is most important for all three aspects of importance. in this region, the outbreaks die out easily. the fact that 5 and 6 have a larger degree than the others is, of course, helpful for an outbreak to take hold in the population. node 6 is slightly more important as a seed node since the extra link in its neighborhood helps the outbreak to persist longer [there are the (6, 7, 4) and (6, 4, 7) infection paths that, although unlikely, do not exist for diseases starting at 5]. this reasoning also explains why 6 is most important for vaccination. for sentinel surveillance and for low enough β, the outbreak would typically end by becoming extinct rather than by hitting a sentinel.
thus, for low β, an outbreak has the highest chance of surviving if it starts at 6, so putting a sentinel there is good: an outbreak is either instantly discovered or will likely soon go extinct. with a conditional discovery time, the curves are strictly decreasing (since the early die-off is omitted), so 1 is the most important node for all β. for larger β, node 1 becomes, relatively speaking, more important for influence maximization and sentinel surveillance. this is the most central node in aspects other than degree. for vaccination, however, node 5 is most important as it fragments the network the most [the vitality is the same for both nodes, v(5) = v(1) = 2, but the size of the second-biggest component is larger if 1 is deleted]. so, since 1 becomes more important than 6 at a larger β value for influence maximization than for sentinel surveillance, there is an interval of β where the network of fig. 3 has three distinct most important nodes for the three aspects of importance that we investigate. for two active nodes (n = 2), the smallest network with no overlap between the optimal node sets is actually smaller than for n = 1. this network, displayed in fig. 4, has six nodes and eight links. note that n = 6 is the smallest number of nodes that allows three disjoint sets of two nodes, so in that sense the n = 2 example seems more extreme than its n = 1 counterpart, fig. 3. for large β values, 1 and 2 are the most important nodes for influence maximization, 5 and 6 are most important for vaccination, and 3 and 4 are most important for sentinel surveillance. 5 and 6 are the nodes that, if deleted, break the network into the smallest components, which explains why they are most important for vaccination (at least for large β). in addition to 5 and 6, 1 and 2 are the only pair of nodes whose neighborhoods contain all other nodes. nodes 1 and 2 both have degree 3, as opposed to 5 and 6, which have degrees 4 and 2, respectively.
it is not clear whether that makes 1 and 2 better than 5 and 6 for influence maximization, or why. similarly, it is hard to understand why 3 and 4 are the best nodes for sentinel surveillance. the neighborhoods of these nodes do not even contain the entire graph. we can see that the optimal sets of nodes in fig. 4 do not have links within themselves. this seems natural for most networks and all three notions of importance. this means that, as n grows, the distance between the optimal nodes will be larger than 1. this is an observation we will make more quantitative in the next section. another such observation is that, for small β, the optimal nodes for the three importance aspects are overlapping. in this parameter region, most outbreaks die out before they reach a sentinel. if the outbreak starts at a high-degree node in a highly connected neighborhood, there is a larger chance for it to survive. for all three importance aspects, it is important to have active nodes where an outbreak would be likely to survive. still, as evident from fig. 4, there are examples where the optimal nodes are not overlapping. we will now move to a more statistical evaluation of all graphs with 3-7 nodes. we will present average quantities over all these graphs as functions of β. other summary statistics, including grouping the graphs according to size, give the same conclusions. let u_a and u_b be the optimal sets for a given network and β under importance aspects a and b. the first quantity we look at is the pairwise overlap of the sets of optimal nodes as measured by the jaccard overlap, j(a,b,β) = |u_a ∩ u_b| / |u_a ∪ u_b|. for example, in fig. 4 at β = 2, with a being influence maximization and b sentinel surveillance, j is computed from the corresponding optimal pairs of nodes. as seen in fig. 5, for n = 1, the overlap between the optimal nodes for vaccination and sentinel surveillance has a minimum as a function of β. the same is true for sentinel surveillance versus influence maximization when n = 3.
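the jaccard overlap used here is the standard set similarity, size of the intersection over size of the union. as a one-function python sketch:

```python
def jaccard(a, b):
    """jaccard overlap of two node sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)
```

for disjoint optimal pairs, such as the large-β influence-maximization and sentinel sets {1, 2} and {3, 4} of fig. 4, the overlap is 0; identical sets give 1.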
it is hard to say why, beyond noting that, for individual graphs, the j(a,b,β) curves can of course be nonmonotonic, as different aspects of the graph structure determine the roles of the nodes. we note that (for a different spreading model and much larger networks), ref. [13] finds the jaccard similarity between influence maximization and vaccination to have a minimum as a function of β. next, we investigate the structural properties of the most influential nodes and how they depend on β. in fig. 6, we plot the degree, closeness centrality, and vitality as functions of β for all aspects of importance and n ∈ {1,2,3}. we start by examining the case n = 1; see figs. 6(a), 6(b), and 6(c). the first thing to notice is the general impression that the centralities of the optimal nodes decrease with β. the only case with an opposite trend is vitality [fig. 6(c)], where the curves increase monotonically. for one active node, this could be understood as the ability of nodes to (if removed) fragment the network; this ability is captured by vitality and becomes more important as β increases. continuing the analysis for n = 1, when β is low the most important thing is for the outbreak to persist in the population. if an active node has a high degree, it is likely to be the source of a large outbreak, meaning it is important for influence maximization (which was also concluded by ref. [13]). if a high-degree node is deleted, it would remove many links that could spread the disease and thus be important for vaccination [25]. for sentinel surveillance and low β, it would also be important not to put a sentinel on a low-degree node, as diseases reaching low-degree nodes would likely die out. so figs. 6(a) and 6(c) can be understood as a shift from nodes of high degree to nodes of high vitality. closeness centrality-seen in fig. 6(b)-is harder to explain. values of c increase with β for influence maximization but decrease for vaccination.
one way of understanding this is from the observation that vitality is most important for vaccination [as evident from fig. 6(c)], and degree is most important for influence maximization [as seen in fig. 6(a)]. the results of fig. 6(b) then suggest that the high-vitality nodes optimizing the solution of the vaccination problem have a lower closeness centrality. indeed, for many of the graphs we study, the highest-vitality node has many degree-1 neighbors-cf. node 5 in fig. 3-which do not necessarily contribute to the closeness centrality. for influence maximization, it seems that the optimal nodes are central in the closeness sense-the closer, on average, the seed node is to the rest of the network, the higher the chance for the outbreak to reach the entire network. for n = 2 and 3, the picture is somewhat different than for n = 1. in these cases, all centrality measures decrease monotonically. the order of the importance measures is the same throughout, with vaccination having the largest values and influence maximization the smallest. it is no longer the case for vaccination that the optimizing nodes have high vitality and low closeness centrality (as it was for n = 1). indeed, for the vaccination case, the optimal nodes are usually independent of β, which is why the curves for vaccination in figs. 6(d)-6(i) are almost straight. naively, one would think that some centrality measure needs to increase with β. however, as we will argue further below, the optimal nodes would usually not be close to each other. one could think of each node being responsible for (and centrally situated within) a region of the network, and that tendency is so strong that it overrides all simple centrality measures. on the other hand, there are group centrality measures that could perhaps increase with β [26] (that could be a theme for another paper). the fact that all the curves of figs.
6(d)-6(i) are nonincreasing could be explained by the fact that the separation of the optimal nodes increases with β. in fig. 7, we try to make this argument more quantitative by measuring the average (shortest-path) distance d between the optimal nodes. in the limit of small β, these values come rather close to their minimum of 1, but as β increases, so does d. essentially, the pattern from fig. 7 is the reverse of figs. 6(d)-6(i)-the vaccination curve is almost constant, sentinel surveillance increases moderately, but influence maximization increases much more. a larger separation gives the sentinels the ability to be, on average, closer to outbreaks anywhere in the network, while for influence maximization a larger separation means that there are more susceptible-infectious links (fewer infectious-infectious links) in the incipient outbreak. for vaccination there is no such positive effect of a larger separation that we can think of, which is part of the explanation as to why the optimal sets are relatively independent of β for n > 1. the rest of the explanation, i.e., why the trends for n = 1 are so much weaker when n > 1, is not clear to us, and it is something we will investigate further in the future. we investigated the average properties of the optimal nodes for all our graphs. we found that the overlap between the optimal nodes of the different importance aspects is largest for small β. in the small-β region, a high degree seems most important for all importance aspects. for larger β, it becomes more important for nodes to be positioned such that they would fragment the network if they were removed, particularly for the vaccination problem (slightly less for the sentinel surveillance problem, and much less for influence maximization). on the other hand, when the number of active nodes increases, it becomes important for the nodes to be spread out-the average distance between them increases.
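the separation d plotted in fig. 7 is the average shortest-path distance over all pairs of active nodes; a minimal bfs-based sketch on an illustrative path graph:

```python
from collections import deque
from itertools import combinations

def shortest_path_length(adj, src, dst):
    """BFS shortest-path distance between two nodes of an unweighted graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    raise ValueError("nodes are not connected")

def mean_separation(adj, nodes):
    """Average pairwise shortest-path distance d between the given nodes."""
    pairs = list(combinations(nodes, 2))
    return sum(shortest_path_length(adj, a, b) for a, b in pairs) / len(pairs)

# path graph 0-1-2-3-4: well-separated active nodes {0, 2, 4}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(mean_separation(path, [0, 2, 4]))  # distances 2, 4, 2 -> 8/3
print(mean_separation(path, [1, 2]))     # adjacent nodes -> 1.0
```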
this effect is large for influence maximization, intermediate for sentinel surveillance, and very small for vaccination. the small effect for vaccination can be understood since all that matters is to fragment the network, and for that purpose the vaccinated nodes do not necessarily have to be distant from one another. most of the behavior discussed above seems quite natural. for small β, the dominant aspect of the dynamics is how fast an outbreak will die out. for large β, the outbreak will almost certainly reach all nodes. for vaccination and sentinel surveillance, this leads to a question of deleting nodes that would break the network into the smallest components. (in the former case, this is trivial since the size of the outbreak is almost surely the size of the connected component to which the seed node belongs. in the latter, we conclude this from the monotonically increasing vitality.) as an extension, it would be interesting to confirm this work in larger networks using stochastic simulations. this would not allow for the discovery of special graphs such as those in figs. 3 and 4, but it could reinforce the connection between the different notions of centrality. we believe that many of our conclusions hold for larger networks, an indication being that our results are consistent with the results of ref. [13] (comparing vaccination and influence maximization for n = 1 in large empirical networks). references: a survey of models and algorithms for social influence analysis; proceedings of the ninth acm sigkdd international conference on knowledge discovery and data mining; networks: an introduction; proceedings of the third international congress on mathematical software; 3rd iapr-tc15 workshop on graph-based representations in pattern recognition; network analysis: methodological foundations. acknowledgments: we thank petteri kaski, nelly litvak, and naoki masuda for helpful comments.
key: cord-027178-tqj8jgem authors: tian, changbo; zhang, yongzheng; yin, tao title: modeling of anti-tracking network based on convex-polytope topology date: 2020-06-15 journal: computational science iccs 2020 doi: 10.1007/978-3-030-50417-5_32 sha: doc_id: 27178 cord_uid: tqj8jgem anti-tracking networks play an important role in the protection of network users' identities and communication privacy. confronted with frequent network attacks on, or infiltration of, anti-tracking networks, a robust and destroy-resistant network topology is an important prerequisite to maintain the stability and security of an anti-tracking network. from the aspects of network stability, network resilience, and destroy-resistance, we propose the convex-polytope topology (cpt) for use in anti-tracking networks. cpt has three main advantages: (1) cpt can easily avoid the threat that key nodes and cut vertices pose to the network structure; (2) even when nodes randomly join or quit the network, cpt can easily keep the network topology in a stable structure without a global view of the network; (3) cpt can easily achieve self-optimization of the network topology. an anti-tracking network based on cpt can achieve self-maintenance and self-optimization of its network topology. we compare cpt with other methods of topology optimization. the experimental results show that cpt has better robustness, resilience, and destroy-resistance when confronted with a dynamically changing topology, and performs better in the efficiency of network self-optimization. the rapid growth of internet access has made communication privacy an increasingly important security requirement, and the low barrier and convenience of network monitoring and tracing techniques have posed a great threat to the online privacy of network users [1] [2] [3] . anti-tracking networks have been proposed as a countermeasure against network monitoring and censorship [4] [5] [6] .
accordingly, attacks on anti-tracking networks have increased rapidly in recent years, such as network paralysis, network infiltration, and communication tracking. especially for a p2p-based anti-tracking network, which takes advantage of the wide distribution of nodes for tracking-resistant communication, it is important to keep a stable, secure, and destroy-resistant topology structure. moreover, since the nodes in a p2p-based anti-tracking network can join or quit the network freely and randomly, it is all the more uncertain and insecure for an anti-tracking network to maintain a robust network topology. much research has been done on the management and optimization of network topology, but for anti-tracking networks, current research has some limitations, summarized as follows: (1) the cost of network maintenance. some current research needs a global view of the network to optimize the network topology [7] [8] [9] . confronted with a dynamically changing network topology, the cost of network maintenance would be very high. (2) weak destroy-resistance. a p2p-based anti-tracking network confronts not only the problem of communication tracing but also the threat of network attacks [10] [11] [12] . in particular, attacks on cut vertices and key nodes can destroy the network structure tremendously. (3) weak tracking-resistance. network infiltration is another threat to the security of anti-tracking networks [13] [14] [15] . malicious nodes can be deployed in the network to measure the network topology and trace back communication. to address the above problems, we propose the convex-polytope topology (cpt), which can be applied in anti-tracking networks. an anti-tracking network based on cpt achieves self-optimization of its topology. the novelty and advantages of our proposal are summarized as follows: (1) we apply the convex-polytope topology to the anti-tracking network.
an anti-tracking network based on cpt has better performance in network stability and resilience. (2) we propose a topology maintenance mechanism based on cpt: each node in the network only needs to maintain its local topology to keep the whole network in a stable convex-polytope structure. (3) we propose a network self-optimization mechanism based on cpt: with it, the anti-tracking network can optimize its topology as the network topology changes. convex-polytope topology (cpt) is a structured topology in which all nodes are arranged in the shape of a convex polytope, as illustrated in fig. 1 . cpt has the following advantages for maintaining a robust and destroy-resistant topology for an anti-tracking network. -stability: the structure of cpt avoids cut vertices, except in extreme cases in which the connectivity of cpt is too sparse, such as a ring structure. so, maintaining the network topology amounts to keeping the convex-polytope structure. -resilience: confronted with a dynamically changing topology, each node in cpt only needs to maintain its local topology in accordance with the properties of a convex polytope; then the whole network remains in the convex-polytope structure. -self-optimization: network stability, robustness, and destroy-resistance benefit from a balanced distribution of nodes' connectivity. cpt can achieve self-optimization of the whole network through the optimization of each node's local topology. consider the topology structure of cpt illustrated in fig. 1 : the whole network is constructed as a convex polytope logically. to maintain this structure, each node has to record both its neighbouring nodes and the corresponding surfaces of the convex polytope. let v denote a node, cn its neighbouring-node collection, and cs its surface collection. we take the example of node v shown in fig.
1 ; its cn and cs record its neighbouring nodes and the surfaces they form. so, if one neighbouring node of v disconnects, then node v knows which surfaces are affected by the lost neighbouring node and adjusts its local topology accordingly. as we can see from fig. 1 , each surface of a convex polytope can be a different polygon. for a fixed number of nodes, if each surface has more edges, the connectivity of the whole network becomes sparser; on the contrary, if each surface has fewer edges, the connectivity of the whole network becomes denser. if all surfaces have three edges, the connectivity of the network reaches its maximum. so, in our work, we construct a cpt in which each surface is a triangle, to provide better stability and resilience for the anti-tracking network. assume there are n nodes to construct a cpt with triangular surfaces; then the number of surfaces and edges in this kind of cpt is fixed. theorem 1 gives the details of the calculation of the number of this cpt's surfaces and edges. theorem 1. if a convex polytope has n nodes and each surface is a triangle, then the number of surfaces and edges of this convex polytope is fixed. the number of edges is l = 3 × (n − 2), and the number of surfaces is m = 2 × (n − 2). proof. because each surface of the convex polytope with n nodes is a triangle and each edge is shared by two surfaces, if the number of edges is l and the number of surfaces is m, then 3 × m = 2 × l. according to euler's polyhedron theorem, we have the equation n + m − l = 2. solving these two equations gives l = 3 × (n − 2) and m = 2 × (n − 2). even though cpt has advantages in the maintenance of network topology, in some extreme cases, shown in fig. 2 , cpt still has potential structural weaknesses, such as key nodes. key nodes can have a big influence on the stability, robustness, and security of the network topology. in the construction process of cpt, a balanced distribution of each node's connectivity is the priority for constructing a stable and robust anti-tracking network.
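theorem 1 can be checked numerically: l = 3(n − 2) and m = 2(n − 2) satisfy both 3m = 2l and euler's formula n + m − l = 2. a short sketch:

```python
def cpt_counts(n):
    """Edge and surface counts of a convex polytope with n nodes
    whose surfaces are all triangles (Theorem 1)."""
    if n < 4:
        raise ValueError("a convex polytope needs at least 4 nodes")
    edges = 3 * (n - 2)
    surfaces = 2 * (n - 2)
    return edges, surfaces

for n in (4, 9, 5000):
    l, m = cpt_counts(n)
    assert 3 * m == 2 * l   # each edge is shared by exactly two triangles
    assert n + m - l == 2   # Euler's polyhedron formula
    print(n, l, m)

# n = 4 recovers the tetrahedron: 6 edges, 4 triangular faces
```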
we propose a construction algorithm to build a cpt with a balanced distribution of nodes' connectivity. the construction algorithm takes two main steps: (1) constructing a circuit: all nodes construct a circuit. (2) adding edges iteratively: each node takes turns selecting a node which has a span of 2 from it and allows new connections, along one direction of the circuit, and adds an edge to it to generate a triangular surface. if all surfaces of a node are triangles, it refuses new connections in order to keep the whole topology in a convex-polytope structure. after all nodes have triangular surfaces, the cpt with triangular surfaces has been constructed.
input: c = [v1, v2, ..., vn]: node collection.
1: all nodes construct a circuit.
2: k ← 2 × n − 6
3: while true do
4:    v ← the next node along the circuit.
5:    get a node u which has a span of 2 with v.
6:    if the surfaces of v are all triangles then
7:       v refuses new connections; continue.
8:    end if
9:    add an edge between v and u; k ← k − 1.
10:   if k ≤ 0 then
11:      break.
12:   end if
13: end while
algorithm 1 gives the pseudocode of the construction algorithm of cpt. the time complexity of the construction algorithm mainly depends on the executions of the process of building the node circuit (line 1) and the while loop (lines 3∼6). the time complexity of building the node circuit is o(n). because the number of edges of cpt is fixed, as proved in theorem 1, and the node circuit has already formed n edges, the while loop in algorithm 1 only needs to be executed (2 × n − 6) times. so, the time complexity of algorithm 1 is o(n). to make straightforward sense of the cpt construction algorithm, fig. 3 illustrates the construction process of a cpt with 9 nodes, labeled v1, v2, ..., v9. firstly, these 9 nodes construct a circuit. we take v1 as the first node to add an edge with v3; then v1 has a triangular surface (v1, v2, v3). iteratively, v2 adds an edge with v4, and v3 adds an edge with v5, until the surfaces of all nodes are triangles. in fig. 3(f) , v2 needs to select a node with a span of 2 to add an edge; there are three such nodes, v5, v6, and v8.
because all the surfaces of v5 and v8 are triangles, extra edges would break the structure of the convex polytope. so, v2 can only add an edge with v6. as nodes can join or quit the anti-tracking network freely, frequent changes of the network topology inevitably have a big influence on the stability and security of the network structure. so, an effective maintenance mechanism for the network topology guarantees the robustness and usability of the anti-tracking network. in this section, we discuss the maintenance mechanism in which the relevant nodes only update their local topology to keep the anti-tracking network in a convex-polytope structure when confronted with a dynamically changing topology. in the maintenance mechanism of cpt, two cases are taken into consideration: that of nodes joining the network and that of nodes quitting the network. nodes join cpt. a joining node attaches to the nodes of a single surface, so a node join in cpt only changes the topology of one surface. the whole topology can thus easily be maintained in the convex-polytope structure. nodes quit cpt. if a node quits cpt, all the surfaces of this node merge into one surface. if the quitting node has a high degree, cpt will generate a large surface, which is detrimental to a stable and robust network topology. for example, fig. 5(a) shows a node with a degree of 3 quitting cpt. when v1 quits cpt, its three surfaces (v2, v3), (v2, v4), and (v3, v4) merge into one surface, which is still a triangle. in this case, cpt needs no additional actions to maintain the topology. in fig. 5(b) , a node v1 with a degree of 5 quits cpt, and the surface generated by quitting v1 is a pentagon. in this case, the relevant nodes have to update their local topology to keep the convex-polytope structure with all triangular surfaces. the maintenance process only involves the nodes of the generated pentagon surface. through connections between these nodes, the pentagon surface can be divided into triangular surfaces.
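the node-join case can be sketched directly: a new node placed in a triangular surface splits it into three triangles, and the counts of theorem 1 are preserved. a minimal sketch (the face/edge bookkeeping here is a simplification of a real cpt node's local view):

```python
def join(faces, edges, new_node, face):
    """A new node joins inside the triangular surface `face`: that surface
    splits into three triangles and three new edges are added."""
    a, b, c = face
    faces.remove(face)
    faces += [(a, b, new_node), (b, c, new_node), (a, c, new_node)]
    edges += [(a, new_node), (b, new_node), (c, new_node)]

# start from the smallest triangulated convex polytope: a tetrahedron
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
n = 4

join(faces, edges, 4, (0, 1, 2))  # node 4 joins inside surface (0, 1, 2)
n += 1

# Theorem 1 still holds after the join
assert len(edges) == 3 * (n - 2)
assert len(faces) == 2 * (n - 2)
print(len(edges), len(faces))  # 9 6
```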
the maintenance of the generated pentagon surface is similar to the construction algorithm of cpt. if all surfaces of a node are triangles, it refuses any connections. each node in the pentagon surface connects with a node which has a span of 2 and allows new connections, iteratively, until all nodes in this surface have triangular surfaces. for example, as shown in fig. 5(b) , after v1 quits cpt, v2 firstly establishes a connection with v5 to generate a triangular surface (v3, v5). then v3 has all triangular surfaces and refuses any new connections. next, v5 establishes a connection with v4 to generate a triangle. then v6 has all triangular surfaces and refuses any new connections. at last, the maintenance of cpt is finished. as we have discussed above, cpt can keep the network topology in a stable and resilient convex-polytope structure. but the convex-polytope structure still has some extreme cases, shown in fig. 2 , which may pose a potential threat to the structural stability and communication efficiency of the anti-tracking network because of key nodes. according to the properties of cpt, we can achieve self-optimization of the anti-tracking network to balance the distribution of nodes' connectivity. as we have mentioned in theorem 1, the number of edges in a convex polytope of which all surfaces are triangles is l = 3 × (n − 2), where n denotes the number of nodes. then, we can calculate the average degree of cpt as d̄ = 2 × l / n = 6 × (n − 2) / n (eq. 2). when n is very large, the average degree d̄ of cpt is close to 6. so, we usually set the upper limit of nodes' degree to 10 in cpt. nodes whose degree is bigger than 10 should adjust their local topology to reduce their connectivity. since every node in the network can control its own connectivity, cpt as a whole can optimize the network topology.
theorem 1 has proved that the number of edges is fixed for a cpt of which all surfaces are triangles, so if the degree of some nodes is reduced, the degree of other nodes must be increased to keep the convex-polytope structure. the self-optimization of cpt can thus be viewed as a transfer process of node degree from high-degree nodes to low-degree nodes. assume that node v0 has n neighbouring nodes, labeled in sequence from v1 to vn. because all the neighbouring nodes of v0 form a circuit, we use vi−1 and vi+1 to represent the adjacent nodes of vi in this node circuit. if node v0 needs to reduce its degree, it first has to request the degree di of each of its neighbouring nodes vi (1 ≤ i ≤ n). according to the degrees of its neighbouring nodes, node v0 selects a suitable neighbouring node to disconnect from. to maintain the convex-polytope structure, the corresponding two neighbouring nodes of node v0 then have to connect with each other. as illustrated in fig. 6 , node v0 disconnects from node v2, and a quadrilateral surface (v0, v1, v2, v3) is formed. after v1 connects with v3, the degree of nodes v0 and v2 is reduced and the degree of v1 and v3 is increased. to balance the distribution of nodes' connectivity, it is better to guarantee that the two disconnected nodes have high degree and the two connected nodes have low degree. so, before node v0 decides which node to disconnect from, it needs to calculate which node is more suitable to disconnect according to the degree of each neighbouring node. for each neighbouring node vi of node v0, we calculate the fitness value ci of node vi as ci = (di − di−1) × (di − di+1) if di is bigger than both di−1 and di+1, and ci = 0 otherwise (eq. 3). if the degree of node vi is smaller than that of one of its two adjacent nodes, the fitness value of node vi is 0, because after node v0 disconnects from this candidate node, the degree of its adjacent nodes would become even bigger.
if the degree of node vi is bigger than that of both of its adjacent nodes, node v0 calculates the fitness value as the product of (di − di−1) and (di − di+1). at last, node v0 selects the candidate node with the biggest fitness value to disconnect from. then, node v0 instructs the two adjacent nodes of the candidate node to connect with each other to keep the topology stable. algorithm 2 gives the pseudocode of the self-optimization algorithm of each node. once the degree of a node exceeds the upper limit (we usually set the upper limit of degree to 10), the node begins the self-optimization process to adjust its local topology. the basic idea of self-optimization is to transfer degree from high-degree nodes to low-degree nodes. with the collaboration of all nodes in the anti-tracking network, cpt can achieve self-optimization of the network topology. in this section, we compare cpt with two self-organizing methods of network topology: one based on a neural network (nn) [16] and the other based on distributed hash tables (dht) [17] . nn achieves self-optimization of the topology by a neural network algorithm deployed in each node. dht achieves a self-organizing network based on a hierarchical aggregation structure. we evaluate the three methods from the following aspects: (1) network robustness, (2) maintenance efficiency, and (3) communication efficiency. firstly, we separately simulate three networks with 5000 nodes using cpt, nn, and dht, and calculate the distribution of nodes' degree, d_d, and the distribution of nodes' density, d_s, to compare the three network topologies. we use the number of nodes which have a span of not more than 2 from vi to represent the density of vi. as illustrated in fig. 7(a) , d_d of cpt is mainly distributed in the interval [3, 10] because of the convex-polytope structure limitation. d_d of nn is mainly distributed in the interval [3, 15] .
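the neighbour-selection rule of eq. 3 can be sketched as follows (the degree list is illustrative; in cpt these would be the degrees of v0's neighbours along their circuit):

```python
def fitness(degrees):
    """Fitness c_i of each neighbour of v0 (eq. 3). Neighbours are ordered
    around v0 as a circuit, so indices wrap around."""
    n = len(degrees)
    scores = []
    for i, d in enumerate(degrees):
        prev_d = degrees[(i - 1) % n]
        next_d = degrees[(i + 1) % n]
        if d > prev_d and d > next_d:
            scores.append((d - prev_d) * (d - next_d))
        else:
            scores.append(0)  # not bigger than both adjacent nodes -> unfit
    return scores

def pick_disconnect(degrees):
    """Index of the neighbour v0 should disconnect from: highest fitness."""
    scores = fitness(degrees)
    return max(range(len(scores)), key=scores.__getitem__)

# illustrative degrees of v0's neighbours along their circuit
degrees = [4, 9, 5, 7, 6]
print(fitness(degrees))          # [0, 20, 0, 2, 0]
print(pick_disconnect(degrees))  # 1: the locally highest-degree neighbour
```

disconnecting the highest-fitness neighbour removes degree where it is most concentrated, which is exactly the degree-transfer idea described above.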
d_d of dht is distributed widely. the distribution of nodes' density has a similar tendency to the distribution of nodes' degree. in general, the local density of nodes is in direct proportion to their degree, and nodes with high density easily become key nodes. in fig. 7(b) , d_s of cpt is mainly distributed in the low-density area, which means the topology structure of cpt is more stable than that of nn and dht. to evaluate network resilience, we first give a metric to quantify it: β = num(mcs(g(p))) / n (eq. 4), where the graph g represents the original network, g(p) denotes the subgraph after p percent of nodes is removed from g, mcs(g(p)) denotes the maximum connected subgraph of g(p), num(·) denotes the node number of a graph, and n denotes the node number of g. the metric β measures the maximum connectivity of the network after some nodes are removed. the higher β is, the better the network resilience. based on the above simulated networks constructed by cpt, nn, and dht, we remove p percent of nodes from the three networks each round until all nodes are removed. in each round, we calculate β to evaluate network resilience. in the experiments, we use two different ways to remove nodes, as follows. -random-p removal: in each round of node removal, we remove p percent of nodes from the network randomly. -top-p removal: in each round of node removal, we remove the p percent of nodes with the highest degree. from the experimental results shown in fig. 8 , cpt has a better performance in network resilience, and its β stays higher than that of nn and dht under both random-p removal and top-p removal. in random-p removal, even though β of cpt is always higher than that of nn and dht, the difference between the three methods is not very large. the performance of the three methods in network resilience begins to decrease when β is less than 40%. but in top-p removal, the performance of dht degrades sharply.
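the resilience metric β of eq. 4 and top-p removal can be sketched as follows (the seven-node graph is illustrative):

```python
from collections import deque

def largest_component_size(adj, removed):
    """Node count of the maximum connected subgraph, ignoring removed nodes."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        size, queue = 0, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best

def beta(adj, removed):
    """eq. 4: beta = num(mcs(g(p))) / n."""
    return largest_component_size(adj, removed) / len(adj)

def top_p_removal(adj, p):
    """The p fraction of nodes with the highest degree."""
    k = int(p * len(adj))
    by_degree = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    return set(by_degree[:k])

# illustrative graph: two triangles bridged by node 6
adj = {0: [1, 2, 6], 1: [0, 2], 2: [0, 1], 3: [4, 5, 6],
       4: [3, 5], 5: [3, 4], 6: [0, 3]}
print(beta(adj, set()))                    # intact network -> 1.0
print(beta(adj, top_p_removal(adj, 0.3)))  # hubs removed -> 2/7
```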
the performance of nn in network resilience begins to decrease when β is less than 20%, but cpt still maintains good performance in network resilience under top-p removal. the experiments show that top-p removal causes bigger damage to the network. the real network environment is complex and changeable, and maintenance efficiency has a direct influence on the performance of network self-optimization. if the network maintenance process costs too much time or computing power, it is ineffective in practical applications. in this section, we evaluate the maintenance efficiency of cpt, nn, and dht from two aspects. -efficiency of network construction: following the network construction methods of cpt, nn, and dht, we construct a network in a simulation environment and record the time required for comparison. -efficiency of network self-optimization: for each network constructed by cpt, nn, and dht, we randomly remove p percent of nodes and use the time required for network self-optimization as the metric. by increasing the percentage of removed nodes, we compare the change in time requirement to evaluate the efficiency of network self-optimization. figure 9(a) illustrates the time required for network construction with cpt, nn, and dht. the x-axis denotes the number of nodes of the network being constructed. with the increase of the number of network nodes, the time requirement of cpt increases more slowly than that of nn and dht. limited by the computation of the neural network, the time requirement of nn increases sharply. dht relies too much on agent nodes to construct and manage the network topology. benefiting from the convex-polytope structure, the whole network of cpt keeps a balanced distribution of nodes' connectivity. without the heavy computation of nn or the many connections of dht, cpt achieves better efficiency of network construction.
figure 9(b) illustrates the time required for network self-optimization after p percent of nodes are removed from the network randomly. when more than 50% of nodes are removed randomly, the network is split into different subgraphs. in this case, we need to connect the different subgraphs into one whole network, then continue the network self-optimization process and record the time required. as we can see in fig. 9(b) , the curves of the three methods increase at the beginning and then decrease when p is more than 50%, because along with the increase of the percentage of removed nodes, the network size becomes smaller; when the network size is reduced to a certain degree, the time required for network self-optimization decreases. in this experiment, cpt performs better in both network construction and network self-optimization. the simple structured topology of cpt improves the efficiency of network construction, maintenance, and self-optimization. to evaluate communication efficiency, we randomly select two nodes to communicate in each test and record the average time required over 100 rounds of tests. with the increase of network size, we compare the communication efficiency of cpt, nn, and dht through the average time required. the network diameter and average path length increase with network size, so the two randomly selected nodes spend more time communicating in each network. as illustrated in fig. 10 , the average time requirement of cpt is always higher than that of nn and dht. but the difference in communication efficiency between cpt, nn, and dht is not very big, and it is acceptable for an anti-tracking network. the reason for cpt's lower communication efficiency is the convex-polytope structure itself: to keep the convex-polytope structure, cpt improves network robustness and destroy-resistance at the cost of communication efficiency.
research has also shown that an anti-tracking network cannot achieve all three properties: strong tracking-resistance, low bandwidth overhead, and low latency overhead [18] . but from the tracking-resistance standpoint, this trade-off is acceptable. in this paper, we propose the convex-polytope topology (cpt), which can be applied in anti-tracking networks. we construct the anti-tracking network according to the properties of a convex polytope and maintain the network topology in the convex-polytope structure. when nodes join or quit the network, cpt can still maintain the convex-polytope topology to keep the network stable and resilient. based on the convex-polytope topology, we design a self-optimization mechanism for the anti-tracking network. the experimental results show that cpt has better performance in network robustness and maintenance efficiency than current works. references: a survey on routing in anonymous communication protocols; raptor: routing attacks on privacy in tor; development of measures of online privacy concern and protection for use on the internet; achieving dynamic communication path for anti-tracking network; a sybil attack detection scheme for a forest wildfire monitoring application; a review on sybil and sinkhole of service attack in vanet. recent trends electron; a fuzzy particle swarm optimization algorithm for computer communication network topology design; topology management in unstructured p2p networks using neural networks; self-organizing network services with evolutionary adaptation; a loss-tolerant mechanism of message segmentation and reconstruction in multi-path communication of anti-tracking network; de-anonymizing and countermeasures in anonymous communication networks; how to block tor's hidden bridges: detecting methods and countermeasures; information leaks in structured peer-to-peer anonymous communication systems; detecting sybil nodes in anonymous communication systems; seina: a stealthy and effective internal attack in hadoop systems; a smart topology construction method for anti-tracking network based on the neural network; a p2p computing based self-organizing network routing model; anonymity trilemma: strong anonymity, low bandwidth overhead, low latency-choose two. acknowledgments: we thank the anonymous reviewers for their insightful comments. this research was supported in part by the national natural science foundation of china under grants no. u1736218 and no. 61572496. yongzheng zhang is the corresponding author. key: cord-020885-f667icyt authors: sharma, ujjwal; rudinac, stevan; worring, marcel; demmers, joris; van dolen, willemijn title: semantic path-based learning for review volume prediction date: 2020-03-17 journal: advances in information retrieval doi: 10.1007/978-3-030-45439-5_54 sha: doc_id: 20885 cord_uid: f667icyt graphs offer a natural abstraction for modeling complex real-world systems where entities are represented as nodes and edges encode relations between them. in such networks, entities may share common or similar attributes and may be connected by paths through multiple attribute modalities.
in this work, we present an approach that uses semantically meaningful, bimodal random walks on real-world heterogeneous networks to extract correlations between nodes and bring together nodes with shared or similar attributes. an attention-based mechanism is used to combine multiple attribute-specific representations in a late-fusion setup. we focus on a real-world network formed by restaurants and their shared attributes and evaluate performance on predicting the number of reviews a restaurant receives, a strong proxy for popularity. our results demonstrate the rich expressiveness of such representations in predicting review volume and the ability of an attention-based model to selectively combine individual representations for maximum predictive power on the chosen downstream task. multimodal graphs have been extensively used in modeling real-world networks where entities interact and communicate with each other through multiple information pathways or modalities [1, 23, 31] . each modality encodes a distinct view of the relation between nodes. for example, within a social network, users can be connected by their shared preference for a similar product or by their presence in the same geographic locale. each of these semantic contexts links the same user set with a distinct edge set. such networks have been extensively used for applications like semantic proximity search in existing interaction networks [7] , augmenting semantic relations between entities [36] , learning interactions in an unsupervised fashion [3] and augmenting traditional matrix factorization-based collaborative filtering models for recommendation [27] . each modality within a multimodal network encodes a different semantic relation and exhibits a distinct view of the network. while such views contain relations between nodes based on interactions within a single modality, observed outcomes in the real world are often a complex combination of these interactions.
therefore, it is essential to compose these complementary interactions meaningfully to build a better representation of the real world. in this work, we examine a multimodal approach that attempts to model the review-generation process as the end-product of complex interactions within a restaurant network. restaurants share a host of attributes with each other, each of which may be treated as a modality. for example, they may share the same neighborhood, the same operating hours, similar kind of cuisine, or the same 'look and feel'. furthermore, each of these attributes only uncovers a specific type of relation. for example, a view that only uses the location-modality will contain venues only connected by their colocation in a common geographical unit and will prioritize physical proximity over any other attribute. broadly, each of these views is characterized by a semantic context and encodes modality-specific relations between restaurants. these views, although informative, are complementary and only record associations within the same modality. while each of these views encodes a part of the interactions within the network, performance on a downstream task relies on a suitable combination of views pertinent to the task [5] . in this work, we use metapaths as a semantic interface to specify which relations within a network may be relevant or meaningful and worth investigating. we generate bimodal low-dimensional embeddings for each of these metapaths. furthermore, we conjecture that their relevance on a downstream task varies with the nature of the task and that this task-specific modality relevance should be learned from data. in this work, -we propose a novel method that incorporates restaurants and their attributes into a multimodal graph and extracts multiple, bimodal low dimensional representations for restaurants based on available paths through shared visual, textual, geographical and categorical features. 
-we use an attention-based fusion mechanism for selectively combining representations extracted from multiple modalities. -we evaluate and contrast the performance of modality-specific representations and joint representations for predicting review volume. the principal challenge in working with multimodal data revolves around the task of extracting and assimilating information from multiple modalities to learn informative joint representations. in this section, we discuss prior work that leverages graph-based structures for extracting information from multiple modalities, focusing on the auto-captioning task that introduced such methods. we then examine prior work on network embeddings that aim to learn discriminative representations for nodes in a graph. graph-based learning techniques provide an elegant means for incorporating semantic similarities between multimedia documents. as such, they have been used for inference in large multimodal collections where a single modality may not carry sufficient information [2]. initial work in this domain was structured around the task of captioning unseen images using correlations learned over multiple modalities (tag-propagation or auto-tagging). pan et al. use a graph-based model to discover correlations between image features and text for automatic image-captioning [21]. urban et al. use an image-context graph consisting of captions, image features and images to retrieve relevant images for a textual query [32]. stathopoulos et al. [28] build upon [32] to learn a similarity measure over words based on their co-occurrence on the web and use these similarities to introduce links between similar caption words. rudinac et al. augment the image-context graph with users as an additional modality and deploy it for generating visual summaries of geographical regions [25]. since we are interested in discovering multimodal similarities between restaurants, we use a graph layout similar to the one proposed by pan et al.
[21] for the image auto-captioning task but replace images with restaurants as central nodes. other nodes containing textual features, visual features and users are retained. we also add categorical information like cuisines as a separate modality, allowing them to serve as semantic anchors within the representation. graph representation learning aims to learn mappings that embed graph nodes in a low-dimensional compressed representation. the objective is to learn embeddings where geometric relationships in the compressed embedding space reflect structural relationships in the graph. traditional approaches generate these embeddings by finding the leading eigenvectors from the affinity matrix for representing nodes [16, 24]. with the advent of deep learning, neural networks have become increasingly popular for learning such representations, jointly, from multiple modalities in an end-to-end pipeline [4, 11, 14, 30, 34]. existing random walk-based embedding methods are extensions of the random walks with restarts (rwr) paradigm. traditional rwr-based techniques compute an affinity between two nodes in a graph by ascertaining the steady-state transition probability between them. they have been extensively used for the aforementioned auto-captioning tasks [21, 25, 28, 32], tourism recommendation [15] and web search as an integral part of the pagerank algorithm [20]. deep learning-based approaches build upon the traditional paradigm by optimizing the co-occurrence statistics of nodes sampled from these walks. deepwalk [22] uses nodes sampled from short truncated random walks as phrases to optimize a skip-gram objective similar to word2vec [17]. similarly, node2vec augments this learning paradigm with second-order random walks parameterized by exploration parameters p and q, which control the trade-off between homophily and structural equivalence in the learnt representations [8].
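the deepwalk-style paradigm described above, sampling short truncated random walks and feeding node/context co-occurrences to a skip-gram objective, can be sketched in plain python. this is a minimal illustration, not code from any of the cited papers; the toy graph and function names are assumptions.

```python
import random

def truncated_random_walks(adj, walks_per_node=10, walk_length=5, seed=0):
    """generate deepwalk-style truncated random walks over an adjacency dict."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in adj:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adj[walk[-1]]
                if not neighbours:
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks

def context_pairs(walks, window=2):
    """(node, context) pairs whose probability a skip-gram objective would maximize."""
    pairs = []
    for walk in walks:
        for i, node in enumerate(walk):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if j != i:
                    pairs.append((node, walk[j]))
    return pairs

# toy undirected graph (hypothetical)
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = truncated_random_walks(adj)
pairs = context_pairs(walks)
```

the pairs would then be fed to a word2vec-style trainer; the walks play the role of sentences and the nodes the role of words.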
for a homogeneous network, random walk-based methods like deepwalk and node2vec assume that while the probabilities of transitioning from one node to another can be different, every transition still occurs between nodes of the same type. for heterogeneous graphs, this assumption may be fallacious, as not all transitions occur between nodes of the same type and, consequently, they do not carry the same semantic context. indeed, our initial experiments with the node2vec model suggest that it is not designed to handle highly multimodal graphs. clements et al. [5] demonstrated that in the context of content recommendation, the importance of modalities is strongly task-dependent and treating all edges in heterogeneous graphs as equivalent can discard this information. metapath2vec [6] remedies this by introducing unbiased walks over the network schema specified by a metapath [29], allowing the network to learn the semantics specified by the metapath rather than those imposed purely by the topology of the graph. metapath-based approaches have been extended to a variety of other problems. hu et al. use an exhaustive list of semantically-meaningful metapaths for extracting top-n recommendations with a neural co-attention network [10]. shi et al. use metapath-specific representations in a traditional matrix factorization-based collaborative filtering mechanism [27]. in this work, we perform random walks on sub-networks of a restaurant-attribute network containing restaurants and attribute modalities. these attribute modalities may contain images, text or categorical features. for each of these sub-networks, we perform random walks and use a variant of the heterogeneous skip-gram objective introduced in [6] to generate low-dimensional bimodal embeddings. bimodal embeddings have several interesting properties.
training relations between two modalities provides us with a degree of modularity: modalities can be included in or held out from the prediction model without affecting others. it also makes training inexpensive, as the number of nodes when only considering two modalities is far lower than in the entire graph. in this section, we begin by providing a formal introduction to graph terminology that is frequently referenced in this paper. we then move on to detail our proposed method, illustrated in fig. 1. formally, a heterogeneous graph is denoted by g = (v, e, φ, σ), where v and e denote the node and edge sets respectively. for every node and edge, there exist mapping functions φ(v) → a and σ(e) → r, where a and r are sets of node types and edge types respectively such that |a| + |r| > 2. for a heterogeneous graph g = (v, e, φ, σ), a network schema is a metagraph m g = (a, r), where a is the set of node types in v and r is the set of edge types in e. a network schema enumerates the possible node types and edge types that can occur within a network. a metapath m(a 1 , a n ) is a path on the network schema m g consisting of a sequence of ordered edge transitions a 1 → a 2 → · · · → a n . (fig. 1 caption: we use tripadvisor to collect information for restaurants in amsterdam. each venue characteristic is then embedded as a separate node within a multimodal graph. in the figure, r nodes denote restaurants, i nodes denote images for a restaurant, d nodes are review documents, a nodes are categorical attributes for restaurants and l nodes are locations. bimodal random walks are used to extract pairwise correlations between nodes in separate modalities, which are embedded using a heterogeneous skip-gram objective. finally, an attention-based fusion model is used to combine multiple embeddings together to regress the review volume for restaurants.) let g = (v, e) be the heterogeneous graph with a set of nodes v and edges e.
we assume the graph to be undirected as linkages between venues and their attributes are inherently symmetric. below, we describe the node types used to construct the graph (cf. figs. 1 and 2). we use the penultimate layer output as a compressed low-dimensional representation for the image. since the number of available images for each venue may vary dramatically depending on its popularity, adding a node for every image can lead to an unreasonably large graph. to mitigate this issue, we cluster image features for each restaurant using the k-means algorithm and use the cluster centers as representative image features for a restaurant, similar to zahálka et al. [35]. we chose k = 5 as a reasonable trade-off between the granularity of our representations and the tractability of generating embeddings for this modality. the way patrons write about a restaurant and the usage of specialized terms can contain important information about a restaurant that may be missing from its categorical attributes. for example, usage of the indian cottage cheese 'paneer' can be found in similar cuisine types like nepali, surinamese, etc., and user reviews talking about dishes containing 'paneer' can be leveraged to infer that indian and nepali cuisines share some degree of similarity. to model such effects, we collect reviews for every restaurant. since individual reviews may not provide a comprehensive unbiased picture of the restaurant, we chose not to treat them individually, but to consider them as a single document. we then use a distributed bag-of-words model from [13] to generate low-dimensional representations of these documents for each restaurant. since the reviews of a restaurant can widely vary based on its popularity, we only consider the 10 most recent reviews for each restaurant to prevent biases from document length getting into the model. 6. users: since tripadvisor does not record check-ins, we can only leverage explicit feedback from users who chose to leave a review.
we add a node for each of the users who visited at least two restaurants in amsterdam and left a review. similar to [25, 28, 32], we introduce two kinds of edges in our graph: 1. attribute edges: these are heterogeneous edges that connect a restaurant node to the nodes of its categorical attributes, image features, review features and users. in our graph, we instantiate them as undirected, unweighted edges. 2. similarity edges: these are homogeneous edges between the feature nodes within a single modality. for image features, we use a radial basis function as a non-linear transformation of the euclidean distances between image feature vectors. for document vectors, we use cosine similarity to find restaurants with similar reviews. adding a weighted similarity edge between every pair of nodes in the same modality would yield an extremely dense adjacency matrix. to avoid this, we only add similarity links between a node and its k nearest neighbors in each modality. by choosing the nearest k neighbors, we make our similarity threshold adaptive, allowing it to adjust to varying scales of distance in multiple modalities. metapaths can provide a modular and simple interface for injecting semantics into the network. since metapaths, in our case, are essentially paths over the modality set, they can be used to encode inter-modality correlations. in this work, we generate embeddings with two specific properties: 1. all metapaths are binary and only include transitions over 2 modalities. since venues/restaurants are always a part of the metapath, we only include one other modality. 2. during optimization, we only track the short-range context by choosing a small window size. window size is the maximum distance between the input node and a predicted node in a walk. in our model, walks over the metapath only capture short-range semantic contexts and the choice of a larger window can be detrimental to generalization.
for example, consider a random walk over the restaurant - cuisine - restaurant metapath. in the sampled nodes below, restaurants are in red while cuisines are in blue. optimizing over a large context window can lead to mcdonald's (fast-food cuisine) and kediri (indonesian cuisine) being placed close in the embedding space. this is erroneous and does not capture the intended semantics, which should bring restaurants closer only if they share the exact attribute. we use the metapaths in table 1 to perform unbiased random walks on the graph detailed in sect. 3.2. each of these metapaths enforces similarity based on certain semantics. we train separate embeddings using the heterogeneous skip-gram objective similar to [6]. for every metapath, we maximize the probability of observing the heterogeneous context n a (v) given the node v. in eq. (3), a m is the node type-set and v m is the node-set for metapath m: arg max θ Σ v∈v m Σ a∈a m Σ c a ∈n a (v) log p(c a |v; θ) (3). the original metapath2vec model [6] uses multiple metapaths [29] to learn separate embeddings, some of which perform better than the others. on the dblp bibliographic graph that consists of authors (a), papers (p) and venues (v), the performance of their recommended metapath 'a-p-v-p-a' was empirically better than the alternative metapath 'a-p-a' on the node classification task. (fig. 3 caption: attention-weighted modality fusion: metapath-specific embeddings are fed into a common attention mechanism that generates an attention vector. each modality is then reweighted with the attention vector and concatenated. this joint representation is then fed into a ridge regressor to predict the volume of ratings for each restaurant.) at this point, it is important to recall that in our model, each metapath extracts a separate view of the same graph. these views may contain complementary information and it may be disadvantageous to only retain the best performing view.
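a metapath-constrained walk of the kind described here, alternating between venue nodes and one attribute modality, can be sketched as follows. this is an illustrative sketch, not the paper's implementation; the bipartite adjacency dicts and names are assumptions.

```python
import random

def metapath_walk(start, venue_to_attr, attr_to_venue, length, rng):
    """unbiased walk alternating venue -> attribute -> venue (a bimodal metapath)."""
    walk, node, at_venue = [start], start, True
    while len(walk) < length:
        nbrs = venue_to_attr.get(node, []) if at_venue else attr_to_venue.get(node, [])
        if not nbrs:
            break
        node = rng.choice(nbrs)
        walk.append(node)
        at_venue = not at_venue
    return walk

# hypothetical venue-cuisine bipartite adjacency (names are illustrative)
venue_to_cuisine = {"r1": ["indian"], "r2": ["indian", "nepali"], "r3": ["nepali"]}
cuisine_to_venue = {"indian": ["r1", "r2"], "nepali": ["r2", "r3"]}
walk = metapath_walk("r1", venue_to_cuisine, cuisine_to_venue, 7, random.Random(1))
# even positions are venues, odd positions are cuisines
```

with a small skip-gram window over such walks, a venue's context is dominated by its direct attributes and the venues sharing them, which matches the short-range semantics argued for above.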
for an optimal representation, these complementary views should be fused. in this work, we employ an embedding-level attention mechanism similar to the attention mechanism introduced in [33] that selectively combines embeddings based on their performance on a downstream task. assuming s to be the set of metapath-specific embeddings for metapaths m 1 , m 2 , . . . , m n , following the approach outlined in fig. 3, we can denote it as s = {h m 1 , h m 2 , . . . , h m n }. we then use a two-layer neural network att(·) to learn an embedding-specific attention score a m i = att(h m i ) for metapath m i . further, we perform a softmax transformation of the attention network outputs to obtain an embedding-specific weight α m i = exp(a m i ) / Σ j exp(a m j ). finally, we concatenate the attention-weighted metapath-specific embeddings to generate a fused embedding h = α m 1 h m 1 ⊕ · · · ⊕ α m n h m n . we evaluate the performance of the embedding fusion model on the task of predicting the volume (total count) of reviews received by a restaurant. we conjecture that the volume of reviews is an unbiased proxy for the general popularity and footfall of a restaurant and is more reliable than indicators like ranking or ratings, which may be biased by tripadvisor's promotion algorithms. we use the review volume collected from tripadvisor as the target variable and model this task as a regression problem. data collection. we use publicly available data from tripadvisor for our experiments. to build the graph detailed in sect. 3.2, we collect data for 3,538 restaurants in amsterdam, the netherlands that are listed on tripadvisor. we additionally collect 168,483 user-contributed restaurant reviews made by 105,480 unique users, of which only 27,318 users visit more than 2 restaurants in the city. we only retain these 27,318 users in our graph and drop others. we also collect 215,544 user-contributed images for these restaurants. we construct the restaurant network by embedding venues and their attributes listed in table 1 as nodes. bimodal embeddings.
we train separate bimodal embeddings by optimizing the heterogeneous skip-gram objective from eq. (3) using stochastic gradient descent and train embeddings for all metapaths enumerated in table 1. we use restaurant nodes as root nodes for the unbiased random walks and perform 80 walks per root node, each with a walk length of 80. each embedding has a dimensionality of 48, uses a window size of 5 and is trained for 200 epochs. embedding fusion models. we chose two fusion models in our experiments to analyze the efficacy of our embeddings: 1. simple concatenation model: we use a model that performs a simple concatenation of the individual metapath-specific embeddings detailed in sect. 3.4 to exhibit the baseline performance on the tasks detailed in sect. 4. simple concatenation is a well-established additive fusion technique in multimodal deep learning [18, 19]. 2. attention-weighted model: the attention-based fusion model detailed above, which reweights each metapath-specific embedding with a learned attention weight before concatenation. each of the models uses a ridge regression algorithm to estimate the predictive power of each metapath-specific embedding on the volume regression task. this regressor is jointly trained with the attention model in the attention-weighted model. all models are optimized using stochastic gradient descent with the adam optimizer [12] with a learning rate of 0.1. in table 2, we report the results from our experiments on the review-volume prediction task. we observe that metapaths with nodes containing categorical attributes perform significantly better than vector-based features. in particular, categorical attributes like cuisines, facilities, and price have a significantly higher coefficient of determination (r^2) as compared to visual feature nodes. it is interesting to observe here that nodes like locations, images, and textual reviews are far more numerous than categorical nodes and part of their decreased performance may be explained by the fact that our method of short walks may not be sufficiently expressive when the number of feature nodes is large.
in addition, as mentioned in related work, we performed these experiments with the node2vec model, but since it is not designed for heterogeneous multimodal graphs, it yielded performance scores far below the weakest single modality. a review of the fusion models indicates that taking all the metapaths together can improve performance significantly. the baseline simple concatenation fusion model, commonly used in the literature, is considerably better than the best-performing metapath (venues - facilities - venues). the attention-based model improves significantly over the baseline performance and, while it employs a concatenation scheme similar to the baseline concatenation model, the introduction of the attention module allows it to handle noisy and unreliable modalities. the significant increase in the predictive ability of the attention-based model can be attributed to the fact that while all modalities encode information, some of them may be less informative or reliable than others, and therefore contribute less to the performance of the model. our proposed fusion approach is, therefore, capable of handling weak or noisy modalities appropriately. in this work, we propose an alternative, modular framework for learning from multimodal graphs. we use metapaths as a means to specify semantic relations between nodes and each of our bimodal embeddings captures similarities between restaurant nodes on a single attribute. our attention-based model combines separately learned bimodal embeddings using a late-fusion setup for predicting the review volume of the restaurants. while each of the modalities can predict the volume of reviews to a certain extent, a more comprehensive picture is only built by combining complementary information from multiple modalities. we demonstrate the benefits of our fusion approach on the review volume prediction task and demonstrate that a fusion of complementary views provides the best way to learn from such networks.
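the attention-weighted late fusion used in this approach, score each embedding, softmax the scores into weights, reweight, and concatenate, can be sketched in a few lines. this is a hedged illustration: the scoring lambda below stands in for the paper's learned two-layer attention network, and the toy embeddings are invented.

```python
import math

def attention_fuse(embeddings, score):
    """softmax over per-embedding attention scores, reweight, then concatenate."""
    scores = [score(e) for e in embeddings]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [x / z for x in exps]
    # scale each embedding by its weight and concatenate into one fused vector
    fused = [w * v for w, e in zip(weights, embeddings) for v in e]
    return fused, weights

# toy metapath-specific embeddings; the mean stands in for the attention network
embs = [[0.2, 0.4], [0.9, 0.1], [0.5, 0.5]]
fused, weights = attention_fuse(embs, score=lambda e: sum(e) / len(e))
```

in the actual model the weights come from a trained network and the fused vector feeds a ridge regressor, so uninformative modalities are down-weighted rather than discarded.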
in future work, we will investigate how the technique generalises to other tasks and domains.

references.
- mantis: system support for multimodal networks of in-situ sensors
- hyperlearn: a distributed approach for representation learning in datasets with many modalities
- interaction networks for learning about objects, relations and physics
- heterogeneous network embedding via deep architectures
- the task-dependent effect of tags and ratings on social media access
- metapath2vec: scalable representation learning for heterogeneous networks
- m-hin: complex embeddings for heterogeneous information networks via metagraphs
- node2vec: scalable feature learning for networks
- deep residual learning for image recognition
- leveraging meta-path based context for top-n recommendation with a neural co-attention model
- multimodal network embedding via attention based multi-view variational autoencoder
- adam: a method for stochastic optimization
- distributed representations of sentences and documents
- deep collaborative embedding for social image understanding
- how random walks can help tourism
- image labeling on a network: using social-network metadata for image classification
- distributed representations of words and phrases and their compositionality
- multimodal deep learning
- multi-source deep learning for human pose estimation
- the pagerank citation ranking: bringing order to the web
- gcap: graph-based automatic image captioning
- deepwalk: online learning of social representations
- the visual display of regulatory information and networks
- nonlinear dimensionality reduction by locally linear embedding
- generating visual summaries of geographic areas using community-contributed images
- imagenet large scale visual recognition challenge
- heterogeneous information network embedding for recommendation
- semantic relationships in multi-modal graphs for automatic image annotation
- pathsim: meta path-based top-k similarity search in heterogeneous information networks
- line: large-scale information network embedding
- study on optimal frequency design problem for multimodal network using probit-based user equilibrium assignment
- adaptive image retrieval using a graph model for semantic feature integration
- heterogeneous graph attention network
- network representation learning with rich text information
- interactive multimodal learning for venue recommendation
- metagraph2vec: complex semantic path augmented heterogeneous network embedding

key: cord-024499-14jlk5tv authors: balalau, oana; goyal, sagar title: subrank: subgraph embeddings via a subgraph proximity measure date: 2020-04-17 journal: advances in knowledge discovery and data mining doi: 10.1007/978-3-030-47426-3_38 sha: doc_id: 24499 cord_uid: 24499 representation learning for graph data has gained a lot of attention in recent years. however, state-of-the-art research is focused mostly on node embeddings, with little effort dedicated to the closely related task of computing subgraph embeddings. subgraph embeddings have many applications, such as community detection, cascade prediction, and question answering. in this work, we propose a subgraph to subgraph proximity measure as a building block for a subgraph embedding framework. experiments on real-world datasets show that our approach, subrank, outperforms state-of-the-art methods on several important data mining tasks. in recent years we have witnessed the success of graph representation learning in many tasks such as community detection [8, 19], link prediction [10, 20], graph classification [3], and cascade growth prediction [13]. a large body of work has focused on node embeddings, techniques that represent nodes as dense vectors that preserve the properties of nodes in the original graph [5, 9]. representation learning of larger structures has generally been associated with embedding collections of graphs [3]. paths, subgraph and community embeddings have received far less attention despite their importance in graphs.
in homogeneous graphs, subgraph embeddings have been used in community prediction [1, 8] and cascade growth prediction [6, 13]. in heterogeneous graphs, subgraph embeddings have tackled tasks such as semantic user search [14] and question answering [4]. nevertheless, the techniques proposed in the literature for computing subgraph embeddings have at least one of the following two drawbacks: (i) they are supervised techniques and, as such, depend on annotated data and do not generalize to other tasks; (ii) they can tackle only a specific type of subgraph. approach. in this work, we tackle the problem of computing subgraph embeddings in an unsupervised setting, where embeddings are trained for one task and will be tested on different tasks. we propose a subgraph embedding method based on a novel subgraph proximity measure. our measure is inspired by the random walk proximity measure personalized pagerank [11]. we show that our subgraph embeddings are comprehensive and achieve competitive performance on three important data mining tasks: community detection, link prediction, and cascade growth prediction. contributions. our salient contributions in this work are: • we define a novel subgraph to subgraph proximity measure; • we introduce a framework that learns comprehensive subgraph embeddings; • in a thorough experimental evaluation, we highlight the potential of our method on a variety of data mining tasks. node embeddings. methods for computing node embeddings aim to represent nodes as low-dimensional vectors that summarize properties of nodes, such as their neighborhood. the numerous embedding techniques differ in the computational model and in what properties of nodes are conserved. for example, in matrix factorization approaches, the goal is to perform dimension reduction on a matrix that encodes the pairwise proximity of nodes, where proximity is defined as adjacency [2], k-step transitions [7], or katz centrality [16].
random walk approaches have been inspired by the important progress achieved in the nlp community in computing word embeddings [15]. these techniques optimize node embeddings such that nodes co-occurring in short random walks in the graph have similar embeddings [10, 18]. another successful technique is to take as input a node and an embedding similarity distribution and minimize the kl-divergence between the two distributions [19, 20]. subgraph embeddings. a natural follow-up question is how to compute embeddings for larger structures in the graph, such as paths, arbitrary subgraphs, motifs or communities. in [1], the authors propose a method inspired by paragraphvector [12], where each subgraph is represented as a collection of random walks. subgraph and node embeddings are learned such that, given a subgraph and a random walk, we can predict the next node in the walk using the subgraph embedding and the node embeddings. the approach is tested on link prediction and on community detection, using ego-networks to represent nodes. in [13], the authors present an end-to-end neural framework that, given the cascade graph as input, predicts the future growth of the cascade for a given time period. a cascade graph is sampled for a set of random walks, which are given as input to a gated neural network to predict the future size of the cascade. [6] is similarly an end-to-end neural framework for cascade prediction, but based on the hawkes process. the method transforms the cascade into diffusion paths, where each path describes the process of information propagation within the observation time-frame. another very important type of subgraph is a community, and in [8] community embeddings are represented as multivariate gaussian distributions. graph embeddings. given a collection of graphs, a graph embedding technique will learn representations for each graph.
in [3], the authors propose an inductive framework for computing graph embeddings, based on training an attention network to predict a graph proximity measure, such as graph edit distance. graph embeddings are closely related to graph kernels, functions that measure the similarity between pairs of graphs [21]. graph kernels are used together with kernel methods such as svm to perform graph classification [22]. preliminaries. pagerank [17] is the stationary distribution of a random walk in which, at a given step, with a probability α, a surfer teleports to a random node and with probability 1 − α, moves along a randomly chosen outgoing edge of the current node. in personalized pagerank (ppr) [11], instead of teleporting to a random node with probability α, the surfer teleports to a randomly chosen node from a set of predefined seed nodes. let p r(u) be the pagerank of node u and p p r(u, v) be the pagerank score of node v personalized for seed node u. problem statement. given a directed graph g = (v, e), a set of subgraphs s 1 , s 2 , · · · , s k of g and an integer d, compute the d-dimensional embeddings of the subgraphs. we define a subgraph proximity measure inspired by personalized pagerank. let s i and s j be two subgraphs in a directed graph g. their proximity in the graph is: px(s i , s j ) = Σ v i ∈s i Σ v j ∈s j p r si (v i ) · p p r(v i , v j ) · p r sj (v j ) (1), where p r si (v i ) represents the pagerank of node v i in the subgraph s i , and p p r(v i , v j ) the pagerank score of node v j personalized for node v i in the graph g. when considering how to define proximity between subgraphs, our intuition is as follows: important nodes in subgraph s i should be close to important nodes in subgraph s j . this condition is fulfilled as pagerank will give high scores to important nodes in the subgraphs and personalized pagerank will give high scores to nodes that are "close" or "similar". we note that our measure is a similarity measure, hence subgraphs that are similar will receive a high proximity score.
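given precomputed pagerank and personalized pagerank values, the proximity in eq. 1 reduces to a triple product summed over node pairs. a toy sketch follows; the dict-based data layout and the numeric values are illustrative assumptions, not from the paper.

```python
def subgraph_proximity(pr_i, pr_j, ppr):
    """px(s_i, s_j): sum over node pairs of
    pr_{s_i}(v_i) * ppr(v_i, v_j) * pr_{s_j}(v_j).
    pr_i, pr_j: pagerank within each subgraph (node -> score);
    ppr[v_i][v_j]: pagerank of v_j personalized for seed v_i in the full graph."""
    return sum(
        p_i * ppr.get(vi, {}).get(vj, 0.0) * p_j
        for vi, p_i in pr_i.items()
        for vj, p_j in pr_j.items()
    )

# toy example (values are illustrative)
pr_s1 = {"a": 1.0}
pr_s2 = {"b": 0.5, "c": 0.5}
ppr = {"a": {"b": 0.2, "c": 0.4}}
px = subgraph_proximity(pr_s1, pr_s2, ppr)  # 1.0*0.2*0.5 + 1.0*0.4*0.5 = 0.3
```

note the asymmetry: px(s_i, s_j) walks from s_i toward s_j, so it need not equal px(s_j, s_i).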
we choose the term proximity to emphasize that our measure relates to nearness in the graph, as it is computed using random walks. we can interpret eq. 1 using random walks, as follows: alice is a random surfer in the subgraph s i , bob is a random surfer in the subgraph s j , and carol is a random surfer in graph g. alice decides to send a message to bob via carol. carol starts from the current node alice is visiting (p r si (v i )) and she will reach a node v j ∈ s j with probability p p r(v i , v j ). bob will be there to receive the message with probability p r sj (v j ). normalized proximity. given a collection of subgraphs s = {s 1 , s 2 , · · · s k }, we normalize the proximity px(s i , s j ), ∀j ∈ {1, . . . , k}, such that it can be interpreted as a probability distribution. the normalized proximity for a subgraph s i is: npx(s i , s j ) = px(s i , s j ) / Σ l px(s i , s l ) (2). rank of a subgraph. similarly to pagerank, our proximity can inform us of the importance of a subgraph. the normalized proximity given a collection of subgraphs s 1 , s 2 , · · · s k can be expressed as a stochastic matrix, where each row i encodes the normalized proximity given subgraph s i . the importance of subgraph s i can be computed by summing up the elements of column i. sampling according to the proximity measure. given an input subgraph s i , we present a procedure for efficiently sampling px(s i , ·) introduced in eq. 1. we suppose that all the pagerank vectors of the subgraphs {s 1 , s 2 , · · · s k } have been precomputed. we first select a node n i in s i according to distribution p r si . secondly, we start a random walk from n i in the graph g and we select n j , the last node in the walk before the teleportation. lastly, node n j may belong to several subgraphs s j 1 , s j 2 · · · . we return a subgraph s j according to the normalized distribution p r s j 1 (n j ), p r s j 2 (n j ), · · · . the procedure doesn't require computing the personalized pagerank vectors, which saves us o(n^2) space.
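the three-step sampling procedure above can be sketched directly: it never materializes a personalized pagerank vector, only per-subgraph pagerank vectors and walks in the full graph. the data layout (dicts for pagerank and membership) and parameter names are assumptions for illustration.

```python
import random

def sample_proximal_subgraph(s_i, pr, membership, graph, alpha=0.15, rng=None):
    """sample a subgraph approximately proportional to px(s_i, .).
    pr[s]: dict node -> pagerank of the node within subgraph s (precomputed);
    membership[v]: list of subgraphs containing node v;
    graph: adjacency dict of the full graph g; alpha: teleportation probability."""
    rng = rng or random.Random()
    # step 1: pick a start node in s_i according to pr_{s_i}
    nodes, weights = zip(*pr[s_i].items())
    node = rng.choices(nodes, weights=weights)[0]
    # step 2: random walk in g; keep the last node visited before teleportation
    while rng.random() > alpha and graph.get(node):
        node = rng.choice(graph[node])
    # step 3: among subgraphs containing that node, pick one proportionally
    # to the node's pagerank inside each candidate subgraph
    cands = membership[node]
    return rng.choices(cands, weights=[pr[s][node] for s in cands])[0]

# toy usage (structures are illustrative)
pr = {"s1": {"a": 0.6, "b": 0.4}, "s2": {"b": 1.0}}
membership = {"a": ["s1"], "b": ["s1", "s2"]}
graph = {"a": ["b"], "b": ["a"]}
s_j = sample_proximal_subgraph("s1", pr, membership, graph, rng=random.Random(7))
```

repeated calls yield samples from px(s i , ·), which is all the skip-gram-style training below needs, avoiding the o(n^2) storage of the full measure.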
we shall use this procedure for computing embeddings, thus avoiding computing and storing the full proximity measure px. given a graph g = (v, e) and a set of subgraphs of g, s = {s_1, s_2, ..., s_k}, we learn their representations as dense vectors, i.e. as embeddings. we extend the framework in [20], proposed for computing node embeddings, to an approach for subgraph embeddings. in [20], the authors propose to learn node embeddings such that the embeddings preserve an input similarity distribution between nodes. the similarities of a node v to any other node in the graph are represented by the similarity distribution sim_g, where Σ_{w ∈ v} sim_g(v, w) = 1. the corresponding embedding similarity distribution is sim_e. the optimization function of the learning algorithm minimizes the kullback-leibler (kl) divergence between the two proximity distributions: Σ_{v ∈ v} kl(sim_g(v, ·) || sim_e(v, ·)). the authors propose several options for instantiating sim_g, such as personalized pagerank and adjacency similarity. the similarity between embeddings, sim_e, is the normalized dot product of the vectors. in order to adapt this approach to our case, we define the subgraph-to-subgraph proximity sim_g to be the normalized proximity presented in eq. 2. the embedding similarity sim_e is computed in the same manner, and the optimization function now minimizes the divergence between distributions defined on our input subgraphs, i.e. sim_g, sim_e : s × s → [0, 1]. in our experimental evaluation we use this method, which we refer to as subrank. we note that sim_g will not be fully computed, but approximated using the sampling procedure presented in sect. 3.1. proximity of ego-networks. two very important tasks in graph mining are community detection and link prediction. suppose alice is a computer scientist and she joins twitter. she starts following the updates of andrew ng, but also the updates of her friends, diana and john.
bob is also a computer scientist on twitter, and he follows andrew ng, jure leskovec and his friend julia. as shown in fig. 1, there is no path in the directed graph between alice and bob. a path-based similarity measure between nodes alice and bob, such as personalized pagerank, will return a similarity of 0, while it will return high values between alice and andrew ng and between bob and andrew ng. an optimization algorithm for computing node embeddings will have to address this trade-off, with a potential loss in the quality of the representations. thus, we might miss that both alice and bob are computer scientists. to address this issue, we capture the information stored in the neighbors of the nodes by considering ego-networks. therefore, in our work, we represent a node v as its ego network of size k (the nodes reachable from v in k steps). in sect. 4, we perform quantitative analysis to validate our intuition. proximity of cascade subgraphs. in a graph, an information cascade can be modeled as a directed tree, where the root represents the original content creator, and the remaining nodes represent the content reshares. when considering the task of predicting the future size of the cascade, the nodes already in the cascade are important, as it is very likely that their neighbors will be affected by the information propagation. however, nodes that have reshared the information more recently are more visible to their neighbors. when running pagerank on a directed tree, we observe that nodes on the same level have the same score, and that the score of nodes increases with the depth. hence, two cascade trees will have a high proximity score px if nodes that have joined the cascades later (i.e. are on lower levels in the trees) are "close" or "similar" according to personalized pagerank. in sect. 5, we perform quantitative analysis and show that our approach gives better results than a method that gives equal importance to all nodes in the cascade. datasets.
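a minimal sketch of the ego-network extraction used above (the nodes reachable from v in at most k steps, via bfs), with the alice/bob situation of fig. 1 as toy data; the adjacency-dict representation is our own choice.

```python
from collections import deque

def ego_network(adj, v, k):
    """return the set of nodes reachable from v in at most k steps,
    including v itself; adj maps a node to its out-neighbors."""
    seen = {v}
    frontier = deque([(v, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for w in adj.get(node, []):
            if w not in seen:
                seen.add(w)
                frontier.append((w, depth + 1))
    return seen
```

on the toy data, the ego networks of alice and bob overlap in andrew ng even though no path connects the two users, which is exactly the signal the node-level measure misses.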
we perform experiments on five real-world graphs, described below. we report their characteristics in table 1. • citeseer 1 is a citation network created from the citeseer digital library. nodes are publications and edges denote citations. the node labels represent fields in computer science. • cora (see footnote 1) is also a citation network, and the node labels represent subfields in machine learning. • polblogs 2 is a directed network of hyperlinks between political blogs discussing us politics. the labels correspond to republican and democrat blogs. competitors. we evaluate our method, subrank, against several state-of-the-art methods for node and subgraph embedding computation. for each method, we used the code provided by the authors. we compare with: • deepwalk [18] learns node embeddings by sampling random walks and then applying the skipgram model. the parameters are set to the recommended values, i.e. walk length t = 80, γ = 80, and window size w = 10. • node2vec [10] is a hyperparameter-supervised approach that extends deepwalk. we fine-tuned the hyperparameters p and q on each dataset and task. in addition, r = 10, l = 80, k = 10, and the optimization is run for one epoch. • line [19] proposes two proximity measures for computing two d-dimensional vectors for each node. in our experiments, we use the second-order proximity, as it can be used for both directed and undirected graphs. we run experiments with t = 1000 samples and s = 5 negative samples, as described in the paper. • verse [20] learns node embeddings that preserve the proximity of nodes in the graph. we use personalized pagerank as the proximity measure, the default option proposed in the paper. we run the learning algorithm for 10^5 iterations. • verseavg is an adaptation of verse, in which the embedding of a node is the average of the verse embeddings of the nodes in its ego network.
• sub2vec [1] computes subgraph embeddings; for the experimental evaluation, we compute the embeddings of the ego networks. following the guidelines of the authors, for cora, citeseer and polblogs we select ego networks of size 2, and for the denser networks cithep and dblp, ego networks of size 1. for the first four methods, node embeddings are used to represent nodes. for sub2vec, subrank and verseavg, the ego network embedding is the node representation. the embeddings are used as node features for community detection and link prediction. we compute 128-dimensional embeddings. parameter setting for subrank. we represent each node by its ego network of size 1. we run the learning algorithm for 10^5 iterations. our code is public. we assess the quality of the embeddings in terms of their ability to capture communities in a graph. for this, we use the k-means algorithm to cluster the nodes embedded in the d-dimensional space. in table 2 we report the normalized mutual information (nmi) with respect to the original label distribution. on polblogs, subrank has a low nmi, while on citeseer and cora it outperforms the other methods. on dblp it has a performance comparable to verse. node classification is the task of predicting the correct node labels in a graph. for each dataset, we try several configurations by varying the percentage of nodes used in training. we evaluate the methods using the micro- and macro-f1 scores, and we report the micro-f1, as both measures present similar trends. the results are presented in table 3. on citeseer and cora, subrank significantly outperforms the other methods. on polblogs, subrank performs similarly to the other baselines, even though the embeddings achieved a low nmi score. on dblp, subrank is the second best method. to create training data for link prediction, we randomly remove 10% of edges, ensuring that each node retains at least one neighbor.
this set represents the ground truth in the test set, while we take the remaining graph as the training set. in addition, we randomly sample an equal number of node pairs that have no edge connecting them as negative samples in our test set. we then learn embeddings on the graph without the 10% of edges. next, for each edge (u, v) in the training or the test set, we obtain the edge features by computing the hadamard product of the embeddings of u and v. the hadamard product has shown better performance than other operators for this task [10, 20]. we report the accuracy of the link prediction task in table 4. our method achieves the best performance on 4 out of 5 datasets. given as input: (i) a social network g = (v, e), captured at a time t_0; (ii) a set of information cascades c that appear in g after the timestamp t_0 and that are captured at duration t_1 after their creation; (iii) a time window t_2; our goal is to predict the growth of a cascade, i.e. the number of new nodes a cascade acquires, at time t_1 + t_2 from its creation. note that, given a cascade c = (v_c, e_c) ∈ c, we know that the nodes v_c are present in v; however, c can contain new edges not present in e. datasets. we select for evaluation two datasets from the literature: • aminer [13] represents cascades of scientific citations. we use the simplified version made available by the authors 5. the dataset contains a global citation graph and the cascade graphs. a node in a graph represents an author, and an edge from a_1 to a_2 represents a citation of a_2 in an article of a_1. a cascade shows all the citations of a given paper. competitors. we compare subrank with the following state-of-the-art methods for the task of predicting the future size of cascades: • deepcas [13] is an end-to-end neural network framework that, given the cascade graph as input, predicts the future growth of the cascade for a given period.
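the edge-feature construction for link prediction described above can be sketched as follows; the embedding values in the test and all helper names are illustrative only, not the paper's code.

```python
import random

def hadamard(emb_u, emb_v):
    """element-wise (hadamard) product of two embedding vectors."""
    return [a * b for a, b in zip(emb_u, emb_v)]

def build_examples(pos_edges, nodes, embeddings, rng=random):
    """positive examples are the held-out edges; an equal number of
    negatives are sampled among non-edges; each pair is represented
    by the hadamard product of its endpoint embeddings."""
    examples = [(hadamard(embeddings[u], embeddings[v]), 1)
                for u, v in pos_edges]
    edge_set = set(pos_edges)
    negatives = 0
    while negatives < len(pos_edges):
        u, v = rng.choice(nodes), rng.choice(nodes)
        if u != v and (u, v) not in edge_set:
            examples.append((hadamard(embeddings[u], embeddings[v]), 0))
            negatives += 1
    return examples
```

the resulting feature/label pairs can then be fed to any binary classifier to score candidate links.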
the parameters are set to the values specified in the paper: k = 200, t = 10, mini-batch size 5 and α = 0.01. • deephawkes [6] is similarly an end-to-end deep learning framework for cascade prediction, based on the hawkes process. we set the parameters to the defaults given by the authors: the learning rate for user embeddings is 5 × 10^-4 and the learning rate for the other variables is 5 × 10^-3. • in addition, we consider the node embedding method verse [20], one of the top-performing baselines in the previous section. the node embeddings are learned on the original graph, and a cascade is represented as the average of the embeddings of the nodes it contains. we then train a multi-layer perceptron (mlp) regressor to predict the growth of the cascade. parameter setting for subrank. we recall that our subgraph proximity measure requires the computation of the ppr of nodes in the graph and the pr of nodes in the subgraphs. for this task, we consider the ppr of nodes in the global graph and the pr of nodes in the cascades. we obtain the cascade embeddings, which are then used to train an mlp regressor. for both verse and subrank, we perform a grid search for the optimal parameters of the regressor. we report the mean squared error (mse) on the logarithm of the cascade growth value, as done in previous work on cascade prediction [6, 13], in table 5. we observe that subrank outperforms verse, thus corroborating our intuition that nodes appearing later in a cascade should be given more importance. the best mse overall is obtained by the end-to-end framework deephawkes, which is expected, as the method is tailored for the task. we note, however, that subrank achieves the best results on aminer. in this work, we introduce a new measure of proximity for subgraphs and a framework for computing subgraph embeddings. in a departure from previous work, we focus on general-purpose embeddings, and we shed light on why our method is suited for several data mining tasks.
our experimental evaluation shows that the subgraph embeddings achieve competitive performance on three downstream applications: community detection, link prediction, and cascade prediction.

references

[1] sub2vec: feature learning for subgraphs
[2] distributed large-scale natural graph factorization
[3] unsupervised inductive graph-level representation learning via graph-graph proximity
[4] question answering with subgraph embeddings
[5] a comprehensive survey of graph embedding: problems, techniques, and applications
[6] deephawkes: bridging the gap between prediction and understanding of information cascades
[7] grarep: learning graph representations with global structural information
[8] learning community embedding with community detection and node embedding on graphs
[9] graph embedding techniques, applications, and performance: a survey
[10] node2vec: scalable feature learning for networks
[11] topic-sensitive pagerank: a context-sensitive ranking algorithm for web search
[12] distributed representations of sentences and documents
[13] deepcas: an end-to-end predictor of information cascades
[14] subgraph-augmented path embedding for semantic user search on heterogeneous social network
[15] distributed representations of words and phrases and their compositionality
[16] asymmetric transitivity preserving graph embedding
[17] the pagerank citation ranking: bringing order to the web
[18] deepwalk: online learning of social representations
[19] line: large-scale information network embedding
[20] verse: versatile graph embeddings from similarity measures
[21] graph kernels
[22] retgk: graph kernels based on return probabilities of random walks

key: cord-306727-2c1m04je
authors: pandey, prateek; litoriya, ratnesh
title: promoting trustless computation through blockchain technology
date: 2020-05-20
journal: natl acad sci lett
doi: 10.1007/s40009-020-00978-0
sha: doc_id: 306727 cord_uid: 306727-2c1m04je

records, irrespective of their nature (whether electronic or paper-based), are vulnerable to fraud.
people's hard-earned money, their personal information, identity, and health are at a higher risk than ever due to the misuse of technology for forgery. however, technology can also be used as an answer to the fraudulence prevalent in affairs from every walk of life. this short paper presents blockchain technology as a solution to overcome the menace of forgery by promoting trustless computing in business transactions. the paper explains blockchain technology and a variety of its implementations through five different use cases in the fields of drug supply chains, health insurance, land record management, courier services, and immigration records. the immigration blockchain is also proposed as a solution to effectively check pandemics like the coronavirus (covid-19). the implementation of the blockchain is performed using a locally built platform based on ibm's hyperledger fabric, and the ethereum public platform. the results are encouraging enough to substitute existing business operations with blockchain-based solutions. blockchain, put simply, can be described as a linked-list data structure where each node (or a computer) is addressed using a hash function of the current and all the historical nodes, arranged chronologically [1]. as a consequence of this arrangement, if any node, intermediate or terminal, is tampered with, the decentralized network formed by replicating the blockchain will not accept it, by virtue of a democratic mechanism called consensus. therefore, the data residing inside the blockchain are tamper-proof. although the cost of maintaining the blockchain network is high, it is proving to be a promising area in which organizations and nations are investing heavily [2]. the most significant selling point of blockchain technology is that it does not require the participants of the network to trust each other; thus, the need for a transaction arbitrator becomes unwarranted.
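the tamper-evidence property described above can be illustrated with a toy hash chain: each block's hash covers its payload and the previous hash, so editing any historical block invalidates every later link. this is a didactic sketch of ours, not the paper's implementation.

```python
import hashlib
import json

def block_hash(payload, prev_hash):
    """hash of a block's payload chained to the previous block's hash."""
    return hashlib.sha256(json.dumps([payload, prev_hash]).encode()).hexdigest()

def build_chain(payloads):
    """build a chain of blocks, each linking to its predecessor's hash."""
    chain, prev = [], "0" * 64  # genesis predecessor
    for p in payloads:
        h = block_hash(p, prev)
        chain.append({"payload": p, "prev": prev, "hash": h})
        prev = h
    return chain

def verify(chain):
    """recompute every hash; any tampered block breaks verification."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(block["payload"], prev):
            return False
        prev = block["hash"]
    return True
```

in a real network the consensus mechanism additionally ensures that the honest replicas reject a chain whose hashes do not verify.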
therefore, virtually every use case, from money transactions to mushroom supply chains, and from agricultural commodity marketplaces to electronic health records (ehr), is a potential client for blockchain technology. blockchains are broadly categorized in two ways: public blockchains and private blockchains. a public blockchain is characterized by the fact that anybody can actively participate in the network. a private blockchain, on the other hand, is a restricted network where all the active participants are authorized by some governing authority to make transactions. apart from permission, consensus is another vital parameter on which blockchains can be differentiated. in the case of public blockchains, consensus is achieved using time-consuming and computationally intensive algorithms like proof-of-work (pow). pow is used as a disincentive for attackers and thus provides security to the network [3]. due to the high cost of computing power involved in pow, the industry is looking towards other consensus approaches like proof-of-stake (pos). private blockchains, on the other hand, are selective in granting permission to participate, and resort to a mechanism called selective endorsement, in which some predetermined endorsers are made responsible for validating or invalidating a transaction. performance-wise, private blockchains have higher transaction throughput than public blockchains, which need more time and computational power to approve a transaction owing to pow. in this article, the authors propose the use of blockchain in five different use cases, namely the drug supply chain, health insurance, land record management, courier services, and immigration records.
before setting up a blockchain network, it is essential to ascertain the type of blockchain, which varies from use case to use case; therefore, a discussion of the problems and solution objectives for the use cases is warranted to determine the blockchain type, and is presented in the subsequent paragraphs. the drug supply chain is vulnerable to encroachment by fraudsters who introduce counterfeit medicines through various access points. medical store owners can purchase fake drugs from unauthorized dealers and make a handsome profit. another related problem is the reselling of already sold medicine. this happens when a patient is admitted to the hospital and the caretakers are told to purchase unnecessary medication [4]; later, those medicines are smuggled back to the drug store on the hospital premises by the staff. consider a blockchain-based solution where each drug sachet bears a qr code that can be scanned through an authorized mobile app. any customer may check the authenticity of the drug he or she purchases, as the provenance of the drug is traceable. also, once a drug is sold against a prescription, the sale is recorded on a public ledger, making it impossible to resell it with a bill. the insurance industry in india suffers annual losses of around 100 billion inr owing to fake claims [5]. generally, insurance frauds are committed in two ways: (1) people get admitted to hospitals for at least 24 h to claim insurance, even though they are not actually seriously ill; (2) people hide their old diseases and past treatments at the time of buying new health policies, in order to avoid the increased premium cost. if a nation-wide blockchain were prepared to register births, vaccinations, medical treatments, and even death registrations, hiding pre-existing diseases and treatments would become difficult.
also, the quality of treatment would be improved, as the treating physician would know the historical health-related details of the patient at a mouse click. unlike developed countries, india presents a strong case for blockchain-based land record keeping. civil courts and newspapers are filled with cases where perpetrators have forged records and robbed owners of their rightful properties [6]. keeping land records on a blockchain not only secures the data from tampering but can also help establish land provenance at the time of selling or purchasing an immovable asset. with the growing availability and popularity of the internet, the b2c online market is also on the rise. this rise in online business also provides a boost to supporting services such as couriers. though delivering a consignment late causes a bad user experience, getting a rock in the box while you are expecting a mobile phone is a shock [7]. india has been witnessing such incidents of fraudulent delivery since the inception of the e-commerce market. on complaining, both the courier partner and the seller deny any involvement in the forgery, and often the customer or the online business facilitator has to pay for it. keeping the consignment handovers on a blockchain and using x-ray imagery smartly can deter the forgery to a great extent and thus help restore customers' trust in e-commerce activities. nowadays, the exponentially spreading covid-19 (coronavirus) pandemic presents a potential use case for recording a trail of an individual's immigration data [8]. various south asian countries allowed immigrants who had a recent travel history to china, south korea, and badly hit european countries to enter without screening. had any reliable and handy way to verify visitors' travel trails been available, it would have been an effective measure to check the diffusion of the virus.
the above-discussed problems can also be solved using a centralized system approach, but centralized systems are vulnerable to attack; therefore, a decentralized approach is suitable. also, the issue of trust among stakeholders advocates the use of blockchain in a decentralized system. whether a particular use case should be implemented as a public or a private blockchain is a crucial decision to make. immigration information is a kind of personal record, which is vulnerable to breach of privacy if it falls into the wrong hands; thus, the immigration blockchain should be made privately available to only a few identified government authorities, apart, of course, from the immigrant. courier tracking is mostly an internal affair of the logistics company involved and does not have to include other parties validating the consignment at every handover stage; thus, keeping the courier management use case on a private blockchain ought to serve the purpose. the land records application, on the other hand, is made for the public in general, and therefore uses a public blockchain platform. the drug supply and health insurance use cases ought to be built around a private blockchain owing to the selective participation of the stakeholders. the blockchain-based pilot solutions for each use case are implemented on the ethereum platform and on a locally built architecture based on ibm's hyperledger fabric [9]. ethereum is an excellent platform for public blockchains, and hyperledger fabric is generally used for private blockchains. therefore, the drug supply, health insurance, and land records blockchains are implemented over the ethereum network, whereas the immigration and courier blockchains are implemented on hyperledger fabric. for the ethereum setup, the rinkeby public network is used for deploying the smart contracts using the infura api. for locally testing the contracts, the ganache server is used.
all the contract deployment and interaction code is written in node.js, and communication with the ethereum network is made using the web3 module. the locally built hyperledger fabric-based blockchain architecture uses four types of nodes, called clients, endorsers, orderers, and committers (fig. 1). the architecture is decentralized because the same blockchain is replicated over multiple nodes, which participate in computing. a replica of the blockchain is available at the endorser nodes and committing nodes; however, no blockchain is maintained on client nodes, whose sole purpose is to request transactions. endorsers receive transaction requests from clients and either endorse or reject those requests. the ordering service collects these transactions, assigns them to their respective blocks, and sends them to the committers and endorsers. three endorsers (e_0, e_1, and e_2), two committer nodes, and five ordering nodes are used in this experiment. while the rinkeby public network uses pow as consensus, the locally built architecture uses pbft [10] and no-op (no consensus needed) approaches. ethereum works on the concept of smart contracts, which are nothing but computer programs that execute on the network when certain criteria are met. a typical operation that takes place in a land asset transfer is the transfer of entitlement from the landowner to the buyer. code snippet 1 shows this sample contract, written in the solidity programming language for ethereum. the landtransfer contract can be stated as follows: 1. a seller has to pay 100 ether to enter into the contract, whereas the buyer just pays the buying price; 2. registry charges comprise 100 ether plus 5% of the total selling price as commission; 3. only the registrar can perform the transaction. for implementing the private blockchain on the locally built system, we take the example of an immigration application. since this is a private network, only those who are permitted can participate in or view the network information.
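since code snippet 1 itself is not reproduced here, the three landtransfer rules can be rendered as a hypothetical python sketch (amounts in ether; the function name, signature, and payout split are our own illustration, not the paper's solidity code):

```python
def land_transfer(caller, seller_deposit, buyer_payment, selling_price):
    """enforce the three landtransfer rules stated above (amounts in ether)."""
    # rule 3: only the registrar can perform the transaction
    if caller != "registrar":
        raise PermissionError("only the registrar can perform the transaction")
    # rule 1: seller deposits 100 ether; buyer pays exactly the buying price
    if seller_deposit != 100:
        raise ValueError("seller must deposit 100 ether to enter the contract")
    if buyer_payment != selling_price:
        raise ValueError("buyer must pay exactly the selling price")
    # rule 2: registry keeps the 100 ether plus a 5% commission on the price
    commission = 0.05 * selling_price
    return {"registry": 100 + commission, "seller": selling_price - commission}
```

in the actual solidity contract these checks would typically be `require` statements guarding a payable function.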
the authentication and authority agency (aaa) performs this operation of granting authority. we assume that immigration checkpoints are located at all international airports, seaports, and roadway borders. whenever an immigrant enters, the entry is recorded on a public ledger used by all the participating countries. every such entry is a new transaction requested by a client node. this request goes to the endorsing nodes (see fig. 1), where smart contracts are executed, and the request is then sent back to the client node with an endorsement. up to this point, no ledger update takes place. the client then sends the endorsed request to the ordering nodes, where the pbft or no-op algorithm assigns the transaction to an existing block or a new block. afterward, the news of the newly created block is broadcast to the adjoining peers (committing nodes and endorsers), and all the nodes update their blockchains accordingly; thus, a consistent ledger is maintained at every participating node. the endorsing nodes and committing nodes maintain their own copies of the blockchain, which, of course, have to be kept consistent by the consensus algorithms running on the network. the test network was laid out using five computer systems (nodes) reserved for the ordering service. three nodes were designated as endorsing nodes, and two nodes were given the task of committing the transactions to the ledger. each block was fixed at 20 kb in size and can contain roughly ten transactions. a time limit of 5 min was also assigned to each forming block; otherwise, a block might have to wait too long to commit itself to the blockchain if no new transactions are coming in.
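the block-cutting policy described above (seal a block once it holds ten transactions, or once 5 minutes have elapsed since its first pending transaction) can be sketched as follows; the function name and the (timestamp, payload) layout are assumptions of ours.

```python
def cut_blocks(transactions, max_tx=10, timeout=300):
    """transactions: list of (timestamp_seconds, payload), in time order;
    returns the list of sealed blocks (each a list of payloads)."""
    blocks, pending, opened_at = [], [], None
    for ts, payload in transactions:
        # seal the pending block if it has been open longer than the timeout
        if pending and ts - opened_at >= timeout:
            blocks.append(pending)
            pending, opened_at = [], None
        if not pending:
            opened_at = ts
        pending.append(payload)
        # seal the block as soon as it reaches the transaction limit
        if len(pending) == max_tx:
            blocks.append(pending)
            pending, opened_at = [], None
    if pending:
        blocks.append(pending)
    return blocks
```

with a steady stream of transactions the size limit dominates; with sparse traffic the 5-minute timeout prevents a lone transaction from waiting indefinitely.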
the outcome of the discussed exercise is that, whenever a foreign visitor or a resident citizen enters the national territory, the immigration trail of the said person can be verified and, if required, the person can be quarantined then and there, without being given a chance to transmit the disease to others. while implementing the private blockchains for immigration records and courier tracking, we observed the performance of the built network in terms of throughput, i.e. the number of transactions written to the ledger per second. throughput is a concern because consensus algorithms like pow or pbft take time to verify a transaction. we created 15 different configurations of nodes and tested the throughput of the immigration records and courier tracking implementations on each configuration. the results are tabulated in table 1 and plotted in figs. 2, 3, 4 and 5. from table 1 and the performance graphs, it can be concluded that the network throughput is mainly influenced by the ordering nodes and by the choice of consensus algorithm involved. the performance of the public blockchains we created is not reported because the pilot applications were deployed on the rinkeby public blockchain network, and the typical network latency for rinkeby is 15-30 s. in a nutshell, this article presented a holistic approach to answering the fraudulence prevalent in a number of businesses, public or privately owned. the proposed solutions are all based on using blockchain as an underlying platform to perform transactions, thus providing security and transparency to counteract encroachments of all forms, be it in land, drugs, insurance claims, courier delivery, or, the most deadly of all, covid-19. the authors presented the performance observed after the implementation of the said applications and noticed that throughput is a significant concern in blockchain implementations.
in the case of private blockchains, the number of validating nodes (ordering nodes) plays an important part in deciding the performance of the network. in this testing period, when the entire world is severely affected by the coronavirus, the suggested use case of the immigration blockchain is particularly relevant.

references

[1] public versus private: what to know before getting started with blockchain
[2] the blockchain arms race: america vs. china. the national interest
[3] proof of work (pow) consensus
[4] white coated corruption
[5] implementing healthcare services on a large scale: challenges and remedies based on blockchain technology
[6] bigger than vyapam, madhya pradesh land record 'scam' cry in digital drive. the times of india
[7] online shopping goes bizarre: customer buys smartphone on amazon, gets a stone delivered instead
[8] the ministry of home affairs (2020) advisory: travel and visa restrictions related to covid-19. bureau of immigration
[9] hyperledger fabric: the flexible blockchain framework that's changing the business world
[10] dynamic practical byzantine fault tolerance

key: cord-026306-mkmrninv
authors: lepskiy, alexander; meshcheryakova, natalia
title: belief functions for the importance assessment in multiplex networks
date: 2020-05-15
journal: information processing and management of uncertainty in knowledge-based systems
doi: 10.1007/978-3-030-50143-3_22
sha: doc_id: 26306 cord_uid: mkmrninv

we apply dempster-shafer theory in order to reveal important elements in undirected weighted networks. we estimate the cooperation of each node with different groups of vertices that surround it via the construction of belief functions. the obtained intensities of cooperation are further redistributed over all elements of a particular group of nodes, which results in pignistic probabilities of node-to-node interactions. finally, pairwise interactions can be aggregated into a centrality vector that ranks nodes with respect to the derived values.
in this type of network, nodes can be connected with each other differently on several levels of interaction. various combination rules help to analyze such systems as a single entity, which has many advantages in the study of complex systems. in particular, dempster's rule takes into account the inconsistency in the initial data, which has an impact on the final centrality ranking. we also provide a numerical example that illustrates the distinctive features of the proposed model. additionally, we establish analytical relations between the proposed measure and classical centrality measures for particular graph configurations. the dempster-shafer theory of belief functions [1, 2] is a widely used tool to measure belief or conflict between elements in a considered system. recently, it has also found use in the field of social network analysis [3]. social networks represent interactions that occur between people, countries, in transportation systems, etc. one of the core problems in network science is the detection of central elements. in [4], a modified evidential centrality and an evidential semi-local centrality in weighted networks are proposed. the measures use the combination of "high", "low" and "(high, low)" probabilities of influence, based on weighted and unweighted degrees of nodes, via dempster's rule. in [5], the same rule is applied in order to combine different node-to-node interactions in a network. the proposed measures, which are able to detect social influencers, were applied to twitter data. the theory of belief functions can also be adapted to the problem of community detection, i.e. the partition of nodes into tightly connected groups. for instance, in [6], the author proposed a novel method based on local density measures assigned to each node, which are further used for the detection of density peaks in a graph.
in the present work we mostly focus on the problem of detecting the most influential as well as the most affected elements in networks. knowledge about the position of nodes plays a significant role in understanding the structural properties of complex systems. several network-analysis approaches aim to assess the importance of nodes in graphs. the first class of methods refers to classical centrality measures [7]. it includes the degree centrality measure, which prioritizes nodes with the largest number of neighbors or with the largest sum of incoming/outgoing weights. the eigenvector group of centralities, which includes eigenvector centrality itself, bonacich, pagerank, katz, hubs and authorities, alpha centrality, etc., takes into account the importance of the neighbors of a node, i.e. the centrality of a vertex depends on the centralities of the adjacent nodes [8] [9] [10] [11] [12]. closeness and betweenness centralities consider the distance between nodes and the number of shortest paths that go through nodes in a network [13, 14]. another class of measures that detect the most important elements employs a cooperative game-theoretic approach. it includes the estimation of myerson values, which is similar to the shapley-shubik index calculation [15]. it also requires the introduction of set functions on nodes, which can vary depending on the problem statement. in [16] the hoede-bakker index is adjusted to the estimation of influential elements in social networks. in [17] long-range interaction centrality (lric) is proposed, which estimates node-to-node influence with respect to individual attributes of nodes, the possibility of group influence and indirect interactions through intermediate nodes. however, all the approaches described above are designed for so-called monoplex networks and require adaptation to complex structures with many types of interactions between adjacent nodes (so-called multilayer networks [18]).
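as an illustration of the classical measures surveyed above, the following sketch (not taken from the cited papers; the toy graph, weights and normalisations are our own) computes degree centrality and a power-iteration eigenvector centrality for a small weighted undirected graph:

```python
# illustrative sketch: degree and eigenvector centrality on a small
# undirected weighted graph, using only plain dictionaries.

def degree_centrality(adj):
    """weighted degree of each node, normalised by n - 1."""
    n = len(adj)
    return {v: sum(w.values()) / (n - 1) for v, w in adj.items()}

def eigenvector_centrality(adj, iters=200):
    """shifted power iteration x <- (A + I) x / max(x), which converges
    to the principal eigenvector even on bipartite graphs."""
    x = {v: 1.0 for v in adj}
    for _ in range(iters):
        nxt = {v: x[v] + sum(w * x[u] for u, w in adj[v].items()) for v in adj}
        norm = max(nxt.values())
        x = {v: val / norm for v, val in nxt.items()}
    return x

# toy path graph: node "b" bridges "a" and "c"
adj = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0},
}
dc = degree_centrality(adj)
ec = eigenvector_centrality(adj)
```

for the path graph a-b-c, the bridge node b dominates both rankings, as expected.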
in recent years multilayer networks have become one of the central topics in network science. a multilayer network where the set of nodes (or a part of the nodes) remains the same through all layers is called a multiplex network, which is the object of research in this work. there exist several ways to assess central elements in multiplex networks. firstly, one can calculate centralities for each layer separately and then aggregate the obtained values across all considered networks. secondly, one can aggregate the connections between pairs of nodes to obtain a monoplex network and then apply centrality measures to the new weighted graph. the modification of classical centrality measures to interconnected multilayer networks is described in [18, 19]. in [20] social choice theory rules are applied to multiplex networks in order to detect key elements. however, for these approaches the final results are calculated from secondary data. in this work we propose a novel technique for the assessment of key elements. we construct a mapping between each node and sets of other nodes, which is a mass function. in the case of several layers we combine the mass functions on each layer into a unique function that can be used for the centrality estimation in the whole system. the key advantages of our approach are that we take into account interactions with different groups of nodes and are able to estimate node-to-node influence within the whole network structure. we also take into account the consistency of connections on different network layers. this paper is organized as follows: in sect. 2 we describe some basic concepts from the theory of belief functions. in sect. 3 we propose a centrality measure for a one-layer network and apply it to a toy example. in sect. 4 we develop an approach to elucidate important elements in networks with several layers. in the same section we apply the proposed method to a two-layer network.
section 5 contains a discussion of our approach as well as the conclusion of the work. in this section we recall some basic definitions and notions from the dempster-shafer theory of belief functions [1, 2] that are employed later in this work. let x be a finite set called the frame of discernment and 2^x the set of all subsets of x. a function m : 2^x → [0, 1] that meets the normalization conditions m(∅) = 0 and Σ_{a ∈ 2^x} m(a) = 1 is called a basic probability assignment, or a mass function. all a ∈ 2^x such that m(a) > 0 are called focal elements, and the family of all focal elements is called the body of evidence. a mass function m can be associated with two set functions, namely a belief function, denoted by g(a) = Σ_{b ⊆ a} m(b), and a plausibility function, denoted ḡ(a) = Σ_{b : a ∩ b ≠ ∅} m(b), which is dual to the belief function g(a). these two functions can be considered as lower and upper bounds for the probability of event a: g(a) ≤ p(a) ≤ ḡ(a), a ∈ 2^x. the value of g(a) reflects the level of belief in the fact that x ∈ a ⊆ x, where x is from x. we denote by bel(x) the set of all belief functions g on the set x. a belief function g can also be represented as a convex combination of categorical belief functions η_b, b ∈ 2^x. note that η_x describes the vacuous evidence that x ∈ x; thus, we call this function the vacuous belief function. additionally, the mass function m(a) can be expressed from the belief function g with the möbius transformation m(a) = Σ_{b ⊆ a} (−1)^{|a\b|} g(b). in this work we mainly focus on combination techniques adopted from dempster-shafer theory. by a combination we mean an operator r : bel(x) × bel(x) → bel(x) that transforms two belief functions into one belief function. we denote by m = m1 ⊗_r m2 the combination of two mass functions m1 and m2 under rule r. there exist various combination rules that are widely used in the theory and applications of belief functions.
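the definitions above translate directly into code; the following minimal sketch (the frame and mass values are invented for illustration, not taken from the paper) computes g(a) and ḡ(a) for a small body of evidence:

```python
# minimal sketch: a mass function on the frame X = {x1, x2, x3}, with
# belief and plausibility computed directly from their definitions.

def belief(m, a):
    """g(A) = sum of m(B) over all B ⊆ A."""
    return sum(v for b, v in m.items() if b <= a)

def plausibility(m, a):
    """ḡ(A) = sum of m(B) over all B with B ∩ A ≠ ∅."""
    return sum(v for b, v in m.items() if b & a)

# focal elements as frozensets; masses sum to 1 and m(∅) = 0
m = {
    frozenset({"x1"}): 0.4,
    frozenset({"x1", "x2"}): 0.3,
    frozenset({"x1", "x2", "x3"}): 0.3,  # mass on X itself (ignorance)
}
a = frozenset({"x1", "x2"})
bel, pl = belief(m, a), plausibility(m, a)
```

for a = {x1, x2}, bel = 0.7 and pl = 1.0, illustrating the bracketing g(a) ≤ p(a) ≤ ḡ(a).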
for instance, dempster's rule [1], regarded as the original and most popular combination technique in dempster-shafer theory, is calculated as follows:

m(a) = (1 / (1 − k)) Σ_{b ∩ c = a} m1(b) · m2(c), a ≠ ∅, m(∅) = 0, (1)

where k = Σ_{b ∩ c = ∅} m1(b) · m2(c) indicates the level of conflict between the two pieces of evidence. if k = 1 then the level of conflict is the highest and rule (1) is not applicable. another combination technique, similar to dempster's rule, is yager's combination rule [21], according to which the value of the conflict k is reallocated to the mass of ignorance m(x). other combination rules are described in [22], some generalizations can be found in [23, 24], and axiomatics and the description of conflict rules are reviewed in [25] [26] [27] [28]. additionally, the discounting technique proposed in [1] can be applied to mass functions when various sources of information, determined by their belief functions, have different levels of reliability or different priority. discounting of a mass function can be performed with the help of a parameter α ∈ [0, 1] as follows:

m^α(a) = (1 − α) · m(a), a ≠ x; m^α(x) = (1 − α) · m(x) + α.

if α = 0 then the source of information is regarded as thoroughly reliable and m^α(a) = m(a) ∀a ∈ 2^x. conversely, if α = 1 then m^α(x) = 1 and the related belief function is vacuous. in this section we describe a graph model with one layer of interaction as well as the construction of a centrality measure based on a mass function for a network. we consider a connected graph as a tuple g = (v, e, w), where v = {v1, ..., vn} is a set of nodes, |v| = n, and e = {e(vi, vj)} is a set of edges. for simplicity, we associate vk with the number k, k = 1, ..., n, and denote e(vi, vj) as e_ij. in this work we consider undirected networks, i.e. e_ij ∈ e implies that e_ji ∈ e. we also analyze weighted networks, i.e. each edge e_ij in network g is associated with a weight w_ij ∈ w. without loss of generality, we assume that all weights w_ij ∈ [0, 1] and that w_ij = 0 implies e_ij ∉ e.
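dempster's rule as recalled above can be sketched in a few lines; the example mass functions are invented for illustration:

```python
def dempster_combine(m1, m2):
    """dempster's rule: intersect focal elements, renormalise by 1 - K,
    where K is the mass falling on empty intersections (the conflict)."""
    combined, k = {}, 0.0
    for b, vb in m1.items():
        for c, vc in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + vb * vc
            else:
                k += vb * vc
    if k >= 1.0:
        raise ValueError("total conflict: rule not applicable (K = 1)")
    return {a: v / (1.0 - k) for a, v in combined.items()}, k

x = frozenset({"x1", "x2", "x3"})
m1 = {frozenset({"x1"}): 0.6, x: 0.4}
m2 = {frozenset({"x2"}): 0.5, x: 0.5}
m12, conflict = dempster_combine(m1, m2)
```

here the conflict k = 0.3, and the combined masses are renormalised by 1 − k so that they again sum to one.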
the weight w_ij between nodes v_i and v_j indicates the degree of interaction between the corresponding nodes. our main focus is to rank nodes with respect to their importance in a network. we assume that a node is considered pivotal if it actively interacts with other nodes in a graph. in our analysis we take into account connections with distant nodes as well as cooperation with groups of other nodes. more precisely, we suppose that the centrality of a node depends on the relative aggregated weight of the subgraphs adjacent to the considered node. at the same time, the aggregated weight of a subgraph can be estimated with the help of monotonic measures, including belief functions. we consider a family of belief functions on the set of nodes.

a pair of observations is concordant if z > 0 and discordant if z < 0. 3) spatial interpolation: it is not practical to deploy sensors and measure pm values at every location in the area of interest. however, using the nearest measurement point to approximate the pm value at a location of interest may lead to erroneous results given the variability of pollution levels and weather across locations in an urban environment. this can be mitigated by using spatial interpolation to estimate the pm values at unmeasured locations from the known values at the measurement locations. in this paper, we have used idw, one of the simplest and most popular deterministic spatial interpolation techniques [17]. idw follows the principle that nodes closer to the location of estimation have more impact than those farther away. idw uses a linearly weighted combination of the measured values at the nodes to estimate the parameter at the location of interest. the weight corresponding to a node is a function of the inverse distance between the location of the node and the location of the estimate. in this paper, the weights have been chosen to be inverse distance squared.
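the idw scheme described here can be sketched as follows (the coordinates and pm values are invented; the deployment's actual node locations are not given in the text):

```python
# hedged sketch of idw with power p = 2 (inverse distance squared),
# as stated in the paper; measurement points are hypothetical.
import math

def idw(points, target, p=2.0):
    """estimate a value at `target` as the distance-weighted average of
    measured `points` [(x, y, value), ...]; weight = 1 / d**p."""
    num = den = 0.0
    for x, y, v in points:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # target coincides with a measurement node
        w = 1.0 / d ** p
        num += w * v
        den += w
    return num / den

nodes = [(0.0, 0.0, 100.0), (1.0, 0.0, 200.0), (0.0, 1.0, 300.0)]
est = idw(nodes, (0.0, 0.5))
```

with p = 2 the estimate is a convex combination of the measured values, so it always stays within their range.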
the following analyses were applied to the obtained data set after cleaning and preprocessing: qq plots, time series plots, correlation analysis and spatial analysis. qq plots have been used on the two co-located nodes, node1-airveda and node2-maingate, to verify distribution similarity. node1-airveda is an air quality monitoring device from airveda that has been tested against the standard bam pm monitor, and node2-maingate is the sensor node developed at iiit-h. qq plots have been plotted with one-hour averaged data for pm2.5 in fig. 5(a) and for pm10 in fig. 5(b), with node1-airveda on the horizontal axis and node2-maingate on the vertical axis. the plots are linear for the most part, with most of the sample points lying densely close to a straight line and very few points deviating from the linear relationship for both pm2.5 and pm10 samples. in the case of pm2.5 the few deviating points belong to the higher end of the distribution, while in the case of pm10 a few deviations can be seen at both the lower and higher ends of the distribution. from the plots, it is safe to assume that the populations of the data samples of node1-airveda and node2-maingate follow a similar distribution, with very few samples deviating. figs. 8(a) and 8(b) show idw-based interpolation maps for pm10 plotted at timestamps 19:00:00 (before burning crackers) and 22:40:00 (after burning crackers) on the day of diwali. in fig. 8(a), the hot-spot of the pm10 values is at node1-airveda and node2-maingate, which are placed near a six-lane highway and exposed directly to vehicular pollution. spatial variation can clearly be seen in fig. 8(a) between the nine points in an area of only 66 acres (0.267 km²), with node6-ftbg, node7-kcis and node8-library showing comparatively lower values, being in the center of the campus. in fig. 8(b), which shows the values at 22:40 after the bursting of crackers, the values increase dramatically, by 10 to 25 times.
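the quantile pairing behind such a qq plot can be sketched as follows (the two series are synthetic stand-ins for the one-hour-averaged sensor data, not the deployment's measurements):

```python
# sketch: pairing empirical quantiles of two co-located sensors, which
# is the computation a qq plot visualises.

def quantile(sorted_xs, q):
    """linear-interpolation empirical quantile, q in [0, 1]."""
    pos = q * (len(sorted_xs) - 1)
    lo, frac = int(pos), q * (len(sorted_xs) - 1) - int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] * (1 - frac) + sorted_xs[hi] * frac

def qq_pairs(a, b, n_quantiles=9):
    a, b = sorted(a), sorted(b)
    qs = [(i + 1) / (n_quantiles + 1) for i in range(n_quantiles)]
    return [(quantile(a, q), quantile(b, q)) for q in qs]

# two hypothetical one-hour-averaged pm2.5 series
node1 = [35, 40, 42, 55, 60, 61, 70, 85, 90, 120]
node2 = [33, 41, 44, 52, 63, 64, 72, 83, 95, 118]
pairs = qq_pairs(node1, node2)
```

if the paired quantiles fall near the line y = x, the two samples can be assumed to follow a similar distribution, which is the check performed above.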
now the number of hot-spots has increased to four, of which node5-flyg and node4-fcyq are the sites for bursting crackers, while node1-airveda, node2-maingate and node3-bakul are affected both by vehicular pollution and by crackers burned outside the campus. node9-obh was off due to a technical issue on the evening of diwali, which has affected the interpolated values at that point, resulting in lower values than the actual ones. fig. 8(b) shows three nodes and the area in the center of the campus that are surrounded by the pollution hot-spots but still show significantly lower pm values. the spatial variation within the nine nodes is dominant and hence demonstrates the need for local deployment of sensor nodes for accurate monitoring of local air quality conditions. fig. 8 also shows the temporal variation of the values within a small time period of five hours, with an increase from 13 to around 360 at node5-flyg and node4-fcyq. although similar results have been obtained for pm2.5, they are not shown here for brevity. in this paper, the dense deployment of iot nodes has been evaluated for monitoring pm values in an urban indian setting. for this, nine nodes were deployed on the small campus of iiit-h. a web-based dashboard has been developed for real-time pm monitoring. the measurements made over a period of more than five months clearly show a significant increase in pm values during diwali as well as a noticeable reduction in pm values during the national lockdown due to covid-19. it has been shown that the correlation coefficients between some nodes on the same campus have low values, demonstrating that pm values across a small region may be significantly different. moreover, the idw-based spatial interpolation results on the day of diwali show significant spatial variation in pm values across the campus, ranging from 96 to 382 for pm10 at locations just a few hundred meters apart.
the results also show notable temporal variations, with pm values rising up to 25 times at the same spot within a few hours. thus, there is sufficient motivation to use dense deployment of iot nodes for improved spatiotemporal monitoring of pm values.

key: cord-028688-5uzl1jpu authors: li, peisen; wang, guoyin; hu, jun; li, yun title: multi-granularity complex network representation learning date: 2020-06-10 journal: rough sets doi: 10.1007/978-3-030-52705-1_18 sha: doc_id: 28688 cord_uid: 5uzl1jpu

network representation learning aims to learn low-dimensional vectors for the nodes in a network while maintaining the inherent properties of the original information. existing algorithms focus on the single coarse-grained topology of nodes or on text information alone, which cannot describe complex information networks. however, node structure and attributes are interdependent and indecomposable. therefore, it is essential to learn the representation of a node based on both the topological structure and the node's additional attributes.
in this paper, we propose a multi-granularity complex network representation learning model (mnrl), which integrates topological structure and additional information at the same time and presents the fused information in the same granularity semantic space, refining the complex network from fine to coarse. experiments show that our method can not only capture indecomposable multi-granularity information but also retain various potential similarities of both topology and node attributes. it has achieved effective results in the downstream tasks of node classification and link prediction on real-world datasets. a complex network is a description of the relationships between entities and the carrier of various kinds of information in the real world; it has become an indispensable form of existence, as in medical systems, judicial networks, social networks and financial networks. mining knowledge in networks has drawn continuous attention in both academia and industry. how to accurately analyze and make decisions on the problems and tasks arising from different information networks is a vital research question. e.g., in the field of sociology, large interactive social platforms such as weibo, wechat, facebook and twitter create many social networks, including relationships between users and a sharply increasing amount of interactive review text. studies have shown that these large, sparse new social networks at different levels of cognition present the same small-world nature and community structure as the real world. then, by performing data analysis on these interactive information networks [1], such as the prediction of criminal associations and sensitive groups, we can apply the results directly to the real world.
network representation learning is an effective analysis method for the recognition and representation of complex networks at different granularity levels: while preserving the inherent properties, it maps high-dimensional, sparse data to a low-dimensional, dense vector space. vector-based machine learning techniques can then handle tasks in different fields [2, 3], for example link prediction [4], community discovery [5], node classification [6], recommendation systems [7], etc. in recent years, various advanced network representation learning methods based on topological structure have been proposed, such as deepwalk [8], node2vec [9] and line [10], which have become classical algorithms for the representation learning of complex networks and solve the problem of retaining the local topological structure. a series of deep learning-based network representation methods were then proposed to further solve the problems of global topological structure preservation and the high-order nonlinearity of data, and to increase efficiency, e.g., sdne [13], gcn [14] and dane [12]. however, existing research has focused on coarser levels of granularity, that is, a single topological structure, without comprehensive consideration of various granular information such as behaviors, attributes and features. such models are not interpretable, which makes many decision-making systems unusable. in addition, the structure of an entity and its attributes or behavioral characteristics in a network are indecomposable [18]. therefore, analyzing a single granularity of information alone loses a lot of potential information. for example, in the job-related crime relationship network shown in fig. 1, the anti-reconnaissance behavior of criminal suspects leads to a sparser network than common social networks.
an undiscovered edge does not really mean that two nodes are not related, as for p2 and p3 (or p1 and p2); in case detection, additional information about the suspects needs to be considered. if two suspects without an explicit relationship were involved in the same criminal activity at a certain place (l1), they may have some potential connection. the suspects p4 and p7 are related through the attribute a4; the topology without attributes cannot explain why the relation between them arises. these location attributes and activity information are inherently indecomposable and interdependent with the suspect, so recognizing the two nodes at a finer granularity, based on the additional information and the relationship structure, makes the learned low-dimensional representation vectors exhibit a certain similarity. we can directly predict the hidden relationship between the two suspects based on these potential similarities. therefore, it is necessary to consider both the network topology and the additional information of nodes. the cognitive learning mode of an information network is exactly in line with the multi-granularity thinking mechanism of human intelligent problem solving: data is taken as knowledge expressed at the lowest granularity level of a multiple granularity space, while knowledge is the abstraction of data at coarse granularity levels [15]. multi-granularity cognitive computing fuses data at different granularity levels to acquire knowledge [16]. similarly, network representation learning can represent data at lower-dimensional granularity levels and preserve the underlying properties and knowledge. to summarize, complex network representation learning faces the following challenges: information complementarity: the node topology and attributes are essentially two different types of granular information, and the integration of these granular information sources to enrich the semantic information of the network is a new perspective.
but how to deal with their complementarity at multiple levels and represent them in the same space is an arduous task. in complex networks, the similarity between entities depends not only on the topological structure but also on the attribute information attached to the nodes. they are indecomposable and highly non-linear, so how to represent this potential proximity is still worth studying. in order to address the above challenges, this paper proposes a multi-granularity complex network representation learning method (mnrl) based on the idea of multi-granularity cognitive computing. network representation learning can be traced back to traditional graph embedding, which is regarded as a process of taking data from high dimensions to low dimensions. the main methods include principal component analysis (pca) [19] and multidimensional scaling (mds) [21]. all these methods can be understood as using an n × k matrix to represent the original n × m matrix, where k ≪ m. later, some researchers proposed isomap and lle to maintain the overall structure of a nonlinear manifold [20]. in general, these methods have shown good performance on small networks. however, their time complexity is extremely high, which makes them unable to work on large-scale networks. another popular class of dimensionality reduction techniques uses the spectral properties (e.g. eigenvectors) of a matrix derived from the graph to embed the nodes. laplacian eigenmaps [22] obtain a low-dimensional vector representation of each node from the eigenvectors associated with the k smallest non-trivial eigenvalues of the graph laplacian. more recently, deepwalk was inspired by word2vec [24]: a node is selected as a starting point, and a sequence of nodes is obtained by a random walk. the obtained sequences are then regarded as sentences and input to the word2vec model to learn low-dimensional representation vectors.
deepwalk can obtain the local context information of the nodes in the graph through random walks, so the learned representation vector reflects the local structure of a node in the network [8]: the more neighboring points two nodes share in the network, the shorter the distance between the corresponding two vectors. node2vec uses biased random walks to trade off between breadth-first (bfs) and depth-first (dfs) graph search, resulting in higher-quality and more informative node representations than deepwalk, and it is more widely used in network representation learning. line [10] proposes first-order and second-order approximations for network representation learning from a new perspective. harp [25] obtains a vector representation of the original network through graph coarsening aggregation and node hierarchy propagation. recently, the graph convolutional network (gcn) [14] significantly improved the performance of network topological structure analysis; it aggregates each node and its neighbors in the network through a convolutional layer and outputs the weighted average of the aggregation results in place of the original node's representation. through the continuous stacking of convolutional layers, nodes can aggregate high-order neighbor information well. however, when convolutional layers are stacked beyond a certain number, the newly learned features become over-smoothed, which damages the representation performance. multi-gs [23] combines the concept of multi-granularity cognitive computing, divides the network structure according to people's cognitive habits, and then uses a gcn to convolve the different granular layers to obtain low-dimensional feature vector representations. sdne [13] directly inputs the network adjacency matrix to an auto-encoder [26] to solve the problem of preserving highly nonlinear first-order and second-order similarity.
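the walk-generation step shared by deepwalk and node2vec (here with uniform rather than biased transitions, and invented hyperparameters) can be sketched as:

```python
# sketch of deepwalk-style walk generation: truncated random walks
# produce node "sentences" that a skip-gram model would then embed.
# only the walk generation is shown; num_walks and walk_len are
# illustrative values, not the papers' settings.
import random

def random_walks(adj, num_walks=2, walk_len=5, seed=42):
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_len:
                nbrs = adj[walk[-1]]
                if not nbrs:
                    break
                walk.append(rng.choice(nbrs))  # uniform transition
            walks.append(walk)
    return walks

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = random_walks(adj)
```

feeding the walks to a skip-gram implementation such as word2vec then yields the node embeddings; node2vec's biased bfs/dfs transitions would replace the uniform rng.choice step.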
the above network representation learning methods use only network structure information to learn low-dimensional node vectors. but nodes and edges in real-world networks are often associated with additional information, and these features are called attributes. for example, in social networking sites such as weibo, the text content posted by users (nodes) is available. therefore, the node representation also needs to learn from the rich content of node attributes and edge attributes. tadw studies the case where nodes are associated with text features. the authors of tadw first proved that deepwalk essentially factorizes a transition probability matrix into two low-dimensional matrices. inspired by this result, tadw incorporates the text feature matrix and node features through a matrix factorization process [27]. cene treats text content as a special type of node and uses node-node structure and node-content associations for node representation [28]. more recently, dane [12] and can [34] use deep learning methods [11] to preserve potentially non-linear node topology and node attribute information. these two kinds of information provide different views of each node, but their heterogeneity is not considered. anrl optimizes the network structure and attribute information separately and uses the skip-gram model to skillfully handle the heterogeneity of the two different types of information [29]. nevertheless, the consistent and complementary information in the topology and attributes is lost and the sensitivity to noise increases, resulting in lower robustness. to process different types of information, wang put forward the concepts of "coarse-to-fine cognition" and "fine-to-coarse" fusion learning in the study of multi-granularity cognitive machine learning [30].
people usually perform cognition at a coarser level first; for example, when we meet a person, we first recognize who the person is from the face, and then refine the features to see the freckles on the face. computers, in contrast, obtain semantic information that humans understand by fusing fine-grained data up to coarse-grained levels. refining the granularity of complex networks and the fusion between different granular layers is still an area worthy of deeper research [17, 31]. inspired by this, we divide complex networks into different levels of granularity: single nodes and attribute data form the micro structure, role similarity and community similarity form the meso structure, and global network characteristics form the macro structure. the larger the granularity, the wider the range of data covered; the smaller the granularity, the narrower the data covered. our model learns the semantic information that humans can understand at the above-mentioned levels from the finest-grained attribute information and the topological structure, and finally saves it in low-dimensional vectors. let g = (v, e, a) be a complex network, where v represents the set of n nodes, e represents the set of edges, and a represents the set of attributes. in detail, a ∈ ℝ^(n×m) is a matrix that encodes all additional node attribute information, a_i ∈ a describes the attributes associated with node v_i, and e_ij = (v_i, v_j) ∈ e represents an edge between v_i and v_j. we formally define multi-granularity network representation learning as follows: definition 1. given a network g = (v, e, a), we represent each node v_i and its attributes a_i as a low-dimensional vector y_i by learning a function f_g : v_i → y_i ∈ ℝ^d, where d ≪ |v|, such that y_i retains not only the topology of the node but also the node attribute information. definition 2. given a network g = (v, e, a), semantic similarity means that two nodes have similar attributes and neighbor structure, and the low-dimensional vectors obtained by network representation learning maintain the same similarity as in the original network.
e.g., if v_i ∼ v_j, then the low-dimensional vectors y_i = f_g(v_i) and y_j = f_g(v_j) obtained through the mapping function f_g are still similar: y_i ∼ y_j. complex networks are composed of node and attribute granules (elementary granules), which can no longer be decomposed. learning from these granules yields different levels of semantic information, including topological structure (micro), role acquaintance (meso) and global structure (macro). the complete low-dimensional representation of a complex network is the aggregation of these granular layers of information. in order to solve the problems mentioned above, inspired by multi-granularity cognitive computing, we propose a multi-granularity network representation learning method (mnrl), which refines complex network representation learning from the topology level down to the node's attribute characteristics and various attachments. the model not only fuses finer granular information but also preserves the node topology, which enriches the semantic information of the relational network and addresses the indecomposability and interdependence of information. the algorithm framework is shown in fig. 2. firstly, the topology and additional information are fused through a function h; then a variational encoder is used to learn the network representation from fine to coarse. the output of the embedding layer is a set of low-dimensional vectors that combine the attribute information and the network topology. to better characterize multi-granularity complex networks and handle nodes with potential associations that cannot be processed through the relationship structure alone, we refine the granularity down to additional attributes and design an information fusion method, defined as follows:

x_i = a_i + Σ_{v_j ∈ n(v_i)} (w_ij / d(v_j)) · a_j,

where n(v_i) is the set of neighbors of node v_i in the network and a_i is the attribute vector associated with node v_i. w_ij > 0 for weighted networks and w_ij = 1 for unweighted networks.
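a sketch of one plausible fusion h consistent with the quantities named in the text (w_ij, d(v_j), a_i) is shown below; the exact functional form used in the paper is not fully recoverable, so this degree-weighted aggregation is an assumption:

```python
# assumed fusion: x_i = a_i + sum_j (w_ij / d(v_j)) * a_j over the
# neighbours j of node i; the paper's exact h may differ.
import numpy as np

def fuse(attrs, weights):
    """fuse node attributes with degree-normalised neighbour attributes."""
    deg = weights.sum(axis=1)          # weighted degree d(v_j)
    deg[deg == 0] = 1.0                # guard against isolated nodes
    # dividing column j by deg[j] applies the 1/d(v_j) factor per neighbour
    return attrs + (weights / deg[np.newaxis, :]) @ attrs

attrs = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
weights = np.array([[0.0, 1.0, 0.0],
                    [1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0]])  # path graph v1 - v2 - v3
x = fuse(attrs, weights)
```

each fused row x_i mixes the node's own attributes with a degree-normalised sum of its neighbours' attributes.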
d(v_j) is the degree of node v_j. x_i contains the potential information of multiple granularities: both the neighbor attribute information and that of the node itself. to capture the complementarity of different granularity hierarchies and avoid the effects of various kinds of noise, our model in fig. 1 is a variational auto-encoder, a powerful unsupervised deep model for feature learning that has been widely used in multi-granularity cognitive computing applications. in multi-granularity complex networks, auto-encoders fuse data of different granularities into a unified granularity space, from fine to coarse. the variational auto-encoder contains three parts, namely the input layer, the hidden layers and the output layer, defined as follows:

y_i^(1) = σ(w^(1) x_i + b^(1)), y_i^(k) = σ(w^(k) y_i^(k−1) + b^(k)), k = 2, ..., K.

here, K is the number of layers for the encoder and decoder. σ(·) represents a possible activation function such as relu, sigmoid or tanh. w^(k) and b^(k) are the transformation matrix and bias vector in the k-th layer, respectively. y_i^(K) is the unified vector representation learned by the model, which obeys the distribution function e, reducing the influence of noise. e ∼ N(0, 1) is the standard normal distribution in this paper. in order to make the learned representation as similar as possible to the given distribution, we need to minimize the following loss function:

l_kl = kl(q(y_i | x_i) ‖ N(0, 1)). (3)

to reduce the potential information loss of the original network, our goal is to minimize the following auto-encoder loss function:

l_re = Σ_i ‖x̂_i − x_i‖², (4)

where x̂_i is the reconstruction output of the decoder and x_i incorporates prior knowledge into the model. to formulate the homogeneous network structure information, the skip-gram model has been widely adopted in recent works, and in the field of heterogeneous network research, skip-gram variants suitable for processing different types of nodes have also been proposed [32]. in our model, the context of a node is the low-dimensional potential information.
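the variational pieces described above, the reparameterisation y = μ + σ·ε with ε ∼ N(0, 1) and the closed-form kl term to the standard normal, can be sketched as follows (a generic formulation, not the authors' code):

```python
# numpy sketch of the variational machinery: reparameterisation and the
# kl divergence used as l_kl, for a diagonal gaussian posterior.
import numpy as np

def reparameterize(mu, log_var, rng):
    """y = mu + sigma * eps, with eps ~ N(0, 1)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """kl( N(mu, sigma^2) || N(0, 1) ), summed over dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

rng = np.random.default_rng(0)
mu = np.zeros(4)
log_var = np.zeros(4)                   # sigma = 1
y = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

the kl term vanishes exactly when the encoder already outputs the standard normal (μ = 0, σ = 1), which is the behaviour l_kl pushes toward.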
given the node v_i and the associated reconstruction information y_i, we generate random-walk contexts c ∈ C and maximize the objective function: where b is the size of the generation window and the conditional probability p(v_{i+j} | y_i) is defined as the softmax function: in the above formula, v_i is the node context representation of node v_i, and y_i is the result produced by the auto-encoder. directly optimizing eq. (6) is computationally expensive, as it requires a summation over the entire set of nodes when computing the conditional probability p(v_{i+j} | y_i). we adopt the negative sampling approach proposed in metapath2vec++ that samples multiple negative samples according to some noise distribution: where σ(x) = 1/(1 + exp(-x)) is the sigmoid function and s is the number of negative samples. we set p_n(v) ∝ d_v^{3/4} as suggested in word2vec, where d_v is the degree of node v [24, 32]. through the above methods, the node's attribute information and the heterogeneity of the node's global structure are processed, and the potential semantic similarity is kept in a unified granularity space. multi-granularity complex network representation learning thus fuses multiple kinds of granular information, learns the elementary granules through an auto-encoder, and represents the different granularity levels in a unified low-dimensional vector, capturing the potential semantic similarity between nodes without direct edges. the model simultaneously optimizes the objective function of each module to make the final result robust and effective. the function is shown below: in detail, l_re is the auto-encoder loss function of eq. (4), l_kl has been stated in formula (3), and l_hs is the loss function of the skip-gram model in eq. (5). α, β, ψ, γ are the hyperparameters that balance each module.
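the negative-sampling objective and the d_v^{3/4} noise distribution described above can be sketched as follows; the function names and toy vectors are illustrative, not taken from the paper.

```python
import math

def sigmoid(x):
    # standard logistic function sigma(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + math.exp(-x))

def neg_sampling_loss(y_i, v_ctx, neg_samples):
    """Skip-gram negative-sampling loss for one (embedding, context) pair:
    -log sigma(v_ctx . y_i) - sum over negatives of log sigma(-v_neg . y_i)."""
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    loss = -math.log(sigmoid(dot(v_ctx, y_i)))
    for v_neg in neg_samples:
        loss -= math.log(sigmoid(-dot(v_neg, y_i)))
    return loss

def noise_distribution(degrees):
    """Noise distribution P_n(v) proportional to d_v^(3/4), word2vec-style."""
    w = {v: d ** 0.75 for v, d in degrees.items()}
    z = sum(w.values())
    return {v: wv / z for v, wv in w.items()}

# a degree-16 node is 8x more likely to be drawn than a degree-1 node (16^0.75 = 8)
p = noise_distribution({"a": 1, "b": 16})
```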
l_vae is the parameter optimization function, given as follows: where w^k, ŵ^k are the weight matrices of the encoder and decoder in the k-th layer, respectively, and b^k, b̂^k are the corresponding bias vectors. the complete objective function is expressed as follows: mnrl preserves multiple types of granular information, including node attributes, local network structure and global network structure, in a unified framework. the model addresses the high nonlinearity and complementarity of the various granularities of information, and retains the underlying semantics of the topology and the additional information at the same time. finally, we optimize the objective function l in eq. (10) through stochastic gradient descent. to ensure the robustness and validity of the results, we iteratively optimize all components at the same time until the model converges. the learning procedure is summarized in algorithm 1.

algorithm 1. the mnrl model
input: graph g = (v, e, a), window size b, number of walks p, walk length u, hyperparameters α, β, ψ, γ, embedding size d.
output: node representations y^k ∈ R^d.
1: generate node contexts by starting p random walks of length u at each node
2: fuse the multi-granularity information of each node with the function h(·)
3: initialize all parameters
4: while not converged do
5:   sample a mini-batch of nodes with their contexts
6:   compute the gradient ∇l
7:   update the auto-encoder and skip-gram module parameters
8: end while
9: save the representations y = y^k

datasets: in our experiments, we employ four benchmark datasets: facebook, cora, citeseer and pubmed. these datasets contain edge relations and various attribute information, which allows us to verify that the social relations of nodes and their individual attributes are strongly interdependent and indecomposable, and jointly determine the properties of entities in the social environment. the last three datasets (cora, citeseer and pubmed) are paper citation networks consisting of bibliographic publication data.
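the loop of algorithm 1 can be sketched as the following skeleton; the component callables (fusion, encoder losses, skip-gram loss) are stand-ins for the paper's modules, and the convergence test is a simplification. real training would also update the parameters by stochastic gradient descent.

```python
import random

def train_mnrl(nodes, fuse, encoder_losses, skipgram_loss,
               alpha=1.0, beta=1.0, psi=1.0, gamma=1.0,
               batch_size=2, max_iters=100, tol=1e-6):
    """Skeleton of algorithm 1: sample mini-batches, combine the four loss
    terms l_re, l_kl, l_hs, l_vae with weights alpha, beta, psi, gamma,
    and iterate until the total loss stabilizes."""
    rng = random.Random(0)
    prev = float("inf")
    for _ in range(max_iters):
        batch = rng.sample(nodes, min(batch_size, len(nodes)))
        total = 0.0
        for v in batch:
            x = fuse(v)                       # step 2: multi-granularity fusion
            l_re, l_kl, l_vae = encoder_losses(x)
            l_hs = skipgram_loss(v)           # loss on the node's walk context
            total += alpha * l_re + beta * l_kl + psi * l_hs + gamma * l_vae
        if abs(prev - total) < tol:           # crude convergence check
            return total
        prev = total
    return prev

# with constant stub losses the loop converges after two passes
loss = train_mnrl([0, 1, 2], lambda v: v,
                  lambda x: (1.0, 0.5, 0.1), lambda v: 0.2, batch_size=3)
```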
an edge indicates that one paper cites, or is cited by, another. the publications are classified into one of the following six classes: agents, ai, db, ir, ml, hci in citeseer, and into one of three classes (i.e., "diabetes mellitus experimental", "diabetes mellitus type 1", "diabetes mellitus type 2") in pubmed. the cora dataset consists of machine learning papers which are classified into seven classes. the facebook dataset is a typical social network: nodes represent users and edges represent friendship relations. we summarize the statistics of these benchmark datasets in table 1. to evaluate the performance of our proposed mnrl, we compare it with 9 baseline methods, which can be divided into two groups. the first group leverages network structure information only and ignores node attributes: deepwalk, node2vec, grarep [33], line and sdne. the other methods try to preserve both node attribute and network structure proximity and are therefore stronger competitors: we consider tadw, gae, vgae and dane as compared algorithms. for all baselines, we used the implementations released by the original authors, with parameters tuned to be optimal. for deepwalk and node2vec, we set the window size to 10, the walk length to 80, and the number of walks to 10. for grarep, the maximum transition step is set to 5. for line, we concatenate the first-order and second-order results together as the final embedding. for the remaining baselines, the parameters are set following the original papers. finally, the dimension of the node representation is set to 128. for mnrl, the number of layers and their dimensions for each dataset are shown in table 2.

table 2. detailed network layer structure information.
citeseer: 3703-1500-500-128-500-1500-3703
pubmed: 500-200-128-200-500
cora: 1433-500-128-500-1433
facebook: 1238-500-128-500-1238

to show the performance of our proposed mnrl, we conduct node classification on the learned node representations.
specifically, we employ svm as the classifier. to make a comprehensive evaluation, we randomly select 10%, 30% and 50% of the nodes as the training set and use the rest as the testing set, respectively. with these randomly chosen training sets, we use five-fold cross-validation to train the classifier and then evaluate it on the testing sets. to measure the classification results, we employ micro-f1 (mi-f1) and macro-f1 (ma-f1) as metrics. the classification results are shown in tables 3, 4 and 5, respectively. from these tables, we can see that our proposed mnrl achieves a significant improvement over single-granularity (plain) network embedding approaches, and beats the other attributed network embedding approaches in most situations. the experimental results show that the representations learned by each comparison algorithm perform well on node classification in downstream tasks. in general, a model that considers both node attribute information and node structure information performs better than one using structure alone. our model performs more effectively than most algorithms of a similar type, especially in the case of sparse data, because our model's input is the fused information of nodes and their extra attributes. when compared with dane, our experiments did not show a large improvement, but the expected results were achieved. dane uses two auto-encoders to learn the network structure and the attribute information separately; the increased number of parameters gives it more room for optimization during learning, so its performance improves as training data grows, but its demand for computing resources also increases and the interpretability of the algorithm is weak.
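micro-f1 and macro-f1, the two metrics used above, differ in how per-class scores are pooled: micro pools the raw counts over all classes, macro averages the per-class f1 values. a self-contained sketch (equivalent to scikit-learn's f1_score with average='micro'/'macro' for single-label data):

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Compute micro- and macro-averaged F1 for multi-class labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    # micro: pool counts over classes (equals accuracy for single-label tasks)
    tp_s, fp_s, fn_s = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * tp_s / (2 * tp_s + fp_s + fn_s) if tp_s else 0.0
    # macro: unweighted average of per-class F1
    per_class = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    macro = sum(per_class) / len(per_class)
    return micro, macro

mi, ma = f1_scores([0, 0, 1, 1], [0, 1, 1, 1])
```

macro-f1 penalizes poor performance on rare classes more than micro-f1 does, which is why both are reported.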
while mnrl uses a single variational auto-encoder to learn the structure and the attribute information at the same time, the interdependence of the information is preserved, which handles heterogeneous information well and reduces the impact of noise. in this subsection, we evaluate the ability of the node representations to reconstruct the network structure via link prediction, a typical task in network analysis that aims to predict whether an edge exists between two nodes. following prior work, we randomly hold out 50% of the existing links as positive instances and sample an equal number of non-existing links as negative instances. then, we use the residual network to train the embedding models. specifically, we rank both positive and negative instances according to the cosine similarity function. to judge the ranking quality, we employ the auc to evaluate the ranking list; a higher value indicates a better performance. we perform the link prediction task on the cora dataset and the results are shown in fig. 3. compared with traditional algorithms that learn representations from the structure information at a single granularity, algorithms that use both structure and attribute information are more effective. tadw performs well, but as a matrix-factorization method it has the disadvantage of high complexity in large networks. gae and vgae perform better in this experiment and are suitable for large networks. mnrl refines the input and retains potential semantic information; since link prediction relies on the additional information, it performs better than the other algorithms in this experiment. in this paper, we propose a multi-granularity complex network representation learning model (mnrl), which integrates the topology structure and additional information, and learns the fused information in a unified granularity semantic space, refining the complex network from fine to coarse.
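the link-prediction protocol above (score candidate pairs by cosine similarity, then measure auc over positive vs. negative instances) can be sketched as:

```python
import math

def cosine(a, b):
    # cosine similarity between two embedding vectors
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def auc(pos_scores, neg_scores):
    """AUC = probability that a positive pair outranks a negative pair
    (ties count one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

this pairwise definition is exact but quadratic; in practice a rank-based formula gives the same value in O(n log n).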
the effectiveness has been verified by extensive experiments, which show that node relations and additional attributes are indecomposable and complementary, and jointly determine the properties of entities in the network. in practice, the model has good application prospects in large information networks. although the model saves considerable computation cost and represents complex networks of various granularities well, it needs different parameters in different application scenarios, which is cumbersome and should be optimized in the future. multi-granularity complex network representation learning also needs to consider dynamic networks and adapt to changes in network nodes, so as to enable real-time information network analysis.

references:
social structure and network analysis
network representation learning: a survey
virtual network embedding: a survey
the link-prediction problem for social networks
community discovery using nonnegative matrix factorization
node classification in social networks
recommender systems
deepwalk: online learning of social representations
node2vec: scalable feature learning for networks
line: large-scale information network embedding
deep learning
deep attributed network embedding
structural deep network embedding
semi-supervised classification with graph convolutional networks
dgcc: data-driven granular cognitive computing
granular computing
data mining, rough sets and granular computing
structural deep embedding for hypernetworks
principal component analysis
the isomap algorithm and topological stability
laplacian eigenmaps for dimensionality reduction and data representation
network representation learning based on multi-granularity structure
word2vec explained: deriving mikolov et al.'s negative-sampling word-embedding method
harp: hierarchical representation learning for networks
sparse autoencoder
network representation learning with rich text information
a general framework for content-enhanced network representation learning
anrl: attributed network representation learning via deep neural networks
granular computing with multiple granular layers for brain big data processing
an approach for attribute reduction and rule generation based on rough set theory
metapath2vec: scalable representation learning for heterogeneous networks
grarep: learning graph representations with global structural information
co-embedding attributed networks

key: cord-000196-lkoyrv3s authors: salathé, marcel; jones, james h. title: dynamics and control of diseases in networks with community structure date: 2010-04-08 journal: plos comput biol doi: 10.1371/journal.pcbi.1000736 sha: doc_id: 196 cord_uid: lkoyrv3s

the dynamics of infectious diseases that spread via direct person-to-person transmission (such as influenza, smallpox, hiv/aids, etc.) depend on the underlying host contact network. human contact networks exhibit strong community structure. understanding how such community structure affects epidemics may provide insights for preventing the spread of disease between communities by changing the structure of the contact network through pharmaceutical or non-pharmaceutical interventions. we use empirical and simulated networks to investigate the spread of disease in networks with community structure. we find that community structure has a major impact on disease dynamics, and we show that in networks with strong community structure, immunization interventions targeted at individuals bridging communities are more effective than those simply targeting highly connected individuals. because the structure of relevant contact networks is generally not known, and vaccine supply is often limited, there is great need for efficient vaccination algorithms that do not require full knowledge of the network. we developed an algorithm that acts only on locally available network information and is able to quickly identify targets for successful immunization intervention.
the algorithm generally outperforms existing algorithms when vaccine supply is limited, particularly in networks with strong community structure. understanding the spread of infectious diseases and designing optimal control strategies is a major goal of public health. social networks show marked patterns of community structure, and our results, based on empirical and simulated data, demonstrate that community structure strongly affects disease dynamics. these results have implications for the design of control strategies. mitigating or preventing the spread of infectious diseases is the ultimate goal of infectious disease epidemiology, and understanding the dynamics of epidemics is an important tool to achieve this goal. a rich body of research [1, 2, 3] has provided major insights into the processes that drive epidemics, and has been instrumental in developing strategies for control and eradication. the structure of contact networks is crucial in explaining epidemiological patterns seen in the spread of directly transmissible diseases such as hiv/aids [1, 4, 5] , sars [6, 7] , influenza [8, 9, 10, 11] etc. for example, the basic reproductive number r 0 , a quantity central to developing intervention measures or immunization programs, depends crucially on the variance of the distribution of contacts [1, 12, 13] , known as the network degree distribution. contact networks with fat-tailed degree distributions, for example, where a few individuals have an extraordinarily large number of contacts, result in a higher r 0 than one would expect from contact networks with a uniform degree distribution, and the existence of highly connected individuals makes them an ideal target for control measures [7, 14] . while degree distributions have been studied extensively to understand their effect on epidemic dynamics, the community structure of networks has generally been ignored. 
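the dependence of r_0 on the variance of the degree distribution discussed above can be illustrated with the standard configuration-model approximation r_0 = t(⟨k²⟩ − ⟨k⟩)/⟨k⟩, where t is the transmissibility; this particular formula is a textbook result, not one stated in the text above.

```python
def r0_from_degrees(degrees, transmissibility):
    """Configuration-model approximation for a disease with per-edge
    transmissibility t: r0 = t * (<k^2> - <k>) / <k>. Variance in the
    degree distribution raises r0 even at a fixed mean degree."""
    n = len(degrees)
    k1 = sum(degrees) / n              # mean degree <k>
    k2 = sum(d * d for d in degrees) / n  # second moment <k^2>
    return transmissibility * (k2 - k1) / k1

# same mean degree (3), different variance:
uniform = [3, 3, 3, 3]
fat_tailed = [1, 1, 1, 9]
```

under this approximation, the fat-tailed network has three times the r_0 of the uniform one, despite the identical mean degree.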
despite the demonstration that social networks show significant community structure [15, 16, 17, 18], and that social processes such as homophily and transitivity result in highly clustered and modular networks [19], the effect of such microstructures on epidemic dynamics has only recently started to be investigated. most initial work has focused on the effect of small cycles, predominantly in the context of clustering coefficients (i.e. the fraction of closed triplets in a contact network) [20, 21, 22, 23, 24]. in this article, we aim to understand how community structure affects the epidemic dynamics and control of infectious disease. community structure exists when connections between members of a group of nodes are more dense than connections between members of different groups of nodes [15]. the terminology is relatively new in network analysis, and recent algorithm development has greatly expanded our ability to detect sub-structuring in networks. while there has been a recent explosion in interest and methodological development, the concept is an old one in the study of social networks, where it is typically referred to as "cohesive subgroups": groups of vertices in a graph that share connections with each other at a higher rate than with vertices outside the group [18]. empirical data on social structure suggest that community structuring is extensive in epidemiological contacts [25, 26, 27] relevant for infectious diseases transmitted by the respiratory or close-contact route (e.g. influenza-like illnesses), and in social groups more generally [16, 17, 28, 29, 30]. similarly, the results of epidemic models of directly transmitted infections such as influenza are most consistent with the existence of such structure [8, 9, 11, 31, 32, 33].
using both simulated and empirical social networks, we show how community structure affects the spread of diseases in networks, and specifically that these effects cannot be accounted for by the degree distribution alone. the main goal of this study is to demonstrate how community structure affects epidemic dynamics, and what strategies are best applied to control epidemics in networks with community structure. we generate networks computationally with community structure by creating small subnetworks of locally dense communities, which are then randomly connected to one another. a particular feature of such networks is that the variance of their degree distribution is relatively low, and thus the spread of a disease is only marginally affected by it [34] . running standard susceptible-infected-resistant (sir) epidemic simulations (see methods) on these networks, we find that the average epidemic size, epidemic duration and the peak prevalence of the epidemic are strongly affected by a change in community structure connectivity that is independent of the overall degree distribution of the full network ( figure 1 ). note that the value range of q shown in figure 1 is in agreement with the value range of q found in the empirical networks used further below, and that lower values of q do not affect the results qualitatively (see suppl. mat. figure s1 ). epidemics in populations with community structure show a distinct dynamical pattern depending on the extent of community structure. in networks with strong community structure, an infected individual is more likely to infect members of the same community than members outside of the community. thus, in a network with strong community structure, local outbreaks may die out before spreading to other communities, or they may spread through various communities in an almost serial fashion, and large epidemics in populations with strong community structure may therefore last for a long time. 
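a minimal discrete-time version of the sir simulations described above can be sketched as follows; see the paper's methods for the exact scheme, and note that the per-day transmission and recovery probabilities here are illustrative assumptions.

```python
import random

def sir_outbreak(neighbors, beta=0.2, gamma=0.1, seed=0, immune=(), rng=None):
    """One discrete-time SIR run on a contact network.
    beta:  per-contact, per-day transmission probability (assumed value)
    gamma: per-day recovery probability (assumed value)
    Returns the final outbreak size (number of ever-infected nodes)."""
    rng = rng or random.Random(42)
    state = {v: "S" for v in neighbors}
    for v in immune:
        state[v] = "R"                       # immunized before the outbreak
    state[seed] = "I"                        # patient zero
    infected = {seed}
    ever_infected = {seed}
    while infected:                          # halt once no infected nodes remain
        new_inf, recovered = set(), set()
        for v in infected:
            for u in neighbors[v]:
                if state[u] == "S" and rng.random() < beta:
                    new_inf.add(u)
            if rng.random() < gamma:
                recovered.add(v)
        for u in new_inf:
            state[u] = "I"
        for v in recovered:
            state[v] = "R"
        infected = (infected - recovered) | new_inf
        ever_infected |= new_inf
    return len(ever_infected)

# complete graph on 4 nodes for a quick sanity check
complete = {i: [j for j in range(4) if j != i] for i in range(4)}
```

averaging many such runs over networks with varying modularity q reproduces the kind of comparison made in figure 1.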
correspondingly, the incidence rate can be very low, and the number of generations of infection transmission can be very high, compared to the explosive epidemics in populations with less community structure (figures 2a and 2b). on average, epidemics in networks with strong community structure exhibit greater variance in final size (figures 2c and 2d), a greater number of small, local outbreaks that do not develop into a full epidemic, and a higher variance in the duration of an epidemic. in order to halt or mitigate an epidemic, targeted immunization interventions or social distancing interventions aim to change the structure of the network of susceptible individuals in such a way as to make it harder for a pathogen to spread [35]. in practice, the number of people to be removed from the susceptible class is often constrained for a number of reasons (e.g., due to limited vaccine supply or ethical concerns over social distancing measures). from a network perspective, targeted immunization methods translate into identifying which nodes should be removed from a network, a problem that has caught considerable attention (see for example [36] and references therein). targeting highly connected individuals for immunization has been shown to be an effective strategy for epidemic control [7, 14]. however, in networks with strong community structure, this strategy may not be the most effective: some individuals connect to multiple communities (so-called community bridges [37]) and may thus be more important in spreading the disease than individuals with fewer inter-community connections, but this importance is not necessarily reflected in the degree. identification of community bridges therefore calls for other centrality measures.

understanding the spread of infectious diseases in populations is key to controlling them. computational simulations of epidemics provide a valuable tool for the study of the dynamics of epidemics.
in such simulations, populations are represented by networks, where hosts and their interactions among each other are represented by nodes and edges. in the past few years, it has become clear that many human social networks have a very remarkable property: they all exhibit strong community structure. a network with strong community structure consists of smaller sub-networks (the communities) that have many connections within them, but only few between them. here we use both data from social networking websites and computer-generated networks to study the effect of community structure on epidemic spread. we find that community structure not only affects the dynamics of epidemics in networks, but that it also has implications for how networks can be protected from large-scale epidemics.

one such measure is the betweenness centrality [38], defined as the fraction of shortest paths a node falls on. while degree and betweenness centrality are often strongly positively correlated, the correlation between them becomes weaker as community structure becomes stronger (figure 3). thus, in networks with community structure, focusing on the degree alone carries the risk of missing some of the community bridges that are not highly connected. indeed, at a low vaccination coverage, an immunization strategy based on betweenness centrality results in fewer infected cases than an immunization strategy based on degree as the magnitude of community structure increases (figure 4a). this observation is critical because the potential vaccination coverage for an emerging infection will typically be very low. a third measure, random walk centrality, identifies target nodes by counting how often a node is traversed by a random walk between two other nodes [39]. the random walk centrality measure considers not only the shortest paths between pairs of nodes, but all paths between pairs of nodes, while still giving shorter paths more weight.
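betweenness centrality as used above can be computed with brandes' algorithm; a compact unnormalized sketch for unweighted graphs, plus top-k target selection for immunization:

```python
from collections import deque

def betweenness(neighbors):
    """Brandes' algorithm: unnormalized betweenness centrality for an
    unweighted graph given as dict node -> list of neighbors."""
    bc = {v: 0.0 for v in neighbors}
    for s in neighbors:
        stack = []
        preds = {v: [] for v in neighbors}      # shortest-path predecessors
        sigma = {v: 0.0 for v in neighbors}     # number of shortest paths
        sigma[s] = 1.0
        dist = {v: -1 for v in neighbors}
        dist[s] = 0
        queue = deque([s])
        while queue:                            # BFS from s
            v = queue.popleft()
            stack.append(v)
            for w in neighbors[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in neighbors}
        while stack:                            # back-propagate dependencies
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc          # note: each undirected path is counted in both directions

def immunization_targets(neighbors, k):
    """Pick the k highest-betweenness nodes as immunization targets."""
    bc = betweenness(neighbors)
    return sorted(bc, key=bc.get, reverse=True)[:k]
```

on the path 0 - 1 - 2, node 1 lies on all shortest paths between the endpoints and is the first target selected.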
while infections are most likely to spread along the shortest paths between any two nodes, the cumulative contribution of other paths can still be important [40]: immunization strategies based on random walk centrality result in the lowest number of infected cases at low vaccination coverage (figures 4b and 4c). to test the efficiency of targeted immunization strategies on real networks, we used interaction data of individuals at five different universities in the us, taken from a social networking website [41], and obtained the contact network relevant for directly transmissible diseases (see methods). we find again that the overall most successful targeted immunization strategy is the one that identifies targets based on random walk centrality. limited immunization based on random walk centrality significantly outperforms immunization based on degree, especially when vaccination coverage is low (figure 5a). in practice, identifying immunization targets may be impossible using such algorithms, because the structure of the contact network relevant for the spread of a directly transmissible disease is generally not known. thus, algorithms that are agnostic about the full network structure are necessary to identify target individuals. the only algorithm we are aware of that is completely agnostic about the network structure identifies target nodes by picking a random contact of a randomly chosen individual [42]. once such an acquaintance has been picked n times, it is immunized. the acquaintance method has been shown to be able to identify some of the highly connected individuals, and thus approximates an immunization strategy that targets highly connected individuals. we propose an alternative algorithm (the so-called community bridge finder (cbf) algorithm, described in detail in the methods) that aims to identify community bridges connecting two groups of clustered nodes.
briefly, starting from a random node, the algorithm follows a random path on the contact network until it arrives at a node that does not connect back to more than one of the previously visited nodes on the random walk. the basic goal of the cbf algorithm is to find nodes that connect to multiple communities; it does so based on the notion that the first node that does not connect back to previously visited nodes of the current random walk is likely to be part of a different community. on all empirical and computationally generated networks tested, this algorithm performed mostly better, often equally well, and rarely worse than the alternative algorithm. it is important to note a crucial difference between algorithms such as cbf (henceforth called stochastic algorithms) and algorithms such as those that calculate, for example, the betweenness centrality of nodes (henceforth called deterministic algorithms). a deterministic algorithm always needs complete information about each node (i.e. either the number or the identity of all connected nodes for each node in the network). a comparison between algorithms is therefore of limited use if they are not of the same type, as they have to work with different inputs. clearly, a deterministic algorithm with information on the full network structure as input should outperform a stochastic algorithm that is agnostic about the full network structure. thus, we restrict our comparison of cbf to the acquaintance method, since this is the only stochastic algorithm we are aware of that takes as input the same limited amount of local information. in the computationally generated networks, cbf outperformed the acquaintance method in large areas of the parameter space (figure 4d).
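the two stochastic strategies compared above can be sketched as follows. the acquaintance function follows the published description (immunize a contact that has been nominated n times); the cbf sketch follows only the verbal description given here, and the published algorithm includes additional checks (e.g. re-sampling neighbors) that are omitted in this simplification.

```python
import random

def acquaintance_targets(neighbors, coverage, n=1, rng=None):
    """Acquaintance method: pick a random node, then one of its contacts at
    random; a contact nominated n times is immunized. Repeat until the
    requested fraction of the population is immunized."""
    rng = rng or random.Random(0)
    nodes = list(neighbors)
    goal = max(1, int(coverage * len(nodes)))
    counts, immunized = {}, set()
    while len(immunized) < goal:
        v = rng.choice(nodes)
        if not neighbors[v]:
            continue
        u = rng.choice(neighbors[v])           # a random acquaintance of v
        counts[u] = counts.get(u, 0) + 1
        if counts[u] >= n:
            immunized.add(u)
    return immunized

def community_bridge_finder(neighbors, rng=None, max_steps=1000):
    """CBF sketch: follow a random walk and return the first node that
    connects back to at most one of the previously visited nodes, taken
    as a hint that the walk has crossed into another community."""
    rng = rng or random.Random(0)
    walk = [rng.choice(list(neighbors))]
    for _ in range(max_steps):
        nxt = rng.choice(neighbors[walk[-1]])
        back_links = sum(1 for u in neighbors[nxt] if u in walk)
        walk.append(nxt)
        if len(walk) > 2 and back_links <= 1:  # candidate community bridge
            return nxt
    return walk[-1]                            # fallback if no candidate found

# two triangles joined by the edge 2-3; nodes 2 and 3 are the community bridges
g = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
```

both functions use only local information (a node and its contact list), which is what makes them applicable when the full network is unknown.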
it may seem unintuitive at first that the acquaintance method outperforms cbf at very high values of modularity, but one should keep in mind that epidemic sizes are very small in those extremely modular networks (see figure 1a) because local outbreaks only rarely jump the community borders. if outbreaks are mostly restricted to single communities, then cbf is not the optimal strategy because immunizing community bridges is useless; the acquaintance method may at least find some well-connected nodes in each community and will thus perform slightly better in this extreme region of parameter space. in empirical networks, cbf did particularly well on the network with the strongest community structure (oklahoma), especially in comparison to the similarly effective acquaintance method with n = 2 (figure 5c). as immunization strategies should be deployed as fast as possible, the speed at which a certain fraction of the

figure 4. assessing the efficacy of targeted immunization strategies based on deterministic and stochastic algorithms in the computationally generated networks. the color code denotes the difference in the average final size s_m of disease outbreaks in networks that were immunized before the outbreak using method m. the top panel (a) shows the difference between the degree method and the betweenness centrality method, i.e. s_degree - s_betweenness. a positive difference (colored red to light gray) indicates that the betweenness centrality method resulted in smaller final sizes than the degree method. a negative difference (colored blue to black) indicates that the betweenness centrality method resulted in bigger final sizes than the degree method. if the difference is not bigger than 0.1% of the total population size, no color is shown (white). panel (a) shows that the betweenness centrality method is more effective than the degree-based method in networks with strong community structure (q is high).
(b) and (c): like (a), but showing s_degree - s_randomwalk (in (b)) and s_betweenness - s_randomwalk (in (c)). panels (b) and (c) show that the random walk method is the most effective method overall. panel (d) shows that the community bridge finder (cbf) method generally outperforms the acquaintance method (with n = 1) except when community structure is very strong (see main text). final epidemic sizes were obtained by running 2000 sir simulations per network, vaccination coverage and immunization method. doi:10.1371/journal.pcbi.1000736.g004

network can be immunized is an additional important aspect. we measured the speed of an algorithm as the number of nodes that the algorithm had to visit in order to achieve a certain vaccination coverage, and find that the cbf algorithm is faster than the similarly effective acquaintance method with n = 2 at vaccination coverages <30% (see figure 6). a great number of infectious diseases of humans spread directly from one person to another, and early work on the spread of such diseases was based on the assumption that every infected individual is equally likely to transmit the disease to any susceptible individual in a population. one of the most important consequences of incorporating network structure into epidemic models was the demonstration that heterogeneity in the number of contacts (degree) can strongly affect how r_0 is calculated [12, 13, 34]. thus, the same disease can exhibit markedly different epidemic patterns simply due to differences in the degree distribution. our results extend this finding and show that even in networks with the same degree distribution, fundamentally different epidemic dynamics are expected to be observed due to different levels of community structure.
this finding is important for various reasons: first, community structure has been shown to be a crucial feature of social networks [15, 16, 17, 19], and its effect on disease spread is thus relevant to infectious disease dynamics. furthermore, it corroborates earlier suggestions that community structure affects the spread of disease, and is the first to clearly isolate this effect from effects due to variance in the degree distribution [43]. second, and consequently, data on the degree distribution of contact networks will not be sufficient to predict epidemic dynamics. third, the design of control strategies benefits from taking community structure into account. an important caveat is that community structure in the sense used throughout this paper (i.e. measured as modularity q) does not explicitly take into account the extent to which communities overlap. such overlap is likely to play an important role in infectious disease dynamics, because people are members of multiple, potentially overlapping communities (households, schools, workplaces etc.). a strong overlap would likely be reflected in lower overall values of q; however, the exact effect of community overlap on infectious disease dynamics remains to be investigated. identifying important nodes to affect diffusion on networks is a key question in network theory that pertains to a wide range of fields and is not limited to infectious disease dynamics. there are, however, two major issues associated with this problem: (i) the structure of networks is often not known, and (ii) many networks are too large for quantities such as centrality measures to be computed efficiently. stochastic algorithms like the proposed cbf algorithm or the acquaintance method address both problems at once. to what extent targeted immunization strategies can be implemented in an infectious disease/public health setting, given practical and ethical considerations, remains an open question.
this is true not only for the strategy based on the cbf algorithm, but for most strategies that are based on network properties. as mentioned above, the contact networks relevant for the spread of infectious diseases are generally not known. stochastic algorithms such as the cbf or the acquaintance method are at least in principle applicable when data on network structure are lacking. community structure in host networks is not limited to human networks: animal populations are often divided into subpopulations, connected by limited migration only [44, 45]. targeted immunization of individuals connecting subpopulations has been shown to be an effective low-coverage immunization strategy for the conservation of endangered species [46]. under the assumption of homogeneous mixing, the elimination of a disease requires an immunization coverage of at least 1 − 1/r_0 [1], but such coverage is often difficult or even impossible to achieve due to limited vaccine supply, logistical challenges or ethical concerns. in the case of wildlife, high vaccination coverage is also problematic as vaccination interventions can be associated with substantial risks. little is known about the contact network structure in humans, let alone in wildlife, and progress should therefore be made on the development of immunization strategies that can deal with the absence of such data. stochastic algorithms such as the acquaintance method and the cbf method are important first steps in addressing the problem, but the large difference in efficacy between stochastic and deterministic algorithms demonstrates that there is still a long way to go. to investigate the spread of an infectious disease on a contact network, we use the following methodology: individuals in a population are represented as nodes in a network, and the edges between the nodes represent the contacts along which an infection can spread. contact networks are abstracted by undirected, unweighted graphs (i.e.
all contacts are reciprocal, and all contacts transmit an infection with the same probability). edges always link two distinct nodes (i.e. no self loops), and there is at most one edge between any pair of nodes (i.e. no parallel edges). each node can be in one of three possible states: (s)usceptible, (i)nfected, or (r)esistant/immune (as in standard sir models). initially, all nodes are susceptible. simulations with immunization strategies implement those strategies before the first infection occurs. targeted nodes are chosen according to a given immunization algorithm (see below) until a desired immunization coverage of the population is achieved, and then their state is set to resistant. after this initial set-up, a random susceptible node is chosen as patient zero, and its state is set to infected. then, during a number of time steps, the initial infection can spread through the network, and the simulation is halted once there are no further infected nodes.

figure 5. assessing the efficacy of targeted immunization strategies in empirical networks based on deterministic and stochastic algorithms. the bars show the difference in the average final size s_m of disease outbreaks (n cases) in networks that were immunized before the outbreak using method m. the left panels show the difference between the degree method and the random walk centrality method, i.e. s_degree − s_randomwalk. if the difference is positive (red bars), then the random walk centrality method resulted in smaller final sizes than the degree method. a negative value (black bars) means that the opposite is true. shaded bars show non-significant differences (assessed at the 5% level using the mann-whitney test). the middle and right panels are generated using the same methodology, but measuring the difference between the acquaintance method (with n = 1 in the middle column and n = 2 in the right column, see methods) and the community bridge finder (cbf) method, i.e. s_acquaintance1 − s_cbf and s_acquaintance2 − s_cbf. again, positive red bars mean that the cbf method results in smaller final sizes, i.e. prevents more cases, than the acquaintance methods, whereas negative black bars mean the opposite. final epidemic sizes were obtained by running 2000 sir simulations per network, vaccination coverage and immunization method. doi:10.1371/journal.pcbi.1000736.g005

at each time step (the unit of time we use is one day, i.e. a time step is one day), a susceptible node can get infected with probability 1 − exp(−b·i), where b is the transmission rate from an infected to a susceptible node, and i is the number of infected neighboring nodes. at each time step, infected nodes recover at rate c, i.e. the probability of recovery of an infected node per time step is c (unless noted otherwise, we use c = 0.2). if recovery occurs, the state of the recovered node is toggled from infected to resistant. unless mentioned otherwise, the transmission rate b is chosen such that r_0 ≈ (b/c)·d ≈ 3, where d is the mean network degree, i.e. the average number of contacts per node. for the networks used here, this approximation is in line with the result from static network theory [47], r_0 = t(⟨k²⟩/⟨k⟩ − 1), where ⟨k⟩ and ⟨k²⟩ are the mean degree and mean square degree, respectively, and where t is the average probability of disease transmission from a node to a neighboring node; t > 1 makes no sense, which explains the upper limit. furthermore, since temporal networks usually are effectively sparser (in terms of the number of possible infection events per time), the smallest β values will give similar results, which is the reason for the higher cutoff in this case.
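the simulation loop just described can be sketched in a few lines. this is a minimal illustration, not the authors' code: the dict-of-sets network representation and the function name are our own, and `gamma` plays the role of the recovery rate called c in the text. the per-step rules are exactly the ones given above: a susceptible node is infected with probability 1 − exp(−b·i), and an infected node recovers with probability gamma.

```python
import math
import random

def sir_outbreak(adj, beta, gamma, immunized=frozenset(), rng=None):
    """Run one discrete-time SIR outbreak; return the set of ever-infected nodes.

    adj: dict mapping node -> set of neighbor nodes (undirected contact network)
    beta: per-contact transmission rate; infection prob. is 1 - exp(-beta * I)
    gamma: per-step recovery probability (the rate called c in the text)
    immunized: nodes set to resistant before patient zero is chosen
    """
    rng = rng or random.Random()
    state = {v: "S" for v in adj}
    for v in immunized:
        state[v] = "R"
    susceptible = [v for v in adj if state[v] == "S"]
    if not susceptible:
        return set()
    patient_zero = rng.choice(susceptible)
    state[patient_zero] = "I"
    infected = {patient_zero}
    ever_infected = {patient_zero}
    while infected:  # halt once there are no further infected nodes
        new_infected, recovered = set(), set()
        for v in adj:  # synchronous update: infections from the current state
            if state[v] == "S":
                i = sum(1 for u in adj[v] if u in infected)  # infected neighbors
                if i and rng.random() < 1.0 - math.exp(-beta * i):
                    new_infected.add(v)
        for v in infected:
            if rng.random() < gamma:
                recovered.add(v)
        for v in new_infected:
            state[v] = "I"
        for v in recovered:
            state[v] = "R"
        infected = (infected | new_infected) - recovered
        ever_infected |= new_infected
    return ever_infected
```

the final epidemic size of one run is `len(ever_infected)`; averaging it over many runs per network, coverage and immunization method gives estimates comparable to the s_m values discussed above.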
for both temporal and static networks, we assume the outbreak starts at one randomly chosen node. analogously, in the temporal case we assume the disease is introduced with equal probability at any time throughout the sampling period. for every data set and set of parameter values, we sample 10^7 runs of epidemic simulations. as motivated in the introduction, we base our study on empirical temporal networks. all networks that we study record contacts between people and fall into two classes: human proximity networks and communication networks. proximity networks are, of course, most relevant for epidemic studies, but communication networks can serve as a reference (and it is interesting to see how general the results are over the two classes). the data sets consist of anonymized lists of two identification numbers in contact and the time since the beginning of the contact. many of the proximity data sets we use come from the sociopatterns project [17]. these data sets were gathered by people wearing radio-frequency identification (rfid) sensors that detect proximity within 1 to 1.5 m. one such data set comes from a conference, hypertext 2009 (conference 1) [18], another two from a primary school (primary school) [19], five from a high school (high school) [20], one from a hospital (hospital) [21], a set of five from an art gallery (gallery) [22], one from a workplace (office) [23], and one from members of five families in rural kenya [24]. the gallery data consist of several days, of which we use the first five. in addition to data gathered by rfid sensors, we also use data from the longer-range (around 10 m) bluetooth channel. the cambridge 1 [25] and 2 [26] data sets were measured by the bluetooth channel of sensors (imotes) worn by people in and around cambridge, uk. st andrews [27], conference 2 [25], and intel [25] are similar data sets tracing contacts at, respectively, the university of st.
andrews, the conference infocom 2006, and the intel research laboratory in cambridge, uk. the reality [28] and copenhagen bluetooth [29] data sets also come from bluetooth data, but from smartphones carried by university students. in the romania data, the wifi channel of smartphones was used to log the proximity between university students [30], whereas the wifi data set links students of a chinese university that are logged onto the same wifi router. for the diary data set, a group of colleagues and their family members self-recorded their contacts [31]. our final proximity data set, the prostitution network, comes from self-reported sexual contacts between female sex workers and their male sex buyers [32]. this is a special form of proximity network, since contacts represent more than just proximity. among the data sets from electronic communication, facebook comes from the wall posts at the social media platform facebook [33]. college is based on communication at a facebook-like service [34]. dating shows interactions at an early internet dating website [35]. messages and forum are similar records of interaction at a film community [36]. copenhagen calls and copenhagen sms consist of phone calls and text messages gathered in the same experiment as copenhagen bluetooth [29]. finally, we use four data sets of e-mail communication. one, e-mail 1, recorded all e-mails to and from a group of accounts [37]. the other three, e-mail 2 [38], 3 [39], and 4 [40], recorded e-mails within a set of accounts. we list basic statistics (sizes, sampling durations, etc.) of all the data sets in table i. to gain further insight into the network structures promoting the objective measures, we correlate the objective measures with quantities describing the position of a node in the static networks. since many of our networks are fragmented into components, we restrict ourselves to measures that are well defined for disconnected networks.
otherwise, in our selection, we strive to cover as many different aspects of node importance as we can. degree is simply the number of neighbors of a node. it is usually presented as the simplest measure of centrality and is one of the most discussed structural predictors of importance with respect to disease spreading [42]. (centrality is a class of measures of a node's position in a network that try to capture what a "central" node is; i.e., ultimately centrality is not more well defined than the vernacular word.) it is also a local measure in the sense that a node is able to estimate its own degree, which could be practical when evaluating sentinel surveillance in real networks. subgraph centrality is based on the number of closed walks a node is a member of. (a walk is like a path, except that it may revisit nodes and links.) the number of closed walks of length λ from node i to itself is given by (a^λ)_ii, where a is the adjacency matrix. reference [43] argues that the best way to weigh walks of different lengths together is through the formula c_s(i) = Σ_λ (a^λ)_ii / λ! = (e^a)_ii. as mentioned, several of the data sets are fragmented (even though the largest connected component dominates components of other sizes). in the limit of high transmission probabilities, all nodes in the component of the infection seed will be infected. in such a case it would make sense to place a sentinel in the largest component (where the disease most likely starts). closeness centrality builds on the assumption that a node that has, on average, short distances to other nodes is central [44]. here, the distance d(i, j) between nodes i and j is the number of links in the shortest path between the nodes.

table i. basic statistics of the empirical temporal networks. n is the number of nodes, c is the number of contacts, t is the total sampling time, ∆t is the time resolution of the data set, m is the number of links in the projected and thresholded static networks, and θ is the threshold.
the classical measure of closeness centrality of a node i is the reciprocal average distance between i and all other nodes. in a fragmented network, every node has some other node that it does not have a path to, meaning that the closeness centrality is ill defined. (assigning the distance infinity to disconnected pairs would give a closeness centrality of zero for all nodes.) a remedy for this is, instead of measuring the reciprocal average of distances, to measure the average reciprocal distance [45], c_c(i) = (1/(n − 1)) Σ_{j≠i} d^{−1}(i, j), where d^{−1}(i, j) = 0 if i and j are disconnected. we call this the harmonic closeness, by analogy to the harmonic mean. vitality measures are a class of network descriptors that capture the impact of deleting a node on the structure of the entire network [46, 47]. specifically, we measure the harmonic closeness vitality, or harmonic vitality for short. this is the change of the sum of reciprocal distances of the graph (thus, by analogy to the harmonic closeness, well defined even for disconnected graphs): c_v(i) = Σ_{j≠k} d^{−1}(j, k) / Σ_{j≠k∈g∖i} d^{−1}(j, k), where the denominator concerns the graph g with the node i deleted. if deleting i breaks many shortest paths, then the denominator decreases, and thus c_v(i) increases. a node whose removal disrupts many shortest paths would thus score high in harmonic vitality. our sixth structural descriptor is coreness. this measure comes out of a procedure called k-core decomposition. first, remove all nodes with degree k = 1. if this creates new nodes with degree one, delete them too. repeat this until there are no nodes of degree 1. then, repeat the above steps for larger k values. the coreness of a node is the last level k at which it is still present in the network during this process [48]. like for the static networks, in the temporal networks we measure the degree of the nodes. to be precise, we define the degree as the number of distinct other nodes a node is in contact with within the data set.
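two of the static descriptors above, harmonic closeness and coreness, can be sketched in pure python; the dict-of-sets adjacency representation and function names below are illustrative assumptions, not the authors' code.

```python
from collections import deque

def harmonic_closeness(adj, i):
    """Average reciprocal shortest-path distance from node i, counting
    unreachable nodes as 0, so it stays well defined on fragmented networks."""
    dist = {i: 0}
    queue = deque([i])
    while queue:  # plain BFS on an unweighted graph
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    n = len(adj)
    return sum(1.0 / d for v, d in dist.items() if v != i) / (n - 1)

def coreness(adj):
    """k-core decomposition by repeated peeling: remove nodes of degree <= k
    (propagating removals), then increase k; a node's coreness is the last k
    at which it was still present."""
    degree = {v: len(adj[v]) for v in adj}
    alive = set(adj)
    core = {}
    k = 0
    while alive:
        while True:
            peel = [v for v in alive if degree[v] <= k]
            if not peel:
                break
            for v in peel:
                core[v] = k
                alive.discard(v)
                for u in adj[v]:
                    if u in alive:
                        degree[u] -= 1
        k += 1
    return core
```

on a graph made of a triangle plus a disconnected edge, the triangle nodes get coreness 2 and the edge endpoints coreness 1, and the harmonic closeness of a triangle node is 2/(n − 1), illustrating that fragmentation is handled without any special cases.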
strength is the total number of contacts a node has participated in throughout the data set. unlike degree, it takes the number of encounters into account. temporal networks, in general, tend to be more disconnected than static networks. for node i to be connected to j in a temporal network, there has to be a time-respecting path from i to j, i.e., a sequence of contacts increasing in time that (if time is projected out) forms a path from i to j [7, 8]. thus two interesting quantities, corresponding to the component sizes of static networks, are the fraction of nodes reachable from a node by time-respecting paths forward in time (downstream component size) and backward in time (upstream component size) [49]. if a node is present only in the very early stage of the data, the sentinel will likely not be active by the time the outbreak happens. if a node is active only at the end of the data set, it would also be too late to discover an outbreak early. for these reasons, we measure statistics of the times of the contacts of a node. we measure the average time of all contacts a node participates in; the first time of a contact (i.e., when the node enters the data set); and the duration of the presence of a node in the data (the time between the first and last contact it participates in). we use a version of the kendall τ coefficient [50] to elucidate both the correlations between the three objective measures, and the correlations between the objective measures and the network structural descriptors. in its basic form, the kendall τ measures the difference between the number of concordant pairs (with a positive slope between them) and discordant pairs, relative to all pairs. there are a few different versions that handle ties in different ways. we count a pair of points whose error bars overlap as a tie and calculate τ = (n_c − n_d)/(n_c + n_d + n_t), where n_c is the number of concordant pairs, n_d is the number of discordant pairs, and n_t is the number of ties.
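the tie-aware coefficient can be sketched as follows. the normalization τ = (n_c − n_d)/(n_c + n_d + n_t) is our reading of the definition above; the error-bar overlap rule is abstracted into user-supplied tie predicates, since the text does not specify the error bars themselves.

```python
def kendall_tau(x, y, tie_x=lambda a, b: a == b, tie_y=lambda a, b: a == b):
    """Kendall tau with explicit tie handling:
    tau = (n_c - n_d) / (n_c + n_d + n_t).
    tie_x / tie_y decide when two values count as tied; overlapping error
    bars can be modeled by passing a tolerance-based predicate."""
    n_c = n_d = n_t = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if tie_x(x[i], x[j]) or tie_y(y[i], y[j]):
                n_t += 1
            elif (x[i] - x[j]) * (y[i] - y[j]) > 0:
                n_c += 1  # concordant: both differences have the same sign
            else:
                n_d += 1
    total = n_c + n_d + n_t
    return (n_c - n_d) / total if total else 0.0
```

identically ordered lists give τ = 1, reversed lists give τ = −1, and ties shrink |τ| toward zero without being counted as either concordant or discordant.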
we start by investigating the correlation between the three objective measures throughout the parameter space of the sir model for all our data sets. we use the time to detection and extinction as our baseline and compare the other two objective measures with it. in fig. 2, we plot the τ coefficient between t_x and t_d and between t_x and f_d. we find that for low enough values of β, the τ for all objective measures coincide. for very low β the disease just dies out immediately, so the measures are trivially equal: all nodes would be equally good sentinels in all three respects. for slightly larger β (for most data sets 0.01 < β < 0.1), both τ(t_x, t_d) and τ(t_x, f_d) are negative. this is a region where outbreaks typically die out early. for a node to have a low t_x, it needs to be where outbreaks are likely to survive, at least for a while. this translates to a large f_d, while for t_d, it would be beneficial to be as central as possible. if there are no extinction events at all, t_x and t_d are the same. for this reason, it is no surprise that, for most of the data sets, τ(t_x, t_d) becomes strongly positively correlated for large β values. the τ(t_x, f_d) correlation is negative (of a similar magnitude), meaning that for most data sets the different methods would rank the possible sentinels in the same order. for some of the data sets, however, the correlation never becomes positive even for large β values (like copenhagen calls and copenhagen sms). these networks are the most fragmented ones, meaning that one sentinel would be unlikely to detect the outbreak (since it probably happens in another component). this makes t_x rank the important nodes in a way similar to f_d, but since diseases that do reach a sentinel reach it faster in a small component than in a large one, t_x and t_d become anticorrelated. in fig. 3, we perform the same analysis but for temporal networks. the picture is to some extent similar, but also much richer.
just as for the case of static networks, τ(t_x, f_d) is always nonpositive, meaning the time to detection or extinction ranks the nodes in a way positively correlated with the frequency of detection. furthermore, like for the static networks, τ(t_x, t_d) can be both positively and negatively correlated. this means that there are regions where t_d ranks the nodes in the opposite way to t_x. these regions of negative τ(t_x, t_d) occur for low β and ν. for some data sets (for example the gallery data sets, dating, copenhagen calls, and copenhagen sms), the correlations are negative throughout the parameter space. among the data sets with a qualitative difference between the static and temporal representations, we find that prostitution and e-mail 1 both have strongly positive values of τ(t_x, t_d) for large β values in the static networks but moderately negative values for the temporal networks. in this section, we take a look at how network structures affect our objective measures. in fig. 4, we show the correlation between our three objective measures and the structural descriptors as a function of β for the office data set. panel (a) shows the results for the time to detection or extinction. there is a negative correlation between this measure and traditional centrality measures like degree or subgraph centrality. this is because t_x is a quantity one wants to minimize to find the optimal sentinel, whereas for all the structural descriptors a large value means that a node is a candidate sentinel node. we see that degree and subgraph centrality are the two quantities that best predict the optimal sentinel location, while coreness is also close (at around −0.65). this is in line with research showing that certain biological problems are better determined by degree than by more elaborate centrality measures [51]. overall, the τ curves are rather flat. this is partly explained by τ being a rank correlation. for t_d [fig.
4(b)], most curves change behavior around β = 0.2. this is the region where larger outbreaks can happen, so one can understand that there is a transition to a situation similar to t_x [fig. 4(a)]. f_d [fig. 4(c)] shows a behavior similar to t_d in that the curves start changing order, and what was a correlation at low β becomes an anticorrelation at high β. this anticorrelation is a special feature of this particular data set, perhaps due to its pronounced community structure. nodes of degree 0, 1, and 2 have strictly increasing values of f_d, but for some of the high-degree nodes (which all have f_d close to one) the ordering gets anticorrelated with degree, which makes kendall's τ negative. since rank-based correlations are more principled for the skew-distributed quantities common in networks, we keep them. we are currently investigating what creates these unintuitive anticorrelations among the high-degree nodes in this data set. next, we proceed with an analysis of all data sets. we summarize plots like fig. 4 by the structural descriptor with the largest magnitude of the correlation |τ|; see fig. 5. we can see that there is not one structural quantity that uniquely determines the ranking of nodes; there is not even one that dominates over the others. (1) degree is the strongest structural determinant of all objective measures at low β values. this is consistent with ref. [13]. (2) component size only occurs for large β. in the limit of large β, f_d is only determined by component size (if we extended the analysis to even larger β, subgraph centrality would have the strongest correlation for the frequency of detection). (3) harmonic vitality is relatively better as a structural descriptor for t_d, less so for t_x and f_d. t_x and f_d capture the ability to detect an outbreak before it dies out, so for these quantities one can imagine that more fundamental quantities like degree and component size are more important.
(4) subgraph centrality often shows the strongest correlation for intermediate values of β. this is interesting, but difficult to explain, since the rationale behind subgraph centrality builds on cycle counts and there is no direct process involving cycles in the sir model. (5) harmonic closeness rarely gives the strongest correlation. if it does, it is usually succeeded by coreness, and the data set is typically rather large. (6) data sets from the same category can give different results; perhaps college and facebook are the most conspicuous example. in general, however, similar data sets give similar results. the final observation can be extended. we see that, as β increases, one color tends to follow another. this is summarized in fig. 6, where we show transition graphs of the different structural descriptors such that the size of a symbol corresponds to its frequency in fig. 5, and the size of the arrows shows how often one structural descriptor is succeeded by another as β is increased. for t_x, degree and subgraph centrality are the most important structural descriptors, and the former is usually succeeded by the latter. for t_d, there is a common peculiar sequence of degree, subgraph centrality, coreness, component size, and harmonic vitality that is manifested as the peripheral, clockwise path of fig. 6(b). finally, f_d is similar to t_x, except that there is a rather common transition from degree to coreness, and harmonic vitality is, relatively speaking, a more important descriptor. in fig. 7, we show the figure for temporal networks corresponding to fig. 5. just like in the static case, even though every data set and objective measure is unique, we can make some interesting observations. (1) strength is most important for small ν and β. this is analogous to degree dominating the static networks at small parameter values. (2) upstream component size dominates at large ν and β. this is analogous to the component size of static networks.
since temporal networks tend to be more fragmented than static ones [49], this dominance at large outbreak sizes should be even more pronounced for temporal networks. (3) most of the variation happens in the direction of larger ν and β. in this direction, strength is succeeded by degree, which is succeeded by upstream component size. (4) as in the static case, and as seen in the analysis of figs. 5 and 7, t_x and f_d are qualitatively similar compared to t_d. (5) temporal quantities, such as the average and first times of a node's contacts, are commonly the strongest predictors of t_d. (6) when a temporal quantity is the strongest predictor of t_x or f_d, it is usually the duration. it is understandable that this has little influence on t_d, since for these measures what matters is the ability to be infected at all; a long duration is beneficial since it covers many starting times of the outbreak. (7) similar to the static case, most categories of data sets give consistent results, but some differ greatly (facebook and college are yet again a good example). the bigger picture these observations paint is that, for our problem, temporal and static networks behave rather similarly, meaning that the structures in time do not matter so much for our objective measures. at the same time, there is not only one dominant measure for all the data sets; rather, there are several structural descriptors that correlate most strongly with the objective measures, depending on ν and β. in this paper, we have investigated three different objective measures for optimizing sentinel surveillance: the time to detection or extinction, the time to detection (given that detection happens), and the frequency of detection.
each of these measures corresponds to a public health scenario: the time to detection or extinction is most interesting to minimize if one wants to halt the outbreak as quickly as possible, and the frequency of detection is most interesting if one wants to monitor the epidemic status as accurately as possible. the time to detection is interesting if one wants to detect the outbreak early (or else it is not important), which could be the case if manufacturing new vaccine is relatively time consuming. we investigate these cases for 38 temporal network data sets and static networks derived from the temporal networks. our most important finding is that, for some regions of parameter space, our three objective measures can rank nodes very differently. this comes from the fact that sir outbreaks have a large chance of dying out in the very early phase [52], but once they get going they follow a deterministic path. for this reason, it is important to be aware of what scenario one is investigating when addressing the sentinel surveillance problem. another conclusion is that, for this problem, static and temporal networks behave reasonably similarly (meaning that the temporal effects do not matter so much). naturally, some of the temporal networks respond differently from the static ones, but compared to, e.g., the outbreak sizes or time to extinction [53] [54] [55], the differences are small. among the structural descriptors of network position, there is no particular one that dominates throughout the parameter space. rather, local quantities like degree or strength (for the temporal networks) have higher predictive power at low parameter values (small outbreaks). for larger parameter values, descriptors capturing the number of nodes reachable from a specific node correlate most strongly with the rankings of the objective measures. also in this sense, the static network quantities dominate the temporal ones, which is in contrast to previous observations (e.g., refs. [53] [54] [55]).
for the future, we anticipate work on the problem of optimizing sentinel surveillance. an obvious continuation of this work would be to establish the differences between the objective measures in static network models. to do the same in temporal networks would also be interesting, although more challenging given the large number of imaginable structures. yet another open problem is how to distribute sentinels if there is more than one. it is known that they should be relatively far apart [13], but more precisely, where should they be located? we thank sune lehmann for providing the copenhagen data sets. this work was supported by jsps kakenhi grant no. jp18h01655.
key: cord-034824-eelqmzdx authors: guo, chungu; yang, liangwei; chen, xiao; chen, duanbing; gao, hui; ma, jing title: influential nodes identification in complex networks via information entropy date: 2020-02-21 journal: entropy (basel) doi: 10.3390/e22020242 sha: doc_id: 34824 cord_uid: eelqmzdx

identifying a set of influential nodes is an important topic in complex networks, playing a crucial role in many applications such as market advertising, rumor controlling, and predicting valuable scientific publications. in regard to this, researchers have developed algorithms ranging from simple degree methods to all kinds of sophisticated approaches. however, a more robust and practical algorithm is required for the task. in this paper, we propose the enrenew algorithm, aimed at identifying a set of influential nodes via information entropy. firstly, the information entropy of each node is calculated as its initial spreading ability. then, the node with the largest information entropy is selected and its l-length reachable nodes' spreading ability is renewed by an attenuation factor; this process is repeated until a specific number of influential nodes is selected. compared with the best state-of-the-art benchmark methods, the performance of the proposed algorithm improved by 21.1%, 7.0%, 30.0%, 5.0%, 2.5%, and 9.0% in final affected scale on the cenew, email, hamster, router, condmat, and amazon networks, respectively, under the susceptible-infected-recovered (sir) simulation model. the proposed algorithm measures the importance of nodes based on information entropy and selects a group of important nodes through a dynamic update strategy. the impressive results on the sir simulation model shed light on a new method of node mining in complex networks for information spreading and epidemic prevention. complex networks are common in real life and can be used to represent complex systems in many fields.
for example, collaboration networks [1] are used to cover the scientific collaborations between authors, email networks [2] denote the email communications between users, protein-dna networks [3] help people gain a deep insight into biochemical reactions, railway networks [4] reveal the structure of railways via complex network methods, social networks show interactions between people [5, 6], and the international trade network [7] reflects the trade of products between countries. a deep understanding and control of different complex networks is of great significance for information spreading and network connectivity. on one hand, by using the influential nodes, we can make successful advertisements for products [8], discover drug target candidates, assist information weighted networks [54] and social networks [55]. however, the node set built by simply assembling and sorting nodes, as employed by the aforementioned methods, may not be comparable to an elaborately selected set of nodes due to the rich club phenomenon [56], namely, that important nodes tend to overlap with each other. thus, many methods that aim to directly select a set of nodes have been proposed. kempe et al. defined the problem of identifying a set of influential spreaders in complex networks as the influence maximization problem [57], and they used a hill-climbing based greedy algorithm that is within 63% of optimal in several models. the greedy method [58] is usually taken as the approximate solution of the influence maximization problem, but it is not efficient due to its high computational cost. chen et al. [58] proposed the newgreedy and mixedgreedy methods. borgatti [59] specified mining influential spreaders in social networks by two classes, kpp-pos and kpp-neg, based on which he calculated the importance of nodes. narayanam et al. [60] proposed the spin algorithm based on the shapley value to deal with the information diffusion problem in social networks.
although the above greedy-based methods can achieve relatively better results, they cost a great deal of time on monte carlo simulation, so more heuristic algorithms were proposed. chen et al. put forward the simple and efficient degreediscount algorithm [58] , in which, if one node is selected, its neighbors' degree is discounted. zhang et al. proposed voterank [61] , which selects the influential node set via a voting strategy. zhao et al. [62] introduced coloring technology into complex networks to separate independent node sets, and selected nodes from different node sets, ensuring the selected nodes are not closely connected. hu et al. [63] and guo et al. [64] further considered the distance between independent sets and achieved a better performance. bao et al. [65] sought to find dispersively distributed spreaders by a heuristic clustering algorithm. zhou [66] proposed an algorithm to find a set of influential nodes via message passing theory. ji et al. [67] considered percolation in the network to obtain a set of distributed and coordinated spreaders. researchers also seek to maximize the influence by studying communities [68] [69] [70] [71] [72] [73] . zhang [74] separated graph nodes into communities by using the k-medoid method before selecting nodes. gong et al. [75] divided the graph into communities of different sizes, and selected nodes by using degree centrality and other indicators. chen et al. [76] detected communities by using the shrink and kcut algorithms; they then selected nodes from different communities as candidate nodes, and used the cdh method to find the final k influential nodes. recently, some novel methods based on node dynamics have been proposed which rank nodes to select influential spreaders [77, 78] . şirag erkol et al. made a systematic comparison between methods focused on the influence maximization problem [79] : they classified multiple algorithms into three classes, and made a detailed explanation and comparison between methods.
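to make the discount idea described above concrete, the following python sketch implements a simplified single-discount variant of degreediscount (the full algorithm in the literature also involves the propagation probability; the adjacency-dict graph encoding is an assumption of this sketch, not from the paper):

```python
def degree_discount(adj, r):
    """Select r seed nodes: repeatedly take the node with the highest
    discounted degree, then discount each unselected neighbor's degree by 1.
    Simplified single-discount variant of the DegreeDiscount heuristic."""
    dd = {v: len(adj[v]) for v in adj}  # discounted degree, starts at degree
    seeds = set()
    while len(seeds) < r:
        v = max((u for u in dd if u not in seeds), key=dd.get)
        seeds.add(v)
        for w in adj[v]:
            if w not in seeds:
                dd[w] -= 1
    return seeds
```

on a star graph the center is picked first and every leaf's discounted degree drops to zero, so the heuristic avoids piling seeds into one hub's neighborhood only when ties are broken elsewhere; the full variant with the propagation probability handles this more carefully.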
more algorithms in this domain are described and classified clearly by lü et al. in their review paper [80] . most of the non-greedy methods suffer from the possibility that some spreaders are so close that their influence may overlap. degreediscount and voterank use an iterative selection strategy: after a node is selected, they weaken its neighbors' influence to cope with the rich club phenomenon. however, these two algorithms only crudely incorporate nodes' local information; besides, they do not further make use of the differences between nodes when weakening nodes' influence. in this paper, we propose a new heuristic algorithm named enrenew, based on node entropy, to select a set of influential nodes. enrenew also uses an iterative selection strategy. it initially calculates the influence of each node by its information entropy (further explained in section 2.2), and then repeatedly selects the node with the largest information entropy and renews its l-length reachable nodes' information entropy by an attenuation factor until the specified number of nodes has been selected. experiments show that the proposed method yields the largest final affected scale on 6 real networks in the susceptible-infected-recovered (sir) simulation model compared with state-of-the-art benchmark methods. the results reveal that enrenew could be a promising tool for related work. besides, to make the algorithm practically more useful, we provide enrenew's source code and all the experiment details at https://github.com/yangliangwei/influential-nodes-identification-in-complex-networksvia-information-entropy, and researchers can download it freely for their convenience. the rest of the paper is organized as follows: the identifying method is presented in section 2. experiment results are analyzed and discussed in section 3. conclusions and future research topics of interest are given in section 4.
the best way to measure the influence of a set of nodes in complex networks is through a propagation dynamic process on real-life network data. the susceptible-infected-recovered model (sir model) was initially used to simulate the dynamics of disease spreading [23] . it was later widely used to analyze similar spreading processes, such as rumors [81] and populations [82] . in this paper, the sir model is adopted to objectively evaluate the spreading ability of the nodes selected by the algorithms. each node in the sir model can be classified into one of three states, namely, susceptible nodes (s), infected nodes (i), and recovered nodes (r). at first, the initially selected nodes are set to the infected state and all others in the network to the susceptible state. in each propagation iteration, each infected node randomly chooses one of its direct neighbors and infects it with probability µ. in the meantime, each infected node recovers with probability β and cannot be infected again. in this study, λ = µ/β is defined as the infected rate, which is crucial to the spreading speed in the sir model. apparently, the network reaches a steady stage with no infection after enough propagation iterations. to enable information to spread widely in networks, we set µ = 1.5µ_c, where µ_c = ⟨k⟩/(⟨k²⟩ − ⟨k⟩) [83] is the spreading threshold of sir and ⟨k⟩ is the average degree of the network. when µ is smaller than µ_c, spreading in sir can only affect a small range or even cannot spread at all. when it is much larger than µ_c, nearly all methods can affect the whole network, which would be meaningless for comparison. thus, we select µ around µ_c in the experiments. during the sir propagation mentioned above, enough information can be obtained to evaluate the impact of the initially selected nodes in the network, and the metrics derived from the procedure are explained in section 2.4. the influential nodes selecting algorithm proposed in this paper is named enrenew, deduced from the concept of the algorithm.
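the sir procedure and the threshold µ_c described above can be sketched in python as follows; the adjacency-dict graph encoding and the toy usage are illustrative assumptions, not the paper's implementation:

```python
import random

def sir_spread(adj, seeds, mu, beta, rng=None):
    """One SIR realization; returns the final affected fraction f(t_c).

    adj: dict node -> list of neighbors. In each step every infected node
    picks one random neighbor, infects it with probability mu if it is
    still susceptible, and then recovers with probability beta."""
    rng = rng or random.Random(0)
    infected, recovered = set(seeds), set()
    while infected:
        new_inf, new_rec = set(), set()
        for v in infected:
            u = rng.choice(adj[v])
            if u not in infected and u not in recovered and rng.random() < mu:
                new_inf.add(u)
            if rng.random() < beta:
                new_rec.add(v)
        infected = (infected | new_inf) - new_rec
        recovered |= new_rec
    return len(recovered) / len(adj)

def sir_threshold(adj):
    """mu_c = <k> / (<k^2> - <k>), the spreading threshold used in the text."""
    degs = [len(nbrs) for nbrs in adj.values()]
    k1 = sum(degs) / len(degs)            # first moment <k>
    k2 = sum(d * d for d in degs) / len(degs)  # second moment <k^2>
    return k1 / (k2 - k1)
```

for a star graph with center 0 and four leaves, ⟨k⟩ = 1.6 and ⟨k²⟩ = 4.0, so µ_c = 1.6/2.4 ≈ 0.67.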
enrenew introduces entropy and renews the nodes' entropy through an iterative selection process. enrenew is inspired by the voterank algorithm proposed by zhang et al. [61] , where the influential nodes are selected in an iterative voting procedure. voterank assigns each node a voting ability and a score. initially, each node's voting ability to its neighbors is 1. after a node is selected, its direct neighbors' voting ability is decreased by 1/⟨k⟩, where ⟨k⟩ = 2m/n is the average degree of the network. voterank roughly assigns all nodes in the graph the same voting ability and attenuation factor, which ignores the nodes' local information. to overcome this shortcoming, we propose a heuristic algorithm named enrenew, described as follows. in information theory, information quantity measures the information brought about by a specific event, and information entropy is the expectation of the information quantity. these two concepts are introduced into complex networks in references [44] [45] [46] to calculate the importance of nodes. the information entropy of any node v can be calculated by e_v = ∑_{u∈γ_v} h_uv, with h_uv = −p_uv log p_uv and p_uv = d_u / ∑_{l∈γ_v} d_l (so that ∑_{l∈γ_v} p_lv = 1), where γ_v indicates node v's direct neighbors and d_u is the degree of node u. h_uv is the spreading ability provided from u to v. e_v is node v's information entropy, indicating its initial importance, which is renewed as described in algorithm 1. a detailed calculation of node entropy is shown in figure 1 . it shows how the red node's (node 1) entropy is calculated in detail: node 1 has four neighbors, from node 2 to node 5, and its information entropy is the sum of the corresponding four terms. simply selecting the nodes with the largest degree as initial spreaders might not achieve good results, because most real networks have an obvious clumping phenomenon, that is, high-impact nodes in the network are often closely connected within the same community, so information cannot be copiously disseminated to the whole network.
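the entropy formula above can be sketched in python as follows; the base-2 logarithm and the adjacency-dict encoding are assumptions of this sketch:

```python
from math import log2

def node_entropy(adj, v):
    """E_v = sum over neighbors u of H_uv, with H_uv = -p_uv * log2(p_uv)
    and p_uv = d_u / (sum of degrees of v's direct neighbors)."""
    deg = {u: len(adj[u]) for u in adj}
    total = sum(deg[u] for u in adj[v])  # normalizer over v's neighborhood
    e = 0.0
    for u in adj[v]:
        p = deg[u] / total
        e -= p * log2(p)
    return e
```

for the center of a star with four degree-1 leaves, each p_uv = 1/4, so e_v = 4 · (1/4) · log2(4) = 2 bits, while each leaf (whose single neighbor carries all the weight, p = 1) gets entropy 0.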
to manage this situation, after each high-impact node is selected, we renovate the information entropy of all nodes in its local scope and then select the node with the highest information entropy; this process is shown in algorithm 1. there, e_⟨k⟩ = −⟨k⟩ · (1/⟨k⟩) · log(1/⟨k⟩) = log⟨k⟩, where ⟨k⟩ is the average degree of the network, and 1/2^(l−1) is the attenuation factor: the farther a node is from node v, the smaller the impact on it will be. e_⟨k⟩ can be seen as the information entropy of any node in a ⟨k⟩-regular graph if ⟨k⟩ is an integer. from algorithm 1, we can see that after a new node is selected, the renewal of its l-length reachable nodes' information entropy is related to h and e_⟨k⟩, which reflect local structure information and global network information, respectively. compared with voterank, enrenew replaces the voting ability by the h value between connected nodes, which incorporates more local information than directly setting the voting ability to 1 as in voterank. at the same time, enrenew uses h/e_⟨k⟩ as the attenuation factor instead of voterank's 1/⟨k⟩, retaining global information. computational complexity (usually time complexity) describes the relationship between inputs of different scales and the running time of an algorithm. generally, brute force can solve most problems accurately, but it cannot be applied in most scenarios because of its intolerable time complexity; time complexity is thus an extremely important indicator of an algorithm's effectiveness. through analysis, the algorithm is shown to be able to identify influential nodes in large-scale networks in limited time. the computational complexity of enrenew can be analyzed in three parts: initialization, selection, and renewing. n, m and r represent the number of nodes, edges and initial infected nodes, respectively. at the start, enrenew takes o(n·⟨k⟩) = o(m) for calculating the information entropy.
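a schematic of the select-and-renew loop described above can be written as follows. this is an illustrative approximation of algorithm 1, not a faithful reimplementation: the exact per-path renewal bookkeeping in the paper may differ, and the damping applied here (entropy reduced by ent/e_k scaled by 1/2^(d−1) at hop distance d) is an assumption of the sketch:

```python
from math import log2

def enrenew_select(adj, entropy, r, l=2):
    """Pick r spreaders: take the node with the largest entropy, then damp
    the entropy of its l-hop neighborhood with attenuation 1/2**(d-1),
    scaled by 1/E_k where E_k = log2(<k>). Schematic version of Algorithm 1."""
    ent = dict(entropy)
    k_avg = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
    e_k = log2(k_avg) if k_avg > 1 else 1.0  # entropy of a <k>-regular node
    seeds = []
    for _ in range(r):
        v = max(ent, key=ent.get)
        seeds.append(v)
        frontier, seen, d = [v], {v}, 1
        while frontier and d <= l:          # BFS out to l hops
            nxt = []
            for x in frontier:
                for w in adj[x]:
                    if w not in seen and w in ent:
                        seen.add(w)
                        ent[w] -= ent[w] / e_k * (1.0 / 2 ** (d - 1))
                        nxt.append(w)
            frontier, d = nxt, d + 1
        del ent[v]  # a selected node is never reselected
    return seeds
```

the point of the damping is visible on small examples: once a hub is taken, its neighborhood's entropy drops, so the next pick tends to land in a different part of the graph.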
node selection selects the node with the largest information entropy and requires o(n), which can be further decreased to o(log n) if the values are stored in an efficient data structure such as a red-black tree. renewing the l-length reachable nodes' information entropy needs o(⟨k⟩^l) = o(m^l/n^l). as suggested in section 3.3, l = 2 yields impressive results with o(m²/n²). since the selection and renewing parts need to be performed r times to get enough spreaders, the final computational complexity is o(m + n) + o(r log n) + o(r⟨k⟩²) = o(m + n + r log n + r·m²/n²). in particular, when the network is sparse and r ≪ n, the complexity decreases to o(n). the algorithm's performance is measured by the selected nodes' properties, including their spreading ability and location property. spreading ability can be measured by the infected scale f(t) at time t and the final infected scale f(t_c), which are obtained from the sir simulation and widely used to measure the spreading ability of nodes [61, [84] [85] [86] [87] [88] . l_s is obtained from the selected nodes' location property by measuring their dispersion [61] . the infected scale f(t) demonstrates the influence scale at time t and is defined by f(t) = (n_i(t) + n_r(t))/n, where n_i(t) and n_r(t) are the numbers of infected and recovered nodes at time t, respectively. at the same time step t, a larger f(t) indicates more nodes are infected by the initial influential nodes, while a shorter time t indicates the initial influential nodes spread faster in the network. f(t_c) is the final affected scale when the spreading reaches the stable state; it reflects the final spreading ability of the initial spreaders, and the larger the value, the stronger the spreading capacity of the initial nodes. f(t_c) is defined by f(t_c) = n_r(t_c)/n, where t_c is the time when the sir propagation procedure reaches its stable state. l_s is the average shortest path length of the initial infection set s. usually, with a larger l_s , the initial spreaders are more dispersed and can influence a larger range.
this can be defined by l_s = (1/(|s|(|s|−1))) ∑_{u,v∈s, u≠v} l_{u,v}, where l_{u,v} denotes the length of the shortest path from node u to v. if u and v are disconnected, the shortest path length is replaced by d_gc + 1, where d_gc is the largest diameter of the connected components. an example network, shown in figure 2 , is used to show the rationality of the nodes the proposed algorithm chooses. the first three nodes selected by enrenew are distributed in three communities, while those selected by the other algorithms are not. we further run the sir simulation on the example network with enrenew and five other benchmark methods. the detailed result is shown in table 1 for an in-depth discussion; this result is obtained by averaging 1000 experiments. figure 2 . this network consists of three communities at different scales. the first nine nodes selected by enrenew are marked red. the network typically shows the rich club phenomenon, that is, nodes with large degree tend to be connected together. table 2 shows the experiment results when choosing 9 nodes as the initial spreading set. the greedy method is usually used as the upper bound, but it is not efficient in large networks due to its high time complexity. enrenew and pagerank distribute 4 nodes in community 1, 3 nodes in community 2, and 1 node in community 3; the distribution matches the sizes of the communities. however, the nodes selected by the other algorithms, except for the greedy method, tend to cluster in community 1. this induces spreading within a high-density area, which is not efficient for spreading over the entire network. enrenew and pagerank can adaptively allocate a reasonable number of nodes based on the size of each community, just as the greedy method does. the nodes selected by enrenew have the second-largest average distance after greedy, which indicates that enrenew tends to distribute nodes sparsely in the graph; this aptly alleviates the adverse effect on spreading caused by the rich club phenomenon.
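the dispersion metric l_s defined at the start of this passage can be sketched with a pure-python bfs; the d_gc + 1 penalty for disconnected pairs follows the definition in the text, while the adjacency-dict encoding is an assumption of the sketch:

```python
from collections import deque

def avg_shortest_path(adj, seeds):
    """L_s: mean shortest-path length over ordered pairs of seed nodes;
    a disconnected pair scores d_gc + 1 (largest component diameter + 1)."""
    def bfs(src):
        dist = {src: 0}
        q = deque([src])
        while q:
            x = q.popleft()
            for w in adj[x]:
                if w not in dist:
                    dist[w] = dist[x] + 1
                    q.append(w)
        return dist
    # largest diameter among connected components, for the penalty term
    d_gc = max(max(bfs(v).values()) for v in adj)
    total, pairs = 0, 0
    for u in seeds:
        dist = bfs(u)
        for v in seeds:
            if v != u:
                total += dist.get(v, d_gc + 1)
                pairs += 1
    return total / pairs
```

running bfs from every node to get d_gc costs o(n·m); for the small seed sets used here that is acceptable, and a production version would cache the distances.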
although enrenew's average distance is smaller than pagerank's, it has a higher final infected scale f(t_c). the result on pagerank also indicates that merely selecting nodes widely spread across the network may not lead to a larger influence range. enrenew performs the closest to greedy with a low computational cost, which shows the proposed algorithm's effectiveness in maximizing influence with limited nodes. note: n and m are the total numbers of nodes and edges, respectively; ⟨k⟩ = 2m/n stands for the average node degree; k_max = max_v d_v is the max degree in the network; and the average clustering coefficient c measures the degree of aggregation in the network, c = (1/n) ∑_{i=1}^{n} 2i_i/(|γ_i|(|γ_i|−1)), where i_i denotes the number of edges between direct neighbors of node i. table 2 describes six different networks, varying from small to large scale, which are used to evaluate the performance of the methods. cenew [89] is a list of edges of the metabolic network of c.elegans. email [90] is an email user communication network. hamster [91] is a network reflecting friendship and family links between users of the website http://www.hamsterster.com, where nodes represent web users and edges the relationships between them. the router network [92] reflects the internet topology at the router level. condmat (condensed matter physics) [93] is a collaboration network of authors of scientific papers from the arxiv; it shows the author collaborations in papers submitted to condensed matter physics, where a node represents an author and an edge between two nodes shows that the two authors have collaboratively published papers. in the amazon network [94] , each node represents a product, and an edge between two nodes indicates that the two products were frequently purchased together. we first conduct experiments on the parameter l, which is the influence range used when renewing the information entropy.
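the summary statistics used in the table note (⟨k⟩, k_max and the average clustering coefficient c) can be computed as follows; the adjacency-dict encoding is an assumption of this sketch:

```python
def network_stats(adj):
    """Return <k> = 2m/n, k_max, and the average clustering coefficient
    C = (1/n) * sum_i 2*I_i / (|G_i|*(|G_i|-1)),
    where I_i is the number of edges among node i's direct neighbors."""
    n = len(adj)
    degs = {v: len(adj[v]) for v in adj}
    m = sum(degs.values()) // 2  # each undirected edge counted twice
    c_sum = 0.0
    for v in adj:
        nbrs = adj[v]
        k = len(nbrs)
        if k < 2:
            continue  # clustering undefined for degree < 2, counted as 0
        links = sum(1 for i, a in enumerate(nbrs)
                    for b in nbrs[i + 1:] if b in adj[a])
        c_sum += 2 * links / (k * (k - 1))
    return 2 * m / n, max(degs.values()), c_sum / n
```

on a triangle this gives ⟨k⟩ = 2, k_max = 2 and c = 1, since every neighbor pair is itself connected.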
if l = 1, only the direct neighbors of the selected node have their importance renewed; if l = 2, the importance of 2-length reachable nodes is renewed, and so forth. the results with the parameter l varying from 1 to 4 on four networks are shown in figure 3 . it can be seen from figure 3 that, when l = 2, the method gets the best performance in four of the six networks. in the email network, although the results for l = 3 and l = 4 are slightly better than for l = 2, the running time increases sharply. besides, the three degrees of influence (tdi) theory [95] also states that an individual's social influence only extends within a relatively small range. based on our experiments, we set the influence range parameter l to 2 in the remaining experiments. for a specific ratio of initial infected nodes p, a larger final affected scale f(t_c) means a more reasonable choice of the parameter l. the best parameter l differs between networks; in real-life applications, l can be used as a tuning parameter. many factors affect the final propagation scale in networks. a good influential nodes mining algorithm should prove its robustness on networks varying in structure, node count, initial infection set size, infection probability, and recovery probability. to evaluate the performance of enrenew, the voterank, adaptive degree, k-shell, pagerank, and h-index algorithms are selected as benchmark methods for comparison. furthermore, the greedy method is usually taken as the upper bound for the influence maximization problem, but it is not practical on large networks due to its high time complexity; thus, we added the greedy method as the upper bound on the two small networks (cenew and email). the final affected scale f(t_c) of each method for different initial infected sizes is shown in figure 4 . it can be seen that enrenew achieves impressive results on the six networks.
on the small networks, such as cenew and email, enrenew has an apparently better result than the other benchmark methods; besides, it nearly reaches the upper bound on the email network. in the hamster network, it achieves an f(t_c) of 0.22 with a ratio of only 0.03 initial infected nodes, which is a huge improvement over all the other methods. in the condmat network, the number of affected nodes is nearly 20 times the number of initial ones. in the large amazon network, 11 nodes are affected on average for each selected initial infected node. but the algorithm performs unsatisfactorily on the router network: none of the methods yielded good results, due to the highly sparse structure of the network, in which information can hardly spread out from a small number of initial spreaders. comparing the 6 methods in figure 4 , enrenew surpasses all the other methods on five networks for nearly all values of p, varying from small to large. this result reveals that when the size of the initial infected node set varies, enrenew retains its superiority over all the other methods. what is worth noticing is that enrenew performs about the same as the other methods when p is small, but shows a greater improvement as the initial infected ratio p rises. this phenomenon shows the rationality of the importance renewing process: the renewing process of enrenew influences more nodes when p is larger, and the greater improvement of enrenew over the other methods shows that the renewing process reasonably redistributes the nodes' importance. a timestep experiment is made to assess the propagation speed given a fixed number of initial infected nodes. the exact results of f(t) varying with time step t are shown in figure 5 . from the experiment, it can be seen that with the same number of initial infected nodes, enrenew always reaches a higher peak than the benchmark methods, which indicates a larger final infection rate.
in the steady stage, enrenew surpasses the best benchmark method by 21.1%, 7.0%, 30.0%, 5.0%, 2.5% and 9.0% in final affected scale on the cenew, email, hamster, router, condmat and amazon networks, respectively. in terms of propagation speed, enrenew reaches the peak at about the 300th time step in cenew, the 200th in email, the 400th in hamster, the 50th in router, the 400th in condmat and the 150th in amazon. enrenew always takes less time to influence the same number of nodes compared with the other benchmark methods. from figure 5 , it can also be seen that k-shell performs worst from the early stage onward in all the networks: nodes with high core values tend to cluster together, which makes information hard to disseminate. especially in the amazon network, after 100 timesteps, all the other methods reach an f(t) of 0.0028, which is more than twice as large as k-shell's. in contrast to k-shell, enrenew spreads the fastest from the early stage to the steady stage. this shows that the proposed method not only achieves a larger final infection scale, but also has a faster rate of propagation. in real-life situations, the infected rate λ varies greatly and has a huge influence on the propagation procedure; different λ represent viruses or information with different spreading abilities. the results for different λ and methods are shown in figure 6 . from the experiments, it can be observed that in most cases, enrenew surpasses all the other algorithms with λ varying from 0.5 to 2.0 on all networks. besides, the experiment results on cenew and email show that enrenew nearly reaches the upper bound. this shows that enrenew has a stronger generalization ability compared with the other methods; in particular, enrenew shows impressive superiority in the strong-spreading experiments where λ is large. generally speaking, if the selected nodes are widely spread in the network, they tend to have an extensive influence on information spreading over the entire network.
l_s is used to measure the dispersity of the initial infected nodes chosen by each algorithm. figure 7 shows the l_s of the nodes selected by the different algorithms on 6 different networks. it can be seen that, except for the amazon network, enrenew always has the largest l_s , indicating the wide spread of the selected nodes. especially in cenew, enrenew performs far beyond all the other methods, as its l_s is nearly as large as the upper bound. in regard to the large-scale amazon network, the network contains lots of small cliques and k-shell selects the dispersed cliques, which gives k-shell the largest l_s ; but the other experimental results of k-shell show a poor performance. this further confirms that enrenew does not naively distribute the selected nodes widely across the network, but rather selects them based on the potential propagation ability of each node. figure 5 . this experiment compares the different methods with regard to spreading speed. each subfigure shows the experiment results on one network. the ratio of initial infected nodes is 3% for cenew, email, hamster and router, 0.3% for condmat and 0.03% for amazon. the results are obtained by averaging over 100 independent runs with spread rate λ = 1.5 in sir. at the same spreading time t, a larger f(t) indicates a larger influence scale in the network, which reveals a faster spreading speed. it can be seen from the figures that enrenew spreads apparently faster than the other benchmark methods on all networks; on the small networks cenew and email, enrenew's spreading speed is close to the upper bound. figure 6 . this experiment tests the algorithms' effectiveness under different spreading conditions. each subfigure shows the experiment results on one network. the ratio of initial infected nodes is 3% for cenew, email, hamster and router, 0.3% for condmat, and 0.03% for amazon. the results are obtained by averaging over 100 independent runs. different infected rates λ of sir imitate different spreading conditions.
enrenew gets a larger final affected scale f(t_c) for different λ than all the other benchmark methods, which indicates that the proposed algorithm generalizes better to different spreading conditions. figure 7 . this experiment analyzes the average shortest path length l_s of the nodes selected by the different algorithms. each subfigure shows the experiment results on one network. p is the ratio of initial infected nodes. generally speaking, a larger l_s indicates the selected nodes are more sparsely distributed in the network. it can be seen that the nodes selected by enrenew have the clearly largest l_s on five networks, which shows that enrenew tends to select sparsely distributed nodes. the influential nodes identification problem has been widely studied by scientists from computer science through to all disciplines [96] [97] [98] [99] [100] . various algorithms have been proposed that aim to solve particular problems in this field. in this study, we proposed a new method named enrenew by introducing entropy into complex networks, and the sir model was adopted to evaluate the algorithms. experimental results on 6 real networks, varying from small to large in size, show that enrenew is superior to state-of-the-art benchmark methods in most cases. besides, with its low computational complexity, the presented algorithm can be applied to large-scale networks. the enrenew algorithm proposed in this paper can also be well applied in rumor controlling, advertisement targeting, and many other related areas. but for influential nodes identification, there still remain many challenges from different perspectives. from the perspective of network size, how to efficiently mine influential spreaders in large-scale networks is a challenging problem. in the area of time-varying networks, most of these networks are constantly changing, which poses the challenge of identifying influential spreaders, since they could shift with the changing topology.
in the way of multilayer networks, it contains information from different dimensions with interaction between layers and has attracted lots of research interest [101] [102] [103] . to identify influential nodes in multilayer networks, we need to further consider the method to better combine information from different layers and relations between them. the scientific collaboration networks in university management in brazil arenas, a. self-similar community structure in a network of human interactions insights into protein-dna interactions through structure network analysis statistical analysis of the indian railway network: a complex network approach social network analysis network analysis in the social sciences prediction in complex systems: the case of the international trade network the dynamics of viral marketing extracting influential nodes on a social network for information diffusion structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review efficient immunization strategies for computer networks and populations a study of epidemic spreading and rumor spreading over complex networks epidemic processes in complex networks unification of theoretical approaches for epidemic spreading on complex networks epidemic spreading in time-varying community networks suppression of epidemic spreading in complex networks by local information based behavioral responses efficient allocation of heterogeneous response times in information spreading process absence of influential spreaders in rumor dynamics a model of spreading of sudden events on social networks daniel bernoulli?s epidemiological model revisited herd immunity: history, theory, practice epidemic disease in england: the evidence of variability and of persistency of type infectious diseases of humans: dynamics and control thermodynamic efficiency of contagions: a statistical mechanical analysis of the sis epidemic model a rumor spreading model based on information 
entropy an algorithmic information calculus for causal discovery and reprogramming systems the hidden geometry of complex, network-driven contagion phenomena extending centrality the h-index of a network node and its relation to degree and coreness identifying influential nodes in complex networks identifying influential nodes in large-scale directed networks: the role of clustering collective dynamics of ?small-world?networks identification of influential spreaders in complex networks ranking spreaders by decomposing complex networks eccentricity and centrality in networks the centrality index of a graph a set of measures of centrality based on betweenness a new status index derived from sociometric analysis mutual enhancement: toward an understanding of the collective preference for shared information factoring and weighting approaches to status scores and clique identification dynamical systems to define centrality in social networks the anatomy of a large-scale hypertextual web search engine leaders in social networks, the delicious case using mapping entropy to identify node centrality in complex networks path diversity improves the identification of influential spreaders how to identify the most powerful node in complex networks? 
a novel entropy centrality approach a novel entropy-based centrality approach for identifying vital nodes in weighted networks node importance ranking of complex networks with entropy variation key node ranking in complex networks: a novel entropy and mutual information-based approach a new method to identify influential nodes based on relative entropy influential nodes ranking in complex networks: an entropy-based approach discovering important nodes through graph entropy the case of enron email database identifying node importance based on information entropy in complex networks ranking influential nodes in complex networks with structural holes ranking influential nodes in social networks based on node position and neighborhood detecting rich-club ordering in complex networks maximizing the spread of influence through a social network efficient influence maximization in social networks identifying sets of key players in a social network a shapley value-based approach to discover influential nodes in social networks identifying a set of influential spreaders in complex networks identifying effective multiple spreaders by coloring complex networks effects of the distance among multiple spreaders on the spreading identifying multiple influential spreaders in term of the distance-based coloring identifying multiple influential spreaders by a heuristic clustering algorithm spin glass approach to the feedback vertex set problem effective spreading from multiple leaders identified by percolation in the susceptible-infected-recovered (sir) model finding influential communities in massive networks community-based influence maximization in social networks under a competitive linear threshold model a community-based algorithm for influence blocking maximization in social networks detecting community structure in complex networks via node similarity community structure detection based on the neighbor node degree information community-based greedy algorithm for mining top-k 
influential nodes in mobile social networks identifying influential nodes in complex networks with community structure an efficient memetic algorithm for influence maximization in social networks efficient algorithms for influence maximization in social networks local structure can identify and quantify influential global spreaders in large scale social networks identifying influential spreaders in complex networks by propagation probability dynamics systematic comparison between methods for the detection of influential spreaders in complex networks vital nodes identification in complex networks sir rumor spreading model in the new media age stochastic sir epidemics in a population with households and schools thresholds for epidemic spreading in networks a novel top-k strategy for influence maximization in complex networks with community structure identifying influential spreaders in complex networks based on kshell hybrid method identifying key nodes based on improved structural holes in complex networks ranking nodes in complex networks based on local structure and improving closeness centrality an efficient algorithm for mining a set of influential spreaders in complex networks the large-scale organization of metabolic networks the koblenz network collection the network data repository with interactive graph analytics and visualization measuring isp topologies with rocketfuel graph evolution: densification and shrinking diameters defining and evaluating network communities based on ground-truth the spread of obesity in a large social network over 32 years identifying the influential nodes via eigen-centrality from the differences and similarities of structure tracking influential individuals in dynamic networks evaluating influential nodes in social networks by local centrality with a coefficient a survey on topological properties, network models and analytical measures in detecting influential nodes in online social networks identifying influential spreaders in 
key: cord-028660-hi35xvni authors: chen, jie; li, yang; zhao, shu; wang, xiangyang; zhang, yanping title: three-way decisions community detection model based on weighted graph representation date: 2020-06-10 journal: rough sets doi: 10.1007/978-3-030-52705-1_11 sha: doc_id: 28660 cord_uid: hi35xvni community detection is of great significance to the study of complex networks. community detection algorithms based on three-way decisions (twd) form a multi-layered community structure by hierarchical clustering and then select a suitable layer as the community detection result. however, this layer usually contains overlapping communities. based on the idea of twd, we define the overlapping part of the communities as the boundary region (bnd), and the non-overlapping part as the positive region (pos) or negative region (neg). how to correctly divide the nodes in the bnd into the pos or neg is a challenge for three-way decisions community detection. the common methods for handling the boundary region are modularity increment and similarity calculation, but these methods exploit only the local features of the network, without considering the information of the already-divided communities or the similarity of the global structure. therefore, in this paper, we propose a method for three-way decisions community detection based on weighted graph representation (wgr-twd). the weighted graph representation (wgr) transforms the global structure into vector representations and makes two nodes in the boundary region more similar by using the frequency with which they appear in the same community as the edge weight.
firstly, the multi-layered community structure is constructed by hierarchical clustering. the target layer is selected according to the extended modularity value of each layer. secondly, all nodes are converted into vectors by wgr. finally, the nodes in the bnd are divided into the pos or neg based on cosine similarity. experiments on real-world networks demonstrate that wgr-twd is effective for community detection compared with state-of-the-art algorithms. nowadays, there are all kinds of complex systems with specific functions in the real world, such as online social systems, medical systems and computer systems. these systems can be abstracted into networks with complex internal structures, called complex networks. research on complex networks has received increasing attention with the development of the internet. community structure [6, 23] is a common feature of complex networks: a network consists of several communities, where the connections between communities are sparse and the connections within a community are dense [10] . mining the community structure of a network is of great significance for understanding the network structure, analyzing network characteristics and predicting network behavior. thus, community detection has become one of the most important issues in the study of complex networks. in recent years, a great deal of research has been devoted to community detection in networks. most community detection methods are used to identify non-overlapping communities (i.e., a node belongs to only one community). the main approaches include graph partitioning and clustering [9, 10, 13] , modularity maximization [1, 20] , information theory [12, 25] and non-negative matrix factorization [16, 27] . the kernighan-lin algorithm [13] is a heuristic graph partitioning method that detects communities by optimizing the edges within and between communities.
the gn algorithm [10] is a representative hierarchical clustering method, which can find communities by removing the links between communities. blondel et al. proposed the louvain algorithm [1] , a well-known optimization method based on modularity, which can handle large-scale networks due to its low time complexity. liu et al. [16] put forward a community detection method using non-negative matrix factorization. zhao et al. [33] introduced the idea of granular computing into network community detection and proposed a community detection method based on clustering granulation. the existing non-overlapping community detection algorithms have made great achievements, but they only use the traditional two-way decisions [29, 30] method (the acceptance or rejection decision) to deal with the overlapping nodes between communities. compared with the two-way decisions method, the three-way decisions theory (twd) [28] adds a non-commitment decision. the main idea of twd is to divide an entity set into three disjoint regions, denoted as the positive region (pos), negative region (neg) and boundary region (bnd), respectively. the pos adopts the acceptance decision, the neg adopts the rejection decision, and the bnd adopts the non-commitment decision (i.e., entities for which a decision cannot be made based on the current information are placed in the bnd). for entities in the bnd, we can mine further information to reach a final partition. the introduction of the non-commitment decision can effectively avoid decision-making errors caused by insufficient information, which is more flexible and closer to the actual situation. how to deal with the boundary region has thus become a key issue for three-way decisions community detection. at present, the commonly used methods for processing the boundary region include modularity increment [20] and similarity calculation [2, 8] .
but these methods take advantage only of the local features of the network, without considering the information of the already-divided communities or the similarity of the global structure. therefore, how to tackle the boundary region effectively is a challenge. in this paper, we propose a three-way decisions community detection model based on weighted graph representation (wgr-twd). the graph representation transforms the global structure of the network into vector representations and makes two boundary-region nodes that appear in the same community more similar through the edge weight. firstly, the multi-layered community structure is constructed by hierarchical clustering. the target layer is selected according to the extended modularity value of each layer. secondly, all nodes are converted into vectors by weighted graph representation. finally, nodes in the boundary region are divided into the positive or negative region based on cosine similarity. thus, non-overlapping community detection is realized. the key contributions of this paper can be summarized as follows: (1) we use weighted graph representation to obtain the global structure information of the network to guide the processing of the boundary region, which yields a better three-way decisions community detection method. (2) based on the knowledge of the communities in the target layer, we make two boundary-region nodes connected by a direct edge more similar by using the frequency of appearing in the same community as the edge weight. the walk sequences are then constructed according to the weights of the edges. finally, the skip-gram model is used to obtain the vector representations of nodes. in this way, the weighted graph representation method is realized. the rest of this paper is organized as follows. we introduce related work in sect. 2. we give a detailed description of our algorithm in sect. 3. experiments on real-world networks are reported in sect. 4. finally, we conclude the paper in sect.
5. hierarchical clustering methods have been widely used in community detection due to the hierarchical nature of network structure. this approach can be divided into two forms: the divisive method and the agglomerative method. the divisive method repeatedly removes the link with the lowest similarity index, while the agglomerative method repeatedly merges the pair of clusters with the highest similarity index. both methods eventually form a dendrogram, and communities are detected by cutting the tree. research on community detection based on hierarchical clustering has received widespread attention from scholars. girvan and newman proposed the gn algorithm [10] , which is a typical divisive method. clauset et al. [5] proposed a community detection algorithm based on data analysis, which is a representative agglomerative method. fortunato et al. [9] presented an algorithm to find community structures based on node information centrality. chen et al. proposed the lcv algorithm [4] , which detects communities by finding local central nodes. zhang et al. [32] introduced a hierarchical community detection algorithm based on partial matrix convergence using random walks. combining hierarchical clustering with granular computing, we introduce an agglomerative method based on variable granularity to build a dendrogram. given an undirected and unweighted graph g = (v, e), where v is the set of nodes and e denotes the set of edges, the set of neighbor nodes of a node v_i is denoted as n(v_i). the formation process of the initial granules is as follows. first, we calculate the local importance of each node in the network. the local importance i(v_i) of a node v_i is defined in terms of the degree d(v_i) of the node, where |·| denotes the number of elements in a set. second, all important nodes are found according to the local importance of nodes: the node v_i is an important node if i(v_i) > 0.
finally, for any important node, an initial granule is composed of all neighbor nodes of the important node together with the important node itself. after all the initial granules are obtained, the hierarchical clustering method based on variable granularity proceeds as follows. the clustering coefficient between two granules is defined using the median function med{·}. the clustering process is as follows. firstly, for all c_i^m, c_j^m ∈ h_m, the clustering coefficient between them is calculated. then the clustering threshold λ_m of the current layer is calculated, and the maximum clustering coefficient is found, denoted f(c_α^m, c_β^m). if f(c_α^m, c_β^m) ≥ λ_m, the two granules c_α^m and c_β^m are merged to form a new granule and the new granule is added to h_{m+1}; otherwise, all the granules in h_m are added to h_{m+1} and h_m is set to empty. for each layer, this clustering process is repeated until all nodes of the network are in a single granule. a dendrogram is thus built. traditional network representation usually uses high-dimensional sparse vectors, which take more running time and computational space in statistical learning. network representation learning (nrl) was proposed to address this problem. nrl aims to learn low-dimensional latent representations of nodes in networks. the learned representations can be used as features of the graph for various graph-based tasks, such as classification, clustering, link prediction, community detection, and visualization. deepwalk [24] is the first influential nrl model of recent years; it adopts the approach of natural language processing by using the skip-gram model [18, 19] to learn representations of nodes in the network. the goal of skip-gram is to maximize the probability of co-occurrence among the words that appear within a window. deepwalk first generates a large number of random walk sequences by sampling from the network.
these walk sequences can be regarded as the sentences of an article, and the nodes as the words in a sentence. skip-gram can then be applied to these walk sequences to acquire the network embedding. deepwalk can express the connectivity of the network well and is efficient when the network is large. to deal effectively with overlapping communities in the target layer, a weighted graph representation approach is proposed. first, a weighted graph is constructed according to the community structure of the target layer. the weight of an edge of the unweighted graph is defined as w_ij = σ_ij / n_c, where σ_ij is the number of communities in which nodes v_i and v_j appear together and n_c is the total number of communities in the target layer. after that, an improved deepwalk (idw) model is used to acquire the vector representations of all nodes in the graph. unlike deepwalk, the idw model constructs the walk sequences according to the weights of the edges: the greater the weight, the higher the walk probability. assume that the current walk node is v_i ; if v_j ∈ n(v_i), then the walk probability from node v_i to node v_j is proportional to w_ij, i.e., p(v_j | v_i) = w_ij / Σ_{v_k ∈ n(v_i)} w_ik. after obtaining all the walk sequences, the skip-gram model is used to learn the vector representations of nodes from the walk sequences. the objective function of idw is defined over r(v_i), the vector representation of node v_i, with window size ω, the maximum distance between the current and predicted node within a walk sequence. thus, the vector representations of all nodes in the network are obtained. we present the proposed wgr-twd algorithm in this section. figure 1 shows the overall framework of the proposed algorithm. our algorithm consists of two parts: the construction of the multi-layered community structure and boundary region processing.
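the weight-proportional walk step of idw described above can be sketched as follows; the `weights` dictionary, function names and edge-weight values here are illustrative, not the paper's:

```python
import random

def weighted_step(weights, v):
    """Sample the next node among the neighbours of v, with probability
    proportional to the edge weight (higher weight -> higher walk chance)."""
    nbrs = list(weights[v])
    total = sum(weights[v][u] for u in nbrs)
    r, acc = random.random() * total, 0.0
    for u in nbrs:
        acc += weights[v][u]
        if r <= acc:
            return u
    return nbrs[-1]

def weighted_walk(weights, start, length):
    """Generate one walk sequence; such sequences are then fed to skip-gram."""
    seq = [start]
    for _ in range(length - 1):
        seq.append(weighted_step(weights, seq[-1]))
    return seq

# illustrative weighted graph: edge (0,1) is much "heavier" than (0,2)
weights = {0: {1: 0.9, 2: 0.1}, 1: {0: 1.0}, 2: {0: 1.0}}
random.seed(0)
print(weighted_walk(weights, 0, 5))
```

in idw the weights would come from the community co-membership frequencies w_ij; here they are hard-coded to keep the sketch self-contained.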
in the first part, we employ the hierarchical clustering method based on variable granularity to construct a multi-layered community structure according to sect. 2.1. some overlapping communities exist in the multi-layered community structure because of the clustering mechanism, so we use the extended modularity (eq) [26] to measure the partition quality of each layer. it is defined as eq = (1/(2m)) Σ_i Σ_{u,v ∈ c_i} (1/(o_u o_v)) [a_uv − d_u d_v/(2m)], where m is the number of edges in the network, c_i represents a community, o_u is the number of communities that node u belongs to, a_uv is the corresponding element of the adjacency matrix, and d_u is the degree of node u. a larger eq value means a better overlapping community division. thus, we select the layer corresponding to the largest eq value as the target layer. the second part introduces the method for dealing with overlapping communities in the target layer. since there are overlapping communities in the target layer, we need to divide the target layer further to achieve non-overlapping community detection. therefore, the three-way decisions theory (twd) is introduced to handle overlapping communities. based on the idea of twd, we define the overlapping part of the communities as the boundary region (bnd), and the non-overlapping part as the positive region (pos) or negative region (neg). our goal is to process the nodes in the bnd. first of all, we adopt the weighted graph representation method to learn the vector representations of all nodes in the network. after that, the nodes in the bnd are divided into the pos or neg by using cosine similarity.
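the extended modularity used above to select the target layer can be sketched directly from the quantities m, o_u, a_uv and d_u; the data structures and the toy graph below are our own illustration:

```python
def extended_modularity(adj, communities):
    """EQ for a set of possibly overlapping communities.
    adj: dict node -> set of neighbours (undirected, unweighted);
    communities: list of sets of nodes."""
    m = sum(len(nb) for nb in adj.values()) / 2              # number of edges
    deg = {v: len(adj[v]) for v in adj}                      # d_u
    o = {v: sum(v in c for c in communities) for v in adj}   # o_u memberships
    eq = 0.0
    for c in communities:
        for u in c:
            for v in c:
                a_uv = 1.0 if v in adj[u] else 0.0           # adjacency entry
                eq += (a_uv - deg[u] * deg[v] / (2 * m)) / (o[u] * o[v])
    return eq / (2 * m)

# two triangles joined by one bridge edge, split into their natural communities
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(extended_modularity(adj, [{0, 1, 2}, {3, 4, 5}]))  # ~0.357
```

with non-overlapping communities (all o_u = 1), eq reduces to the usual modularity q.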
suppose the vector of node u is u = (x_1 , x_2 , ..., x_n) and that of node v is v = (y_1 , y_2 , ..., y_n); then the cosine similarity is defined as cos(u, v) = (Σ_{i=1}^{n} x_i y_i) / (√(Σ_{i=1}^{n} x_i^2) · √(Σ_{i=1}^{n} y_i^2)). for an arbitrary node v_i in the bnd, we find all communities containing node v_i in the target layer, calculate the average cosine similarity between node v_i and the non-overlapping nodes of each community as the similarity between node v_i and that community, then join node v_i to the community corresponding to the maximum similarity and update the community structure of the target layer. this operation is repeated until all nodes in the bnd are processed. the wgr-twd algorithm is described in algorithm 1. we test the performance of our method on eight real-world datasets; each dataset is described as follows, and the main information of these datasets is shown in table 1 . zachary's karate club [31] : a social network of friendships between 34 members of a karate club at a us university in the 1970s. dolphin social network [17] : an undirected social network of frequent associations between 62 dolphins in a community living off doubtful sound, new zealand. books about us politics [22] : a network of books about us politics published around the time of the 2004 presidential election and sold by the online bookseller amazon.com; edges between books represent frequent co-purchasing of books by the same buyers. american college football [10] : a network of american football games between division ia colleges in 2000. email communication network [21] : a complex network representing the email communications of a university; the network was compiled by alexandre arenas. facebook [15] : a network collected from survey participants using a facebook app. geom [11] : the author collaboration network in computational geometry. collaboration [14] :
the network is from the e-print arxiv and covers scientific collaborations between authors of papers submitted to the high energy physics theory category. in this paper, two representative algorithms are chosen for comparison with the proposed wgr-twd, as shown below: -modularity increment (mi) [3] : a hierarchical clustering method based on variable granularity, in which the overlapping nodes between communities are divided according to modularity optimization. -deepwalk [24] : a network representation learning method, used here to handle the overlapping communities in the target layer. we employ two widely used criteria to evaluate the performance of community detection algorithms. the first index is modularity (q) [5] , which is often used when the real community structure is not known. q is defined as q = (1/(2m)) Σ_{i,j} [a_ij − d_i d_j/(2m)] δ(c_i , c_j), where m is the number of edges in the network, a is the adjacency matrix, d_i is the degree of node i, c_i represents the community to which node i belongs, and δ(c_i , c_j) = 1 when c_i = c_j , else δ(c_i , c_j) = 0. the higher the modularity value, the better the result of community detection. the other index is normalized mutual information (nmi) [7] , defined as nmi = −2 Σ_{i=1}^{c_a} Σ_{j=1}^{c_b} c_ij log(c_ij n / (c_i. c_.j)) / (Σ_{i=1}^{c_a} c_i. log(c_i. /n) + Σ_{j=1}^{c_b} c_.j log(c_.j /n)), where c_a (c_b) denotes the number of communities in partition a (b), c_ij is the number of nodes shared by community i in partition a and community j in partition b, c_i. (c_.j) represents the sum of the elements of matrix c in row i (column j), and n is the number of nodes in the network. a higher value of nmi indicates that the detected community structure is closer to the real community structure. on the networks with a known real partition (the first four small networks), we use both indicators (q and nmi) to evaluate our algorithm. table 2 presents the community detection results of the proposed algorithm and the baseline algorithms on networks with a known real partition.
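before turning to the results, the nmi index defined above can be sketched from the confusion matrix c; the function names and toy partitions are our own:

```python
import math

def nmi(part_a, part_b):
    """Normalized mutual information between two partitions,
    each given as a list of disjoint communities (sets of nodes)."""
    n = sum(len(c) for c in part_a)                          # number of nodes
    C = [[len(ca & cb) for cb in part_b] for ca in part_a]   # confusion matrix
    row = [sum(r) for r in C]                                # c_i.
    col = [sum(C[i][j] for i in range(len(part_a)))
           for j in range(len(part_b))]                      # c_.j
    num = 0.0
    for i, r in enumerate(C):
        for j, cij in enumerate(r):
            if cij:
                num += cij * math.log(cij * n / (row[i] * col[j]))
    den = sum(r * math.log(r / n) for r in row if r) \
        + sum(c * math.log(c / n) for c in col if c)
    return -2 * num / den

# identical partitions give the maximum value
print(nmi([{0, 1}, {2, 3}], [{0, 1}, {2, 3}]))  # -> 1.0
```

a partition that is statistically independent of the reference (e.g., each detected community taking half its nodes from each true community) scores 0.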
we can see that our method outperforms the baseline algorithms. to further verify the effectiveness of the proposed algorithm, the mi method is used to deal with each layer in the multi-layered community structure, and we select the layer corresponding to the maximum q value as the target layer. the experimental results are shown in table 3 . compared with table 2, table 3 yields higher q values. taking tables 2 and 3 together, our method obtains better community detection results than the baseline methods. we also conducted experiments on four large networks. on these networks, the real partition is unknown; therefore, we only use modularity to evaluate the performance of the different methods. table 4 shows the community detection results of the proposed method and the baseline methods. on the first three networks, our method obtains better results than the two baseline methods. on the collaboration dataset, the mi method achieves the best performance, slightly higher than our method. the main reason is that the collaboration network is very sparse, which leads to poor vector representations of the nodes. in conclusion, the proposed method effectively addresses the problem of non-overlapping community detection in networks. in this paper, we have proposed a method for three-way decisions community detection based on weighted graph representation. the target layer in the multi-layered community structure is selected according to the extended modularity value of each layer. for the overlapping communities in the target layer, the weighted graph representation transforms the global structure into vector representations and makes two nodes in the boundary region more similar by using the frequency of appearing in the same community as the edge weight. finally, the nodes in the boundary region are divided according to cosine similarity. experiments on real-world networks demonstrate that the proposed method is effective for community detection in networks.
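the boundary-region step used throughout (assigning a bnd node to the candidate community whose non-overlapping members are, on average, most cosine-similar to it) can be sketched as follows; the 2-d embeddings and names below are illustrative, not learned vectors:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u))
                  * math.sqrt(sum(b * b for b in v)))

def assign_bnd_node(node, candidates, vecs):
    """Join a BND node to the candidate community whose (non-overlapping)
    members have the highest average cosine similarity to it."""
    best, best_sim = None, -2.0
    for idx, members in enumerate(candidates):
        sim = sum(cosine(vecs[node], vecs[m]) for m in members) / len(members)
        if sim > best_sim:
            best, best_sim = idx, sim
    return best

# illustrative embeddings: node "x" is far closer to community 0's members
vecs = {"x": (1.0, 0.0), "a": (1.0, 0.1), "b": (0.9, 0.2), "c": (0.1, 1.0)}
print(assign_bnd_node("x", [["a", "b"], ["c"]], vecs))  # -> 0
```

in wgr-twd the vectors would come from the weighted graph representation, and the chosen community would then be updated before processing the next bnd node.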
references:
fast unfolding of communities in large networks
three-way decision community detection algorithm based on local group information
vghc: a variable granularity hierarchical clustering for community detection
a method for local community detection by finding maximal-degree nodes
finding community structure in very large networks
detecting community structure via the maximal sub-graphs and belonging degrees in complex networks
comparing community structure identification
three-way decision based on non-overlapping community division
method to find community structures based on information centrality
community structure in social and biological networks
complex network clustering by multiobjective discrete particle swarm optimization based on decomposition
information limits for recovering a hidden community
an efficient heuristic procedure for partitioning graphs
graph evolution: densification and shrinking diameters
learning to discover social circles in ego networks
semi-supervised community detection based on non-negative matrix factorization with node popularity
the emergent properties of a dolphin social network
efficient estimation of word representations in vector space
distributed representations of words and phrases and their compositionality
fast algorithm for detecting community structure in networks
finding community structure in networks using the eigenvectors of matrices
modularity and community structure in networks
finding and evaluating community structure in networks
deepwalk: online learning of social representations
an information-theoretic framework for resolving community structure in complex networks
detect overlapping and hierarchical community structure in networks
nonnegative matrix factorization with mixed hypergraph regularization for community detection
three-way decision: an interpretation of rules in rough set theory
three-way decisions with probabilistic rough sets
two semantic issues in a probabilistic rough set model
an information flow model for conflict and fission in small groups
hierarchical community detection based on partial matrix convergence using random walks
community detection algorithm based on clustering granulation
acknowledgments. this work is supported by the national natural science foundation of china (grant number 61876001) and the major program of the national social science foundation of china (grant no. 18zda032). key: cord-248848-p7jv79ae authors: lee, kookjin; parish, eric j. title: parameterized neural ordinary differential equations: applications to computational physics problems date: 2020-10-28 journal: nan doi: nan sha: doc_id: 248848 cord_uid: p7jv79ae this work proposes an extension of neural ordinary differential equations (nodes) that introduces an additional set of ode input parameters to nodes. this extension allows nodes to learn multiple dynamics specified by the input parameter instances. our extension is inspired by the concept of parameterized ordinary differential equations, which are widely investigated in computational science and engineering contexts, where characteristics of the governing equations vary over the input parameters. we apply the proposed parameterized nodes (pnodes) to learning latent dynamics of complex dynamical processes that arise in computational physics, an essential component for enabling rapid numerical simulations for time-critical physics applications. for this, we propose an encoder-decoder-type framework, which models latent dynamics as pnodes. we demonstrate the effectiveness of pnodes on important benchmark problems from computational physics. numerical simulations of dynamical systems described by systems of ordinary differential equations (odes) play essential roles in various engineering and applied science applications. such examples include predicting input/output responses, design, and optimization [55] .
these odes and their solutions often depend on a set of input parameters, and such odes are denoted as parameterized odes. examples of such input parameters within the context of fluid dynamics include reynolds number and mach number. in many important scenarios, high-fidelity solutions of parameterized odes are required to be computed i) for many different input parameter instances (i.e., many-query scenario) or ii) in real time on a new input parameter instance. a single run of a high-fidelity simulation, however, often requires fine spatiotemporal resolutions. consequently, performing real-time or multiple runs of a high-fidelity simulation can be computationally prohibitive. to mitigate this computational burden, many model-order reduction approaches have been proposed to replace costly high-fidelity simulations. the common goal of these approaches is to build a reduced-dynamical model with lower complexity than that of the high-fidelity model, and to use the reduced model to compute approximate solutions for any new input parameter instance. in general, model-order reduction approaches consist of two components: i) a low-dimensional latent-dynamics model, where the computational complexity is very low, and ii) a (non)linear mapping that constructs high-dimensional approximate states (i.e., solutions) from the low-dimensional states obtained from the latent-dynamics model. in many studies, such models are constructed via data-driven techniques in the following steps: i) collect solutions of high-fidelity simulations for a set of training parameter instances, ii) build a parameterized surrogate model, and iii) fit the model by training with the data collected from the step i). in the field of deep-learning, similar efforts have been made for learning latent dynamics of various physical processes [33, 9, 47, 61, 20] . 
neural ordinary differential equations (nodes), a method for learning time-continuous dynamics in the form of a system of ordinary differential equations from data, comprise a particularly promising approach for learning latent dynamics of dynamical systems. nodes have been studied in [71, 9, 26, 62, 42, 12, 22] , and this body of work has demonstrated their ability to successfully learn latent dynamics and to be applied to downstream tasks [9, 61] . because nodes learn latent dynamics in the form of odes, they are a natural fit as a latent-dynamics model in reduced-order modeling of physical processes and have been applied to several computational physics problems, including turbulence modeling [53, 46] and future-state prediction in fluid problems [2] . as pointed out in [16, 8] , however, nodes learn a single set of network weights, which fits a given training data set best. this results in an node model with limited expressibility and often leads to unnecessarily complex dynamics [18] . to overcome this shortcoming, we propose to extend nodes with a set of input parameters that specify the dynamics of the node model, which leads to parameterized nodes (pnodes). with this simple extension, pnodes can represent multiple trajectories such that the dynamics of each trajectory are characterized by the input parameter instance.
the main contributions of this paper are • an extension to nodes that enables them to learn multiple trajectories with a single set of network weights; even for the same initial condition, the dynamics can be different for different input parameter instances, • a framework for learning latent dynamics of parameterized odes arising in computational physics problems, • a demonstration of the effectiveness of the proposed framework on advection-dominated benchmark problems, a class of problems where classical linear latent-dynamics learning methods (e.g., principal component analysis) often fail to learn accurately [40] . classical reduced-order modeling. classical reduced-order modeling (rom) techniques rely heavily on linear methods such as the proper orthogonal decomposition (pod) [30] , which is analogous to principal component analysis [31] , for constructing the mappings between a high-dimensional space and a low-dimensional space. these roms then identify the latent-dynamics model by executing a (linear) projection process on the high-dimensional equations, e.g., galerkin projection [30] or least-squares petrov-galerkin projection [7, 6] . we refer readers to [3, 4, 48] for a complete survey of classical methods. physics-aware deep-learning-based reduced-order modeling. recent work has extended classical roms by replacing proper orthogonal decomposition with nonlinear dimension reduction techniques emerging from deep learning [34, 27, 20, 40, 41, 36] . these approaches operate by identifying a nonlinear mapping (via, e.g., convolutional autoencoders) and subsequently identifying the latent dynamics as certain residual minimization problems [20, 40, 41, 36] , which are defined on the latent space and are derived from the governing equations. in [34, 27] , the latent dynamics are identified by simply projecting the governing equations using the encoder, which may lead to kinematically inconsistent dynamics.
another class of physics-aware methods includes explicitly modeling time-integration schemes [51, 63, 73, 21] , adaptive basis selection [60] , and adding stability/structure-preserving constraints to the latent dynamics [17, 28] . we emphasize that our approach is closely related to [63] , where neural networks are trained to approximate the action of a first-order time-integration scheme applied to latent dynamics and, at each time step, the neural networks take a set of problem-specific parameters as well as the reduced state as input. thus, our approach can be seen as a time-continuous generalization of the approach in [63] . purely data-driven deep-learning-based reduced-order modeling. another approach to developing deep-learning-based roms is to learn both the nonlinear mappings and the latent dynamics in purely data-driven ways. latent dynamics are modeled as recurrent neural networks with long short-term memory (lstm) units along with linear pod mappings [70, 56, 46] or nonlinear mappings constructed via (convolutional) autoencoders [23, 72, 45, 66] . in [53, 46] , latent dynamics and nonlinear mappings are modeled as neural odes and autoencoders, respectively; in [49, 43, 65, 47] , autoencoders are used to learn approximate invariant subspaces of the koopman operator. relatedly, there have been studies on learning direct mappings, via, e.g., a neural network, from problem-specific parameters to either latent states or approximate solution states [64, 19, 58, 10, 35, 69] , where the latent states are computed by using an autoencoder or linear pod. enhancing node. augmented nodes [16] extend nodes by augmenting the hidden state variables with additional state variables, which allows nodes to learn dynamics using the additional dimensions and, consequently, to have increased expressibility.
anode [22] discretizes the integration range into a fixed number of steps (i.e., checkpoints) to mitigate numerical instability in the backward pass of nodes; aca [76] further extends this approach by adopting an adaptive step-size solver in the backward pass. anodev2 [75] proposes a coupled system of neural odes, in which both the hidden state variables and the network weights are allowed to evolve over time and their dynamics are approximated as neural networks. neural optimal control [8] formulates an node model as a controlled dynamical system and infers the optimal control via an encoder network; this formulation results in an node that adjusts the dynamics for different input data. moreover, improved training strategies for nodes have been studied in [18] , and an extension using spectral elements in the discretization of nodes has been proposed in [54] . neural odes (nodes) are a family of deep neural network models that parameterize the time-continuous dynamics of hidden states using a system of odes: dz(t)/dt = f_θ(z(t), t), (3.1) where z(t) is a time-continuous representation of a hidden state, f_θ is a parameterized velocity function, which defines the dynamics of the hidden states over time, and θ is a set of neural network weights. given the initial condition z(0) (i.e., the input), the hidden state at any time index, z(t), can be obtained by solving the initial value problem (ivp) (3.1). to solve the ivp, a black-box differential equation solver can be employed and the hidden states can be computed with the desired accuracy: z(t_1), . . . , z(t_N) = odesolve(z(0), f_θ, t_1, . . . , t_N). in the backward pass, as proposed in [9] , gradients are computed by solving another system of odes, derived using the adjoint sensitivity method [52] , which allows memory-efficient training of the node model. as pointed out in [16, 8] , an node model learns a single dynamics for the entire data distribution and thus results in a model with limited expressivity. to resolve this, we propose a simple but powerful extension of neural odes.
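before introducing the extension, the plain node forward pass of eq. (3.1) can be illustrated with a toy hand-written velocity standing in for a trained network and a fixed-step rk4 integrator standing in for a black-box adaptive solver; all names below are ours:

```python
def f_theta(z, t, theta):
    """Toy 'network': linear decay dynamics dz/dt = -theta * z."""
    return [-theta * zi for zi in z]

def odesolve(f, z0, t0, t1, theta, steps=100):
    """Fixed-step classical RK4 integration of dz/dt = f(z, t, theta)."""
    h, z, t = (t1 - t0) / steps, list(z0), t0
    for _ in range(steps):
        k1 = f(z, t, theta)
        k2 = f([zi + h / 2 * ki for zi, ki in zip(z, k1)], t + h / 2, theta)
        k3 = f([zi + h / 2 * ki for zi, ki in zip(z, k2)], t + h / 2, theta)
        k4 = f([zi + h * ki for zi, ki in zip(z, k3)], t + h, theta)
        z = [zi + h / 6 * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        t += h
    return z

z1 = odesolve(f_theta, [1.0], 0.0, 1.0, theta=1.0)
print(z1)  # ~ exp(-1) ≈ 0.3679 for dz/dt = -z
```

in an actual node, f_theta would be a neural network and the backward pass would reuse an ode solver on the adjoint system rather than backpropagating through the solver steps.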
we refer to this extension as "parameterized neural odes" (pnodes):

dz(t; µ)/dt = f_θ(z(t; µ), t; µ), z(0; µ) = z_0(µ),

with a parameterized initial condition z_0(µ), where µ = [µ_1, ..., µ_nµ] ∈ D ⊂ R^nµ denotes problem-specific input parameters. inspired by the concept of "parameterized odes", where the odes depend on input parameters, this simple extension allows an node to have multiple latent trajectories that depend on the input parameters. the extension requires only minimal modifications to the definition of the velocity function f_θ and can be trained/deployed using the same mathematical machinery developed for nodes in the forward pass (i.e., via a black-box ode solver) and the backward pass (i.e., via the adjoint sensitivity method). in practice, f_θ is approximated by a neural network which takes z and µ as input and produces dz/dt as output.

we now investigate pnodes within the context of performing model reduction of computational physics problems. we start by formally introducing the full-order model that we seek to reduce. we then describe our proposed framework, which uses pnodes (or nodes) as the reduced-order (latent-dynamics) model.

the full-order model (fom) corresponds to a parameterized system of ordinary differential equations (odes):

du/dt = f(u, t; µ), u(0; µ) = u_0(µ), (5.1)

where u(t; µ), u: [0, T] × D → R^N, denotes the state, which is implicitly defined as the solution to the system of odes. here, µ ∈ D denotes the ode parameters that characterize physical properties (e.g., boundary conditions, forcing terms), D ⊂ R^nµ denotes the parameter space, where n_µ is the number of parameters, and T denotes the final time. the initial state is specified by the parameterized initial condition u_0(µ), u_0: D → R^N; f: R^N × [0, T] × D → R^N denotes the velocity, and du/dt denotes the differentiation of u with respect to time t. solving (5.1) requires the application of an ode solver, whose computational complexity grows rapidly with the number of degrees of freedom N (e.g., N ~ 10^7 for many practical problems in computational physics).
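the pnode extension above amounts to letting the velocity function take µ as an extra input, so that one initial state can yield a different latent trajectory per parameter instance. a minimal sketch, with a placeholder velocity (µ simply scales a linear decay; not a trained network) and a forward-euler integrator standing in for the paper's adaptive solver:

```python
import math

def pnode_solve(f, z0, mu, t1, n_steps=1000):
    """Integrate dz/dt = f_theta(z, t; mu) with forward Euler (sketch only;
    the paper uses an adaptive Dormand-Prince solver)."""
    z, h = float(z0), t1 / n_steps
    for k in range(n_steps):
        z += h * f(z, k * h, mu)
    return z

# placeholder velocity: the parameter mu changes the dynamics, mimicking
# how a PNODE represents a family of latent trajectories from one initial state
f_theta = lambda z, t, mu: mu * z

z_a = pnode_solve(f_theta, 1.0, -1.0, 1.0)  # trajectory for mu = -1
z_b = pnode_solve(f_theta, 1.0, -2.0, 1.0)  # trajectory for mu = -2
```

a plain node would have to return the same trajectory for both calls because its velocity function cannot see µ; here the two endpoints differ even though z(0) is identical, which is exactly the property the framework later exploits.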
reduced-order modeling mitigates the high cost associated with solving the fom by operating on reduced computational models that comprise i) a (non)linear mapping that constructs high-dimensional states from reduced states and ii) a low-dimensional latent-dynamics model for the reduced states. we denote the mapping from the reduced states to the high-dimensional states as D: R^p → R^N, and denote the latent-dynamics model as the system of parameterized odes:

dû/dt = f̂(û, t; µ), û(0; µ) = û_0(µ), (5.2)

where û(t; µ), û: [0, T] × D → R^p, denotes the reduced state, which is a low-dimensional representative of the high-dimensional state (i.e., p ≪ N). analogously, û_0(µ), û_0: D → R^p, denotes the reduced parameterized initial condition, and f̂(û, t; µ), f̂: R^p × [0, T] × D → R^p, denotes the reduced velocity. the objective of the rom is to learn both a nonlinear mapping and a latent-dynamics model such that the rom generates accurate approximations to the full-order model solution, i.e., D(û) ≈ u.

the aim here is to learn the latent dynamics with the pnode: find a set of pnode parameters θ such that dû/dt = f̂_θ(û, t; µ, θ), where f̂_θ(·, ·; ·, θ): R^p × [0, T] × D → R^p denotes the reduced velocity, i.e., the rom (eq. (5.2)) is modeled as a pnode. to achieve this goal, we propose a framework where, besides the latent-dynamics model described by the pnode, two additional functions are required: i) an encoder, which maps a high-dimensional initial state u_0(µ) to a reduced initial state û_0(µ), and ii) a decoder, which maps a set of reduced states û_k, k = 1, ..., n_t, to a set of high-dimensional approximate states ũ_k, k = 1, ..., n_t.

figure 1: the forward pass of the proposed framework: i) the encoder (red arrow), which provides a reduced initial state to the pnode; ii) solving the pnode (or node) with this initial state, which results in a set of reduced states; and iii) the decoder (blue arrows), which maps the reduced states to high-dimensional approximate states.
we approximate these functions with two neural networks, h_enc(·; θ_enc) and h_dec(·; θ_dec). with these neural networks defined, the forward pass of the framework can be described as:

1. encode a reduced initial state from the given initial condition: û_0(µ) = h_enc(u_0(µ); θ_enc),
2. solve the system of odes defined by the pnode (or node) to obtain the reduced states û_k, k = 1, ..., n_t,
3. decode the set of reduced states to a set of high-dimensional approximate states: ũ_k = h_dec(û_k; θ_dec), k = 1, ..., n_t, and
4. compute a loss function L(ũ_1, ..., ũ_{n_t}, u_1, ..., u_{n_t}).

figure 1 illustrates the computational graph of the forward pass in the proposed framework. we emphasize that the proposed framework takes only the initial states from the training data and the problem-specific ode parameters µ as input. pnodes can still learn multiple trajectories, which are characterized by the ode parameters, even if the same initial state is given for different ode parameters, which is not achievable with nodes. furthermore, the proposed framework is significantly simpler than the common neural network settings for nodes when they are used to learn latent dynamics, namely the sequence-to-sequence architectures in [9, 61, 74, 45, 46], which require that a (part of a) sequence be fed into the encoder network to produce a context vector, which is then fed into the node decoder network as an initial condition.

in the following, we apply the proposed framework to learning the latent dynamics of parameterized systems from computational physics problems. we then demonstrate the effectiveness of the proposed framework with results of numerical experiments performed on these benchmark problems. to train the proposed framework, we collect snapshots of reference solutions by solving the fom for pre-specified training parameter instances µ ∈ D_train ≡ {µ_k^train}_{k=1}^{n_train} ⊂ D. this collection results in a (third-order) snapshot tensor, where n_t is the number of time steps.
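the four steps above can be sketched end-to-end. everything below is a placeholder (a hand-coded "encoder" that averages the state, a linear µ-dependent latent velocity, a trivial "decoder", forward euler on a uniform grid); the real framework uses convolutional networks and an adaptive solver, but the data flow is the same: u_0 and µ in, a trajectory of approximate states and a scalar loss out.

```python
def forward_pass(u0, mu, ts):
    """Sketch of the framework's forward pass with placeholder networks."""
    p = 2
    z = [sum(u0) / len(u0)] * p                       # 1. encode: h_enc(u0) -> z0 (placeholder)
    f_theta = lambda z, t, mu: [mu * zi for zi in z]  # PNODE velocity, takes mu as input
    h = ts[1] - ts[0]                                 # assumes a uniform time grid
    zs = []
    for t in ts[1:]:                                  # 2. solve the latent ODE (forward Euler)
        z = [zi + h * dzi for zi, dzi in zip(z, f_theta(z, t, mu))]
        zs.append(z)
    decode = lambda z: [z[k % p] for k in range(len(u0))]  # 3. decode (placeholder)
    return [decode(z) for z in zs]

def mse_loss(approx, ref):
    """4. mean-squared-error loss over all snapshots."""
    n = sum(len(u) for u in ref)
    return sum((a - r) ** 2 for ua, ur in zip(approx, ref)
               for a, r in zip(ua, ur)) / n

u0 = [1.0, 2.0, 3.0, 4.0]
ts = [0.1 * k for k in range(11)]
approx = forward_pass(u0, -0.5, ts)
loss = mse_loss(approx, approx)
```

note that only u_0 and µ enter the forward pass; the remaining reference snapshots appear only inside the loss, mirroring the training setup described in the text.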
the mode-2 unfolding [38] of the solution tensor gives the matrix U(µ_k^train), which consists of the fom solution snapshots for µ_k^train and whose first column corresponds to the initial condition u_0(µ_k^train). among the collected solution snapshots, only the first columns of U(µ_k^train), k = 1, ..., n_train (i.e., the initial conditions) are fed into the framework; the rest of the solution snapshots are used in computing the loss function. assuming the fom arises from a spatially discretized partial differential equation, the total number of degrees of freedom can be written as N = n_u × n_1 × ... × n_{n_d}, where n_u is the number of different types of solution variables (e.g., chemical species), n_i is the number of grid points in the i-th spatial dimension, and n_d is the number of spatial dimensions of the partial differential equation. note that this spatially-distributed data representation is analogous to multi-channel images (i.e., n_u corresponds to the number of channels); as such, we utilize (transposed) convolutional layers [39, 24] in our encoder and decoder.

in the experiments, we employ convolutional encoders and transposed-convolutional decoders. the encoder consists of four convolutional layers followed by one fully-connected layer, and the decoder consists of one fully-connected layer followed by four transposed-convolutional layers. to decrease/increase the spatial dimension, we employ strides larger than one, but we do not use pooling layers. for the nonlinear activation functions, we use elu [13] after each (transposed-)convolutional layer and fully-connected layer, with the exception of the output layer (i.e., no activation at the output layer). moreover, we employ node and pnode for learning the latent dynamics and model f̂_θ with fully-connected layers; for the nonlinear activation functions, we again use elu. lastly, for odesolve, we use the dormand-prince method [15], which is provided in the software package of [9]. our implementation reuses many parts of the software package used in [61], which is written in pytorch.
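a small sketch of the unfolding step: assuming the snapshot tensor is stored as tensor[i][j][k] = value of state entry i at time step j for training parameter k (the storage convention here is an assumption for illustration), extracting the per-parameter snapshot matrix is just a slice, and its first column is the initial condition that gets fed to the encoder.

```python
def snapshot_matrix(tensor, k):
    """Return the N x n_t snapshot matrix U(mu_k) for training parameter k.

    tensor[i][j][k] holds state entry i at time step j for parameter k;
    the first column of the result is the initial condition u0(mu_k)."""
    n_t = len(tensor[0])
    return [[tensor[i][j][k] for j in range(n_t)]
            for i in range(len(tensor))]

# toy tensor with N = 2, n_t = 3, n_train = 2
tensor = [[[100 * i + 10 * j + k for k in range(2)]
           for j in range(3)] for i in range(2)]
U1 = snapshot_matrix(tensor, 1)
initial_condition = [row[0] for row in U1]
```

only `initial_condition` enters the forward pass; the remaining columns of `U1` serve as the reference snapshots in the loss.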
the details of the configurations will be presented for each of the benchmark problems. for training, we set the loss function to the mean squared error and optimize the network weights (θ, θ_enc, θ_dec) using adamax, a variant of adam [37], with an initial learning rate of 1e-2. at each epoch of training, the loss function is evaluated on the validation set, and the best-performing network weights on the validation set are chosen to evaluate the model on the test data. the details of the training/validation/testing split will be given in the description of each experiment. for the performance evaluation metric, we measure the errors of the approximate solutions with respect to the reference solutions for the testing parameter instances µ_k^test. we use the relative ℓ2-norm of the error:

error = ||ũ(µ_k^test) − u(µ_k^test)||_F / ||u(µ_k^test)||_F, (6.1)

where ||·||_F denotes the frobenius norm.

the first benchmark problem is a parameterized one-dimensional inviscid burgers' equation, which models simplified nonlinear fluid dynamics and demonstrates the propagation of a shock. the governing partial differential equation is

∂w(x, t; µ)/∂t + ∂(0.5 w(x, t; µ)^2)/∂x = 0.02 e^{µ_2 x}, x ∈ [0, 100], t ∈ [0, 35]. (6.2)

the boundary condition w(0, t; µ) = µ_1 is imposed on the left boundary (x = 0) and the initial condition is set to w(x, 0; µ) = 1. thus, the problem is characterized by a single variable w (i.e., n_u = 1) and the two parameters (µ_1, µ_2) (i.e., n_µ = 2), which correspond to the dirichlet boundary condition at x = 0 and the forcing term, respectively. following [59], discretizing eq. 6.2 with godunov's scheme with 256 control volumes results in a system of parameterized odes (the fom) with N = n_u n_1 = 256. we then solve the fom using the backward-euler scheme with a uniform time step ∆t = 0.07, which results in n_t = 500 for each µ_k^train ∈ D_train. figure 2 depicts snapshots of reference solutions for the parameter instance (µ_1, µ_2) = (4.25, 0.015) at times t = {7.77, 11.7, 19.5, 23.3, 27.2, 35.0}, illustrating the discontinuity (shock) moving from left to right as time proceeds.
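the error metric in eq. (6.1) is straightforward to compute over snapshot matrices; a minimal sketch, where each argument is an N x n_t matrix stored as a list of rows:

```python
import math

def relative_error(approx, ref):
    """Relative error of eq. (6.1): ||U_approx - U_ref||_F / ||U_ref||_F."""
    num = math.sqrt(sum((a - r) ** 2
                        for ra, rr in zip(approx, ref)
                        for a, r in zip(ra, rr)))
    den = math.sqrt(sum(v ** 2 for row in ref for v in row))
    return num / den
```

for example, an approximation that doubles a single nonzero reference entry has a relative error of exactly 1.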
for the numerical experiments in this subsection, we use the network described in table 1.

6.3.1. reconstruction: approximating a single trajectory with latent-dynamics learning. in this experiment, we consider a single training/testing parameter instance µ_1^train = µ_1^test = (µ_1^1, µ_2^1) = (4.25, 0.015) and test both node and pnode for latent-dynamics modeling. we set the reduced dimension to p = 5 and the maximum number of epochs to 20,000. figure 3 depicts snapshots of the reference solutions and the approximate solutions computed using the framework with node (figure 3a) and with pnode (figure 3b).

we now consider two multi-parameter scenarios. in the first scenario (scenario 1), we vary the first parameter (the boundary condition) and consider 4 training parameter instances, 2 validation parameter instances, and 2 test parameter instances; the parameter instances are collected as shown in figure 4a. we train the framework with node and pnode with the same set of hyperparameters, with the maximum number of epochs set to 50,000. again, the reduced dimension is set to p = 5. figures 5a-5b depict snapshots of reference solutions and approximate solutions using node and pnode. both node and pnode learn the boundary condition (i.e., 4.67 at x = 0) accurately. for node, this is only because the testing boundary condition lies linearly in the middle of the two validation boundary conditions (and also in the middle of the four training boundary conditions): minimizing the mean squared error leads the node to learn a single trajectory whose boundary condition is exactly the middle of the two validation boundary conditions, 4.389 and 4.944. moreover, as node learns a single trajectory that minimizes the mse, it actually fails to learn the correct dynamics and produces poor approximate solutions as time proceeds. as opposed to node, pnode accurately approximates the solutions up to the final time.
table 2 (second row) shows the relative ℓ2-errors (eq. 6.1) for both node and pnode. continuing from the previous experiment, we test the second testing parameter instance, D_test = {(5.22, 0.015)}, which is located outside D_train (i.e., next to µ^(7) in figure 4a). the results are shown in figures 5c-5d: the node learns only a single trajectory with the boundary condition that lies in the middle of the validation parameter instances, whereas the pnode accurately produces approximate solutions for the new testing parameter instance. table 2 (third row) reports the relative errors.

next, in the second scenario (scenario 2), we vary both parameters µ_1 and µ_2 as shown in figure 4b, which depicts the sets of training, validation, and testing parameter instances. we have tested the set of testing parameter instances, and table 3 reports the relative errors; the results show that pnode achieves sub-1% error in most cases, whereas node incurs errors of around 10% in most cases. the 1.7% error of node for µ_1^test is achieved only because that testing parameter instance is located in the middle of the validation parameter instances (and the training parameter instances).

study on the effective latent dimension. we have further tested the framework with node and pnode for varying latent dimensions p = {1, 2, 3, 4, 5}, with the same hyperparameters described in table 1, a maximum of 50,000 epochs, and the same training/validation/testing split shown in figure 4b. for all testing parameter instances, the dimension of the latent states only marginally affects the performance of the nodes. we believe this is because node learns a dynamics that minimizes the mse over the four validation parameter instances in D_val regardless of the latent dimension. on the other hand, decreasing the latent dimension below three (p < 3) negatively affects the performance of the pnodes for all testing parameter instances.
nevertheless, even with a latent dimension of one, p = 1, pnode still outperforms node for all testing parameter instances; with p = 2, pnode starts to produce almost order-of-magnitude more accurate approximate solutions than node does. moreover, we observe that, for the given training data, training strategy, and hyperparameters, increasing the latent dimension beyond p = 2 only marginally improves the accuracy of the solutions, which agrees with the observations made in [41]; in some cases, a neural network with larger p (i.e., p > 5) requires more epochs to reach the level of training accuracy that is achieved by a neural network with smaller p (i.e., p = {3, 4, 5}).

study on varying training/validation data sets. we now assess the performance of the proposed framework for different settings of the training and validation data sets. this experiment illustrates the dependence of the framework on the amount of training data as well as on the settings of the training/validation/testing splits. to this end, we have trained and tested the framework with pnode on the three sets of parameter-instance samplings shown in figure 7, where the first set, in figure 7a, is equivalent to the set in figure 4b. while all three sets share the same testing parameter instances, sets 2 and 3 (figures 7b and 7c) are built incrementally upon set 1 by adding training and validation parameter instances: compared to set 1, set 2 has two more training parameter instances, {µ^(13), µ^(15)}, and one more validation parameter instance, µ^(14), as shown in figure 7b, and set 3 has four more training parameter instances, {µ^(13), µ^(15), µ^(16), µ^(18)}, and two more validation parameter instances, {µ^(14), µ^(17)}, as shown in figure 7c.
we again use the same hyperparameters described in table 1 with a maximum of 50,000 epochs on the training sets depicted in figure 7, and table 4 reports the accuracy of the approximate solutions computed for the testing parameter instances. increasing the number of training/validation parameter instances has virtually no effect on the accuracy of the approximation measured at the testing parameter instance µ^(5); that is, for the given network architecture (table 1), increasing the amount of training/validation data does not significantly affect the performance of the framework there. on the other hand, increasing the number of training/validation parameter instances in the way shown in figure 7 significantly improves the accuracy of the approximations for the other testing parameter instances, {µ^(10), µ^(11), µ^(12)}. this set of experiments essentially illustrates that, for a given network architecture, more accurate approximations can be achieved for testing parameter instances that lie in between training/validation parameter instances (i.e., interpolation in the parameter space) than for those that lie outside of them (i.e., extrapolation in the parameter space).

table 4: prediction scenario 2: the relative ℓ2-errors (eq. 6.1) of the approximate solutions for testing parameter instances computed using the framework with pnode. each framework with pnode is trained on a different training/validation set depicted in figure 7.
(table 4 entries: 1.0735 × 10^-2, 6.5199 × 10^-3, 3.1217 × 10^-3.)

the reaction model of a premixed h2-air flame at constant uniform pressure [5] is described by the equation:

∂w(x, t; µ)/∂t = ∇ · (ν∇w(x, t; µ)) − v · ∇w(x, t; µ) + q(w(x, t; µ); µ), (6.3)

where, on the right-hand side, the first term is the diffusion term with the spatial gradient operator ∇ and the molecular diffusivity ν = 2 cm²·s⁻¹, the second term is the convective term with the constant, divergence-free velocity field v = [50 cm·s⁻¹, 0]^T, and the third term is the reactive term with the reaction source term q. the solution w corresponds to the thermo-chemical composition vector, w = [w_T, w_H2, w_O2, w_H2O]^T, where w_T denotes the temperature and w_H2, w_O2, w_H2O denote the mass fractions of the chemical species h2, o2, and h2o. the reaction source term is of arrhenius type, defined as q(w; µ) = [q_T(w; µ), q_H2(w; µ), q_O2(w; µ), q_H2O(w; µ)]^T, where q_T(w; µ) = Q q_H2O(w; µ). here (v_H2, v_O2, v_H2O) = (2, 1, −2) denote stoichiometric coefficients, (W_H2, W_O2, W_H2O) = (2.016, 31.9, 18) denote molecular weights in units of g·mol⁻¹, ρ = 1.39 × 10⁻³ g·cm⁻³ denotes the density of the mixture, R = 8.314 J·mol⁻¹·K⁻¹ denotes the universal gas constant, and Q = 9800 K denotes the heat of the reaction. the problem has two input parameters (i.e., n_µ = 2), µ = (µ_1, µ_2) = (A, E), where A and E denote the pre-exponential factor and the activation energy, respectively. on γ_4, γ_5, and γ_6 we impose the homogeneous neumann condition, and the initial condition is set as w_T = 300 K and (w_H2, w_O2, w_H2O) = (0, 0, 0) (i.e., empty of chemical species). for collecting data, we employ a finite-difference method with 64 × 32 uniform grid points (i.e., N = n_u × n_1 × n_2 = 4 × 64 × 32) and the second-order backward differentiation formula (bdf2) with a uniform time step ∆t = 10⁻⁴ and final time 0.06 (i.e., n_t = 600).
figure 9 depicts snapshots of the reference solutions of each species for the training parameter instance (µ_1, µ_2) = (2.3375 × 10^12, 5.6255 × 10^3). each species in this problem has a different numeric scale: the magnitude of w_T is about four orders of magnitude larger than those of the other species (see figure 9). to bring the values of each species into a common range [0, 1], we apply zero-one scaling to each species separately. moreover, because the values of A and E are several orders of magnitude larger than those of the species, we scale the input parameters as well to match the scales of the chemical species: we simply divide the first parameter and the second parameter by 10^13 and 10^4, respectively. after these scaling operations, we train the framework with the hyperparameters specified in table 5, where the input data consist of 2-dimensional data with 4 channels and the reduced dimension is again set to p = 5.

in this experiment, we vary two parameters: the pre-exponential factor µ_1 = A and the activation energy µ_2 = E. we consider the parameter instances depicted in figure 10. table 6 presents the relative ℓ2-errors of approximate solutions computed using node and pnode for testing parameter instances in the predictive scenario. the first three rows in table 6 correspond to the results for the testing parameter instances at the middle three red circles in figure 10. as expected, both node and pnode work well for these testing parameter instances: node is expected to work well here because the single trajectory that minimizes the mse over the validation parameter instances would be the trajectory associated with the testing parameter µ^(8). as we consider testing parameter instances that are more distant from µ^(8), we observe pnode to be (significantly) more accurate than node. from these observations, the node model can be considered to be overfitted to a trajectory that minimizes the mse.
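the preprocessing described above (per-species zero-one scaling plus the fixed divisors for the parameters) can be sketched directly; the function names here are for illustration only.

```python
def zero_one_scale(field):
    """Min-max scale one species' snapshots to [0, 1] (applied separately
    per species because w_T is ~4 orders of magnitude larger than the
    mass fractions)."""
    lo, hi = min(field), max(field)
    if hi == lo:
        return [0.0] * len(field)
    return [(v - lo) / (hi - lo) for v in field]

def scale_params(mu):
    """Bring (A, E) to the same order as the scaled species, dividing by
    10^13 and 10^4 respectively, as in the text."""
    return (mu[0] / 1e13, mu[1] / 1e4)

scaled_temps = zero_one_scale([300.0, 800.0, 1300.0])
a_scaled, e_scaled = scale_params((2.3375e12, 5.6255e3))
```

note the scaling statistics must be computed on the training snapshots and then reused at test time; otherwise the network sees inconsistently normalized inputs.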
this overfitting can be avoided to a certain extent by applying, e.g., early stopping; however, this cannot fundamentally fix the problem of the node (i.e., fitting a single trajectory to the entire input data distribution).

for the third benchmark problem, we consider the quasi-one-dimensional euler equations for modeling inviscid compressible flow in a one-dimensional converging-diverging nozzle with a continuously varying cross-sectional area [44]. the system of governing equations is

∂(ρA)/∂t + ∂(ρuA)/∂x = 0,
∂(ρuA)/∂t + ∂((ρu² + p)A)/∂x = p ∂A/∂x,
∂(eA)/∂t + ∂((e + p)uA)/∂x = 0.

here, ρ denotes density, u denotes velocity, p denotes pressure, ε denotes energy per unit mass, e denotes total energy density, γ denotes the specific heat ratio, and A(x) denotes the converging-diverging nozzle cross-sectional area. we consider a specific heat ratio of γ = 1.3, a specific gas constant of R = 355.4 m²/s²/K, a total temperature of T_total = 300 K, and a total pressure of p_total = 10^6 N/m². the cross-sectional area A(x) is determined by a cubic spline interpolation over a set of points; figure 11 depicts schematics of the converging-diverging nozzle determined by A(x), parameterized by the width of the middle cross-sectional area, µ. a perfect gas, which obeys the ideal gas law (i.e., p = ρRT), is assumed. for the initial condition, the initial flow field is computed as follows: a zero pressure-gradient flow field is constructed via the isentropic relations, where M denotes the mach number, c denotes the speed of sound, and the subscript m indicates the flow quantity at x = 0.5 m. the shock is located at x = 0.85 m and the velocity across the shock, u_2, is computed by using the jump relations for a stationary shock and the perfect-gas equation of state. the velocity across the shock satisfies a quadratic equation built from the invariants m = ρ_2 u_2 = ρ_1 u_1, n = ρ_2 u_2² + p_2 = ρ_1 u_1² + p_1, and h = (e_2 + p_2)/ρ_2 = (e_1 + p_1)/ρ_1, where the subscripts 1 and 2 indicate quantities to the left and to the right of the shock, respectively. we consider a specific mach number of M_m = 2.0.
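the jump-relation quadratic can be made explicit. using the perfect-gas total specific enthalpy h = γ/(γ−1)·p/ρ + u²/2 together with ρ_2 = m/u_2 and p_2 = n − m·u_2 gives (1/2 − γ/(γ−1))·u_2² + (γ/(γ−1))·(n/m)·u_2 − h = 0; these coefficients are our own derivation from the invariants stated in the text, not quoted from the paper. one root trivially recovers the pre-shock velocity u_1, the other is the post-shock velocity:

```python
import math

def shock_velocity(rho1, u1, p1, gamma=1.3):
    """Solve the stationary-shock quadratic for u2 given the left state.

    Coefficients derived (assumption) from m, n, h and the perfect-gas
    enthalpy h = gamma/(gamma-1) * p/rho + u^2/2."""
    g = gamma / (gamma - 1.0)
    m = rho1 * u1                      # mass flux, conserved across the shock
    n = rho1 * u1 ** 2 + p1            # momentum flux
    h = g * p1 / rho1 + 0.5 * u1 ** 2  # total specific enthalpy
    a, b, c = 0.5 - g, g * n / m, -h
    disc = math.sqrt(b * b - 4.0 * a * c)
    r1 = (-b + disc) / (2.0 * a)
    r2 = (-b - disc) / (2.0 * a)
    trivial = min((r1, r2), key=lambda r: abs(r - u1))  # recovers u1 (no jump)
    post = max((r1, r2), key=lambda r: abs(r - u1))     # post-shock velocity u2
    return trivial, post

# illustrative supersonic left state (not the paper's values)
trivial, u2 = shock_velocity(1.0, 600.0, 1.0e5)
```

a quick sanity check on the returned root: reconstructing ρ_2 = m/u_2 and p_2 = n − m·u_2 from it must reproduce the same total enthalpy h, and for a compressive shock u_2 < u_1.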
for the spatial discretization, we employ a finite-volume scheme with 128 equally spaced control volumes and fully implicit boundary conditions, which leads to N = n_u n_1 = 3 × 128 = 384. at each inter-cell face, the roe flux-difference vector splitting method is used to compute the flux. for the time discretization, we employ the backward-euler scheme with a uniform time step ∆t = 10⁻³ and final time 0.6 (i.e., n_t = 600). figure 12 depicts snapshots of the reference solutions.

the varying parameter of this problem is the width of the middle cross-sectional area, which determines the geometry of the spatial domain and thus determines the initial condition as well as the dynamics. analogously to the previous two benchmark problems, we select 4 training parameter instances, 3 validation parameter instances, and 3 testing parameter instances (figure 13); again, we set the reduced dimension to p = 5. we train the framework with either node or pnode for learning the latent dynamics and test the framework in the predictive scenario (i.e., for the unseen testing parameter instances shown in figure 13); figure 14 depicts the solution snapshots at t = {0.1, 0.2, 0.3, 0.4, 0.5, 0.6}. we observe that pnode yields moderate improvements over node, i.e., about a 20% decrease in the relative ℓ2-norm of the error (6.1) for the testing parameter instances. the improvements are not as dramatic as those in the previous two benchmark problems. we believe this is because, in this problem setting, varying the input parameter results in fairly distinct initial conditions but does not significantly affect variations in the dynamics; both the initial condition and the dynamics are parameterized by the same input parameter, the width of the middle cross-sectional area of the spatial domain. our general observation is that, although pnode outperforms node in all our benchmark problems, the benefits of using pnode are most pronounced when the dynamics are parameterized and there is a single initial condition.
we expect to see larger improvements in approximation accuracy over node when the dynamics vary significantly for different input parameters, for instance in compartmental modeling (e.g., sir and seir models) of infectious diseases such as the novel coronavirus (covid-19) [1, 68], where the dynamics of transmission are greatly affected by the model parameters, which are determined by, e.g., quarantine policy and social distancing. other potential applications of pnodes include modeling i) the response of a quantity of interest of parameterized partial differential equations and ii) the errors of reduced-order models of dynamical systems [50].

our approach shares the same limitation as other data-driven rom approaches: it does not guarantee the preservation of important physical properties such as conservation. this is a particularly challenging issue, but there have been recent advances in deep-learning approaches for enforcing such structure (e.g., enforcing conservation laws in subdomains [40], hyperbolic conservation laws [57], hamiltonian mechanics [25, 67], symplectic structure [11, 32], lagrangian mechanics [14], and metriplectic structure [29]), and we believe that adapting/extending the ideas of these approaches could mitigate this limitation of data-driven rom approaches.

in this study, we proposed a parameterized extension of neural odes and a novel framework for reduced-order modeling of complex numerical simulations of computational physics problems. our simple extension allows neural ode models to learn multiple complex trajectories, overcoming the main drawback of neural odes, namely that only a single set of dynamics is learned for the entire data distribution. we have demonstrated the effectiveness of parameterized neural odes on several benchmark problems from computational fluid dynamics and have shown that the proposed method outperforms neural odes.

this paper describes objective technical results and analysis.
any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the u.s. department of energy or the united states government. sandia national laboratories is a multimission laboratory managed and operated by national technology & engineering solutions of sandia, llc, a wholly owned subsidiary of honeywell international inc., for the u.s. department of energy's national nuclear security administration under contract de-na0003525.

references
[1] a multi-risk sir model with optimally targeted lockdown, tech. rep.
[2] learning dynamical systems from partial observations
[3] a survey of projection-based model reduction methods for parametric dynamical systems
[4] model reduction and approximation: theory and algorithms
[5] projection-based model reduction for reacting flows
[6] galerkin v. least-squares petrov-galerkin projection in nonlinear model reduction
[7] the gnat method for nonlinear model reduction: effective implementation and application to computational fluid dynamics and turbulent flows
[8] neural optimal control for representation learning
[9] advances in neural information processing systems
[10] physics-informed machine learning for reduced-order modeling of nonlinear problems
[11] symplectic recurrent neural networks
[12] nais-net: stable deep networks from non-autonomous differential equations
[13] fast and accurate deep network learning by exponential linear units (elus)
[14] lagrangian neural networks
[15] a family of embedded runge-kutta formulae
[16] augmented neural odes
[17] physics-informed autoencoders for lyapunov-stable fluid flow prediction
[18] how to train your neural ode
[19] a comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized pdes
[20] latent-space dynamics for reduced deformable simulation
[21] modeling the dynamics of pde systems with physics-constrained deep auto-regressive networks
[22] anode: unconditionally accurate memory-efficient gradients for neural odes
[23] deep convolutional recurrent autoencoders for learning low-dimensional feature dynamics of fluid systems
[24] deep learning
[25] hamiltonian neural networks
[26] stable architectures for deep neural networks, inverse problems
[27] a deep learning framework for model reduction of dynamical systems
[28] deep learning of thermodynamics-aware reduced-order models from data
[29] structure-preserving neural networks
[30] turbulence, coherent structures, dynamical systems and symmetry
[31] analysis of a complex of statistical variables into principal components
[32] sympnets: intrinsic structure-preserving symplectic networks for identifying hamiltonian systems
[33] deep variational bayes filters: unsupervised learning of state space models from raw data
[34] nonlinear model reduction by deep autoencoder of noise response data
[35] a non-intrusive multifidelity method for the reduced order modeling of nonlinear problems
[36] a fast and accurate physics-informed neural network reduced order model with shallow masked autoencoder
[37] adam: a method for stochastic optimization
[38] tensor decompositions and applications, siam review
[39] deep learning
[40] deep conservation: a latent dynamics model for exact satisfaction of physical conservation laws
[41] model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders
[42] beyond finite layer neural networks: bridging deep architectures and numerical differential equations
[43] deep learning for universal linear embeddings of nonlinear dynamics
[44] numerical computation of compressible and viscous flow
[45] reduced-order modeling of advection-dominated systems with recurrent neural networks and convolutional autoencoders
[46] time-series learning of latent-space dynamics for reduced-order model closure
[47] deep dynamical modeling and control of unsteady fluid flows
[48] reduced basis methods: success, limitations and future challenges
[49] linearly-recurrent autoencoder networks for learning dynamics
[50] time-series machine-learning error models for approximate solutions to parameterized dynamical systems
[51] a deep learning enabler for nonintrusive reduced order modeling of fluid flows
[52] the mathematical theory of optimal processes
[53] turbulence forecasting via neural ode
[54] snode: spectral discretization of neural odes for system identification
[55] reduced basis methods for partial differential equations: an introduction
[56] nonintrusive reduced order modeling framework for quasigeostrophic turbulence
[57] physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations
[58] machine learning for nonintrusive model order reduction of the parametric inviscid transonic flow past an airfoil
[59] a trajectory piecewise-linear approach to model order reduction of nonlinear dynamical systems
[60] depth separation for reduced deep networks in nonlinear model reduction: distilling shock waves in nonlinear hyperbolic problems
[61] latent odes for irregularly-sampled time series
[62] deep neural networks motivated by partial differential equations
[63] an artificial neural network framework for reduced order modeling of transient flows
[64] projection-based model reduction: formulations for physics-based machine learning
[65] learning koopman invariant subspaces for dynamic mode decomposition
[66] enabling nonlinear manifold projection reduced-order models by extending convolutional neural networks to unstructured data
[67] hamiltonian generative networks
[68] phase-adjusted estimation of the number of coronavirus disease 2019 cases in wuhan, china
[69] non-intrusive reduced order modeling of unsteady flows using artificial neural networks with application to a combustion problem
[70] model identification of reduced order fluid dynamics systems using deep learning
[71] a proposal on machine learning via dynamical systems
[72] latent space physics: towards learning the temporal evolution of fluid flow
[73] non-intrusive inference reduced order model for fluids using deep multistep neural network, mathematics
[74] ode2vae: deep generative second order odes with bayesian neural networks
[75] anodev2: a coupled neural ode framework
[76] adaptive checkpoint adjoint method for gradient estimation in neural ode
servers in dcs consume energy that is proportional to the allocated computing loads, and unfortunately, approximately 98% of the energy input is dissipated as waste heat. cooling systems are deployed to maintain the temperature of the computing servers at the vendor-specified temperature for consistent and reliable performance. koomey [1] emphasises that a dc's energy input is primarily consumed by the cooling and compute systems (the latter comprising servers in chassis and racks). thus, these two systems have been critical targets for energy savings. computing-load processing entails job and task management. on the other hand, dc cooling encompasses the installation of cooling systems and effective hot/cold aisle configurations. thermal mismanagement in a dc could be the primary contributor to it infrastructure inefficiency due to thermal degradation. server microprocessors are the primary energy consumers and waste heat dissipators [4]. generally, existing dc air-cooling systems are not sufficiently efficient to cope with the vast amount of waste heat generated by high-performance microprocessors. thus, it is necessary to disperse dissipated waste heat so that heat is evenly distributed within a premise and overheating is avoided. undeniably, a more effective energy-saving strategy is needed: one that reduces the energy consumed by a cooling system while still cooling the servers (in the compute system) efficiently. one known technique is thermal-aware scheduling, where computational workload scheduling is based on waste heat. thermal-aware schedulers adopt different thermal-aware approaches, e.g. system-level work placements [16]; executing 'hot' jobs on 'cold' compute nodes; predictive models for job schedule selection [17]; ranked node queues based on thermal characteristics of rack layouts; and optimisation (e.g. optimal setpoints for workload distribution and supply temperature of the cooling system).
heat modelling provides a model that links server energy consumption and the associated waste heat. thermal-aware monitoring acts as a thermal eye for the scheduling process and entails recording and evaluating heat distribution within dcs. thermal profiling is based on useful monitoring information on workload-related heat emission and is useful to predict the dc heat distribution. in this paper, our analysis explores the relationship between thermal-aware scheduling and computer workload scheduling. this is followed by selecting an efficient solution to evenly distribute heat within a dc to avoid hotspots and cold spots. in this work, a data mining technique is chosen for hotspot detection and thermal profiling for preventive measures. the novel contribution of the research presented in this paper is the use of a real big dataset of thermal characteristics for the enea high performance computing (hpc) cresco6 compute nodes. the analyses conducted are as follows: hotspot localisation; user categorisation based on jobs submitted to the cresco6 cluster; compute node categorisation based on the thermal behaviour of internal and surrounding air temperatures due to workload-related waste heat dissipation. this analysis aims to minimise the thermal gradient within a dc it room through the consideration of the following: different granularity levels of thermal data; energy consumption of calculation nodes; it room ambient temperature. an unsupervised learning technique has been employed to identify hotspots due to the variability of thermal data and uncertainties in defining temperature thresholds. this analysis phase involves the determination of optimal workload distribution to cluster nodes. available thermal characteristics (i.e. exhaust temperature, cpu temperatures) are inputs to the clustering algorithm. subsequently, a series of clustering results are intersected to unravel the nodes (identified by ids) that frequently fall into high-temperature areas.
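the intersection step described above can be sketched in python. the paper uses an unsupervised clustering algorithm to form the temperature bands; for brevity this sketch substitutes a fixed hot-band threshold for the clustering step (an assumption, not the authors' pipeline) and keeps the intersection logic, with made-up node ids and temperature readings:

```python
# Sketch: mark nodes as "hot" per snapshot, then intersect the hot sets across
# a series of snapshots to find nodes that persistently fall into hot areas.
# The threshold and readings are illustrative, not from the CRESCO6 dataset.

def hot_nodes(snapshot, hot_threshold=45.0):
    """Return the set of node ids whose exhaust temperature is in the hot band."""
    return {node_id for node_id, exhaust_temp in snapshot.items()
            if exhaust_temp >= hot_threshold}

def persistent_hotspots(snapshots, hot_threshold=45.0):
    """Intersect the hot sets of a series of snapshots (clustering results)."""
    hot_sets = [hot_nodes(s, hot_threshold) for s in snapshots]
    result = hot_sets[0]
    for s in hot_sets[1:]:
        result &= s
    return result

snapshots = [
    {"n01": 52.0, "n02": 38.0, "n03": 47.5, "n04": 30.0},
    {"n01": 49.0, "n02": 44.0, "n03": 46.0, "n04": 31.5},
    {"n01": 50.5, "n02": 39.5, "n03": 41.0, "n04": 29.0},
]
print(sorted(persistent_hotspots(snapshots)))  # only n01 is hot in every snapshot
```

in the paper, each snapshot's hot set would instead come from the unsupervised clustering over exhaust and cpu temperatures; the intersection then isolates the node ids that recur in the high-temperature cluster.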
the paper is organised as follows: sect. 1 - introduction; sect. 2 - background: related work; sect. 3 - methodology; sect. 4 - results and discussion; sect. 5 - conclusions and future work. in the context of a high performance computing data center (hpc-dc), it is essential to satisfy service level agreements with minimal energy consumption. this involves the following: efficient dc operations and management within recommended it room requirements, specifications, and standards; energy efficiency and effective cooling systems; optimised it equipment utilisation. dc energy efficiency has been a long-standing challenge due to the multi-faceted factors that affect it and, adding to the complexity, the trade-off between performance (in the form of productivity) and energy efficiency. interesting trade-offs between geolocations and dc energy input requirements (e.g. cold geolocations and free air-cooling; hot, sunny geolocations and solar-powered renewable energy) are yet to be critically analysed [8]. one thermal equipment-related challenge is that raising the setpoint of cooling equipment or lowering the speed of crac (computer room air conditioning) fans to save energy may, in the long term, decrease it system reliability (due to thermal degradation). a trade-off solution (between optimal cooling system energy consumption and long-term it system reliability) is yet to be researched [8]. another long-standing challenge is it resource over-provisioning, which causes energy waste due to idle servers. relevant research explores optimal allocation of pdus (power distribution units) for servers, multi-step algorithms for power monitoring, and on-demand provisioning, reviewed in [8].
other related work addresses workload management, network-level issues such as optimal routing, virtual machine (vm) allocation, and the balance between power savings and network qos (quality of service) parameters, as well as appropriate metrics for dc energy efficiency evaluation. one standard metric used by the majority of industrial dcs is power usage effectiveness (pue), proposed by the green grid consortium [2]. it shows the ratio of total dc energy utilisation to the energy consumed solely by it equipment. a plethora of dc energy efficiency metrics evaluate the following: thermal characteristics; ratio of renewable energy use; energy productivity of various it system components; etc. there is a pressing need for a holistic framework that would thoroughly characterise dcs with a fixed set of metrics and reveal potential pitfalls in their operations [3]. though some existing research work has made such attempts, to date we are yet to have a standardised framework [9, 10, 13]. to reiterate, the thermal characteristics of the it system ought to be the primary focus of an energy efficiency framework because the it system is the main energy consumer within a dc. several studies have been conducted to address this issue [12]. sungkap et al. [11] propose ambient temperature-aware capping to maximise power efficiency while minimising overheating. their research includes an analysis of the composition of energy consumed by a cloud-based dc: approximately 45% for compute systems; 40% for refrigeration-based air conditioning; and the remaining 15% for storage and power distribution systems. this implies that approximately half of the dc energy is consumed by non-computing devices. in [6], wang and colleagues present an analytical model that describes dc resources with heat transfer properties and workloads with thermal features.
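the pue definition above is a simple ratio of total facility energy to it equipment energy; a one-line helper makes it concrete (the figures below are hypothetical, not measured dc values):

```python
def pue(total_facility_energy_kwh, it_equipment_energy_kwh):
    """Power usage effectiveness: total dc energy over energy consumed solely
    by it equipment. A value of 1.0 would mean every kwh entering the dc
    reaches the it equipment; higher values indicate overhead (cooling, power
    distribution, lighting)."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("it equipment energy must be positive")
    return total_facility_energy_kwh / it_equipment_energy_kwh

# Illustrative figures only:
print(pue(1500.0, 1000.0))  # 1.5: half as much energy again goes to non-it overhead
```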
thermal modelling and temperature estimation from thermal sensors ought to consider the emergence of server hotspots and thermal stress due to increases in inlet air temperature, inappropriate positioning of a rack, or even inadequate room ventilation. such phenomena are unravelled by thermal-aware location analysis. the thermal-aware server provisioning approach to minimising total dc energy consumption calculates the value of energy by considering the maximum working temperature of the servers. this approach should consider the fact that any rise in the inlet temperature may cause the servers to reach their maximum temperature, resulting in thermal stress, thermal degradation, and severe damage in the long run. the typical identified types of thermal-aware scheduling are reactive, proactive and mixed; however, there is no reference to heat modelling or thermal monitoring and profiling. kong and colleagues [4] highlight important concepts of thermal-aware profiling, thermal-aware monitoring, and thermal-aware scheduling. thermal-aware techniques are linked to the minimisation of waste heat production, heat convection around server cores, task migrations, the thermal gradient across the microprocessor chip, and microprocessor power consumption. dynamic thermal management (dtm) techniques in microprocessors encompass the following: dynamic voltage and frequency scaling (dvfs), clock gating, task migration, and operating system (os) based dtm and scheduling. in [5], parolini and colleagues propose a heat model and provide a brief overview of power and thermal efficiency from the microprocessor micro-level to the dc macro-level. to reiterate, it is essential for dc energy efficiency to address thermal awareness in order to better understand the relationship between the thermal and the it aspects of workload management.
in this paper, the authors incorporate thermal-aware scheduling, heat modelling, thermal-aware monitoring and thermal profiling using a big thermal characteristics dataset of an hpc data center. this research involves measurement, quantification, and analysis of compute nodes and refrigerating machines. the aim of the analysis is to uncover the underlying causes of temperature rises that lead to the emergence of thermal hotspots. overall, effective dc management requires energy use monitoring, in particular: energy input, it energy consumption, and monitoring of supply air temperature and humidity at room level (i.e. granularity level 0 in the context of this research); and monitoring of air temperature at a higher granularity level (i.e. at computer room air conditioning/computer room air handler (crac/crah) unit level, granularity level 1). measurements taken are further analysed to reveal the extent of energy use and economisation opportunities for the improvement of the dc energy efficiency level (granularity level 2). dc energy efficiency metrics will not be discussed in this paper. however, the discussion in the subsequent section primarily focuses on thermal guidelines from the american society of heating, refrigerating and ac engineers (ashrae) [7]. to reiterate, our research goal is to reduce the dc-wide thermal gradient and hotspots and to maximise cooling effects. this entails the identification of individual server nodes that frequently occur in the hotspot zones through the implementation of a clustering algorithm on the workload management platform. the big thermal characteristics dataset of the enea portici cresco6 computing cluster is employed for the analysis. it has 24 measured values (or features) for each single calculation node (see table 1) and comprises measurements for the period from may 2018 to january 2020. briefly, the cresco6 cluster is a high-performance computing system (hpc) consisting of 434 calculation nodes with a total of 20832 cores.
it is based on the lenovo thinksystem sd530 platform, an ultra-dense and economical two-socket server in a 0.5 u rack form factor inserted in a 2u four-node enclosure. each node is equipped with 2 intel xeon platinum 8160 cpus (each with 24 cores) and a clock frequency of 2.1 ghz; a ram size of 192 gb, corresponding to 4 gb/core; and a low-latency intel omni-path 100 series single-port pcie 3.0 x16 hfa network interface. the nodes are interconnected by an intel omni-path network with 21 intel edge switches 100 series of 48 ports each, bandwidth equal to 100 gb/s, and latency equal to 100 ns. the connections between the nodes form a 2-tier 2:1 no-blocking tapered fat-tree topology. the power consumption of massive computing workloads amounts to a maximum of 190 kw. this work incorporates thermal-aware scheduling, heat modelling, and thermal monitoring, followed by subsequent user profiling from a "waste heat production" point of view. thermal-aware dc scheduling is designed based on data analytics conducted on real data obtained from running cluster nodes in a real physical dc. for the purpose of this work, approximately 20 months' worth of data has been collected. the data collected are related to: relevant parameters for each node (e.g. inlet air temperature, internal temperature of each node, energy consumption of cpu, ram, memory, etc.); environmental parameters (e.g. air temperatures and humidity in both the hot and cold aisles); cooling system related parameters (e.g. fan speed); and finally, the individual users who submit their jobs to cluster nodes.
this research focuses on the effect of dynamic workload assignment on the energy consumption and performance of both the computing and cooling systems. (the features in table 1 include, among others: the temperature at the front, inside (on cpu1 and cpu2) and at the rear of every single node, expressed in celsius; sysairflow, the speed of air traversing the node, expressed in cfm (cubic feet per minute); and dc energy, a meter of the total energy used by the node, updated at the corresponding timestamp and expressed in kwh.) the constraint is that each arrived job must be assigned irrevocably to a particular server without any information about impending incoming jobs. once the job has been assigned, no pre-emption or migration is allowed, a rule typically adhered to for hpc applications due to the high costs incurred by data reallocation. in this research, we particularly explore an optimised mapping of nodes that have to be physically and statically placed in advance into one of the available rack slots in the dc. this forms a matrix comprising computing units with specific characteristics and a certain resource availability level at a given time t. the goal is to create a list of candidate nodes to deliver the "calculation performance" required by a user's job. when choosing the candidate nodes, the job scheduler will evaluate the suitability of the thermally cooler nodes (at the instant t) based on their capability to satisfy the calculation requested by a user (in order to satisfy the user's sla). to enhance the job scheduler's decision making, it is essential to know in advance the type of job a user will submit to a node (or nodes) for computation. such insight is provided by several years' worth of historical data and advanced data analytics using machine learning algorithms. through platform load sharing facility (lsf) accounting data we code user profiles into 4 macro-categories. this behavioural categorisation provides an opportunity to save energy and better allocate tasks to cluster nodes to reduce overall node temperatures.
additionally, when job allocation is evenly distributed, thermal hotspots and cold spots can be avoided. the temperatures of the calculation nodes can be evened out, resulting in a more even distribution of heat across the cluster. based on thermal data, it is necessary to better understand in depth what users do and how they solicit the calculation nodes for their jobs. the three main objectives of understanding users' behaviour are as follows: identify parameters based on the diversity of submitted jobs for user profiling; analyse the predictability of various resources (e.g. cpu, memory, i/o) and identify their time-based usage patterns; build predictive models for estimating future cpu and memory usage based on historical data held in the lsf platform. abstraction of behavioural patterns in job submission and the associated resource consumption is necessary to predict future resource requirements. this is exceptionally vital for dynamic resource provisioning in a dc. a user profile is created based on submitted job-related information and, to reiterate, the 4 macro-categories of user profiles are: 1) cpu-intensive, 2) disk-intensive, 3) both cpu- and memory-intensive, or 4) neither cpu- nor memory-intensive. a crosstab of the accounting data (provided by the lsf platform) and resource consumption data helps guide the calculation of the relevant thresholds that code jobs into several distinct utilisation categories. for instance, if the cpu load is high (e.g., larger than 90%) during almost 60% of the job running time for an application, then the job can be labelled as cpu-intensive. the goal is for the job scheduler to optimise task scheduling when a job with the same appid (i.e. the same type of job) or the same username is re-submitted to a cluster. in the case of a match with a previous appid or username, the relevant utilisation stats from the profiled log are retrieved.
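the threshold-based labelling described above can be sketched as follows. the 90% load and roughly-60%-of-runtime figures for cpu come from the text; the analogous memory threshold, the function names and the sampled loads are assumptions for illustration:

```python
# Sketch of the crosstab-style job labelling described in the text.

CPU_LOAD_HIGH = 90.0       # % cpu load considered "high" (from the text)
MEM_LOAD_HIGH = 90.0       # assumed analogous threshold for memory
FRACTION_OF_RUNTIME = 0.6  # fraction of running time above the load threshold

def is_intensive(samples, high_threshold, runtime_fraction=FRACTION_OF_RUNTIME):
    """True if the load exceeds high_threshold for at least runtime_fraction
    of the sampled running time."""
    if not samples:
        return False
    above = sum(1 for s in samples if s > high_threshold)
    return above / len(samples) >= runtime_fraction

def categorise_job(cpu_samples, mem_samples):
    """Map a job's sampled utilisation to one of the 4 macro-categories."""
    cpu = is_intensive(cpu_samples, CPU_LOAD_HIGH)
    mem = is_intensive(mem_samples, MEM_LOAD_HIGH)
    if cpu and mem:
        return "cpu&memory-intensive"
    if cpu:
        return "cpu-intensive"
    if mem:
        return "memory-intensive"
    return "neither"

print(categorise_job([95, 97, 92, 40, 96], [20, 25, 30, 22, 18]))  # cpu-intensive
```

a re-submitted job with a matching appid or username would skip the sampling and reuse the stored label, as the text describes.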
based on the utilisation patterns, this particular user/application will be placed into one of the 4 previously discussed categories. once a job is categorised, a thermally suitable node is selected to satisfy the task's calculation requirements. a task with high cpu and memory requirements will not be immediately processed until the node temperature is well under a safe temperature threshold. node temperature refers to the difference between the node's outlet exhaust air and inlet air temperatures (note: the latter generally corresponds to the air temperature in the aisles cooled by the air conditioners). it is necessary to have a snapshot of the relevant thermal parameters (e.g. temperatures of each component in the calculation nodes) for each cluster to facilitate efficient job allocation by the job scheduler. generally, a snapshot is obtained through direct interrogation of the nodes and of sensors installed in their vicinity or inside the calculation nodes. for each individual node, the temperatures of the cpus and memories, the instantaneous energy consumption and the speed of the cooling fans are evaluated. undeniably, the most highly prioritised parameter is the difference between the node's inlet and exhaust air temperatures: if there is a marked difference, it is evident that the node is very busy (with jobs that require a lot of cpu- or memory-related resource consumption). therefore, for each calculation node, the relevant data is monitored in real time and subsequently stored virtually in a matrix that represents the state of the entire cluster. each matrix cell represents the state of a node (described by the relevant parameters). for a new job allocation, the scheduling algorithm will choose a node based on its state depicted in the matrix (e.g. by recency or euclidean distance). through this, generated waste heat is evenly distributed over the entire "matrix" of calculation nodes so that hotspots can be significantly reduced.
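the cluster "state matrix" and the exhaust-minus-inlet busyness proxy described above can be sketched as follows. the field names and the temperature readings are illustrative assumptions; only the difference-based ranking comes from the text:

```python
# Sketch: each matrix cell holds a node's thermal state; the scheduler picks
# the node with the smallest exhaust-minus-inlet temperature difference
# (the "node temperature" busyness proxy from the text).

def node_temperature(state):
    """Difference between the node's outlet exhaust air and inlet air temperatures."""
    return state["exhaust_c"] - state["inlet_c"]

def coolest_free_node(matrix):
    """Scan the r x c state matrix and return the coordinates of the least busy node."""
    best, best_pos = None, None
    for r, row in enumerate(matrix):
        for c, state in enumerate(row):
            delta = node_temperature(state)
            if best is None or delta < best:
                best, best_pos = delta, (r, c)
    return best_pos

matrix = [
    [{"inlet_c": 18.0, "exhaust_c": 46.0}, {"inlet_c": 17.5, "exhaust_c": 29.0}],
    [{"inlet_c": 18.5, "exhaust_c": 52.0}, {"inlet_c": 18.0, "exhaust_c": 40.0}],
]
print(coolest_free_node(matrix))  # (0, 1): the smallest inlet/exhaust gap
```

a fuller state would also carry cpu/memory temperatures, energy meters and fan speeds, as listed in the text, and the selection could use recency or a euclidean distance over those features instead of the single difference.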
additionally, a user profile is an equally important criterion for resource allocation. this is because user profiles provide insights into user consumption patterns and the type of submitted jobs and their associated parameters. for example, if we know that a user will perform cpu-intensive jobs for 24 h, we will allocate the job to a "cell" (calculation node) or a group of cells (when the amount of resources requires many calculation nodes) that are physically well distributed or at antipodal locations. this selection strategy aims to evenly spread out the high-density nodes, followed by the necessary cooling. this will help minimise dc hotspots and ensure efficient cooling with a reduction in cooling-related energy consumption. as previously discussed, we have created user profiles based on submitted job-related information. undeniably, these profiles are dynamic because they are constantly revised based on user resource consumption behaviour. for example, a user may have been classified as "cpu-intensive" for a certain time period; however, if the user's submitted jobs are no longer cpu-intensive, then the user will be re-categorised. the deployment of the thermal-aware job scheduler generally aims to reduce the overall cpu/memory temperatures and the outlet temperatures of cluster nodes. the following design principles guide the design and implementation of the job scheduler: 1) job categories - assign an incoming job to one of these 4 categories: cpu-intensive, memory-intensive, neither cpu- nor memory-intensive, and both cpu- and memory-intensive tasks; 2) utilisation monitoring - monitor cpu and memory utilisation while making scheduling decisions; 3) redline temperature control - ensure cpus and memory operate under threshold temperatures; 4) average temperature maintenance - monitor average cpu and memory temperatures in a node and manage an average exhaust air temperature across the cluster.
to reiterate, user profile categorisation is facilitated by maintaining a log profile of both cpu and memory utilisation for every job (with an associated user) processed in the cluster. a log file contains the following user-related information: (1) user id; (2) application identification; (3) the number of submitted jobs; (4) cpu utilisation; (5) memory utilisation. a list of important thermal management-related terms is as follows: 1) cpu-intensive - applications that are computation intensive (i.e. require a lot of processing time); 2) memory-intensive - applications of which a significant portion requires ram processing and disk operations; 3) maximum (redline) temperature - the maximum operating temperature specified by a device manufacturer or a system administrator; 4) inlet air temperature - the temperature of the air flowing into a data node (i.e. the temperature of the air sucked in from the front of the node); 5) exhaust air temperature - the temperature of the air coming out of a node (i.e. the temperature of the air extracted from the rear of the node). by applying these evaluation criteria, we have built an automated procedure that assigns users to the 4 associated categories (based on present and historical data). the algorithm always makes a comparison between a job just submitted by a user and the time series (if any) of the same user. if the application launched or the type of submitted job remains the same, then the user will be grouped into one of the 4 categories (based on a supervised learning algorithm). during each job execution, the temperature variations of the cpus and memories are recorded at pre-established time intervals. finally, the procedure continuously refines the user's behavioural profile based on the average length of time the user's jobs run. this provides a more accurate user (and job) profile because it yields reliable information on the type of job processed in a calculation node and its total processing time.
the job scheduler will exploit such information for better job placement within an ideal array of calculation nodes in the cluster. a preliminary study was conducted to provide insight into the functioning of the cluster. for 8 months, we observed the power consumption (fig. 1) and temperature (fig. 2) profiles of the nodes under workloads. we have depicted the energy consumed by the various server components (cpu, memory, other) in fig. 3 and presented a graph that highlights the difference in energy consumption between idle and active nodes (fig. 4). it is observed that, for each node, an increase in load effects an increase in the temperature difference between inlet and exhaust air for that particular node. figure 5 depicts the average observed inlet air temperature (blue segment, in the cold aisle) and the exhaust air temperature at the rear side (amaranth segment, in the hot aisle). note that temperature measurements are also taken at the two cpus of every node. the setpoints of the cooling system are approximately 18 °c at the output and 24 °c at the input of the cooling system, respectively shown in fig. 5 as blue and red vertical lines. however, it appears that the lower setpoint is variable (supply air at 15-18 °c) while the higher setpoint varies from 24-26 °c. as observed from the graph, the cold aisle maintains the setpoint temperature at the inlet of the node, which affirms the efficient design of the cold aisle (i.e. due to the use of plastic panels isolating the cold aisle from other spaces in the it room). however, the exhaust air temperature has registered, on average, a level 10 °c higher than the hot aisle setpoint. notably, the exhaust temperature sensors are located directly at the rear of the node (i.e. in the hottest parts of the hot aisle). therefore, it is observed that hotspots are located immediately at the back of the server racks, while the hot aisle air is cooled down to 24-26 °c.
this is due to the cooling system at the crac (computer room air conditioning) unit, which results in hot air intake, air circulation and cold-hot air mixing in the hot aisle. meanwhile, the previously mentioned temperature difference of 10 °c between the hotspots and the ambient temperature unravels the cooling system's weak point: it cannot directly cool the hotspots. in the long term, the constant presence of the hotspots might affect the servers' performance (i.e. thermal degradation), which should be carefully addressed by the dc operator. remarkably, although the hotspots are present at the rear of the nodes, the cooling system does cool the temperatures around the nodes. cold air flows through the node and is measured at the inlet, then at the cpu 2 and cpu 1 locations (directly on the cpus) and finally at the exhaust point of the server. the differences between the observed temperature ranges at these locations are averaged over all the nodes. an investigation of the observed temperature distribution contributes to the overall understanding of the thermal characteristics, as it provides an overview of the prevailing temperatures shown in fig. 5 and fig. 6. for every type of thermal sensor, the temperature values are recorded as integer numbers, so the percentage of occurrences of each value is calculated. the inlet air temperature is registered around 18 °c in the majority of cases and has risen up to 28 °c in around 0.0001% of cases. it can be concluded that the cold aisle temperature remains around the 15-18 °c setpoint for most of the monitored period. the ranges of the exhaust temperature and those of cpus 1 and 2 are 20-60 °c, with the most frequently monitored values in the interval 18-50 °c. although these observations might incur measurement errors, they reveal servers that are at risk of frequent overheating when benchmarked against manufacturers' recommendation data sheets.
additionally, this study focuses on the variation between subsequent thermal measurements with the aim of exploring temperature stability around the nodes. all temperature types have distinct peaks of zero variation which decrease symmetrically and assume a gaussian distribution. it can be concluded that temperature tends to be stable in the majority of monitored cases. however, the graphs for exhaust and cpu 1 and 2 temperature variation (fig. 6) reveal that less than 0.001% of the recorded measurements show an amplitude of air temperature change of 20 °c or more occurring at the corresponding locations. sudden infrequent temperature fluctuations are less dangerous than prolonged periods of constantly high temperatures. nevertheless, further investigation is needed to uncover the causes of abrupt temperature changes so that appropriate measures can be undertaken by dc operators to maintain prolonged periods of constantly favourable conditions. we propose a scheduler upgrade which aims to optimise cpu- and memory-related resource allocation, as well as exhaust air temperatures, without relying on profile information. the prescribed targets for the proposed job scheduler are shown in table 2. the design of the proposed job scheduler ought to address four issues: 1) differentiate between cpu-intensive tasks and memory-intensive tasks; 2) consider cpu and memory utilisation during the scheduling process; 3) maintain cpu and memory temperatures under the threshold redline temperatures; 4) minimise the average exhaust air temperature of nodes to reduce cooling cost. the job scheduler receives feedback on node status by querying the confluent platform [15] (monitoring software installed on each node). when all the nodes are busy, the job scheduler manages the temperatures and embarks on a load-balancing procedure by keeping track of the coolest nodes in the cluster. in doing so, the scheduler continues job executions even in hot yet undamaging conditions.
the job scheduler maintains the average cluster cpu and memory utilisation, represented by u_{cpuavg} and u_{memavg}, and the average cpu and memory temperatures, represented by t_{cpuavg} and t_{memavg}, respectively. the goal of our enhanced job scheduler is to maximise the cop (coefficient of performance). the 7 constraints (at node level) for our enhanced scheduler include that each job is assigned to at most one node and (6) that the response time of a job is minimised. the first and second constraints, when satisfied, ensure that the memory and cpu temperatures remain below the threshold temperatures; if a cluster's nodes exceed the redline threshold, the temperature is optimised by assigning jobs to the coolest node in the cluster. the third constraint specifies that if the average temperature of memory or cpu rises above the maximum temperature, then the scheduler should stop scheduling tasks, as it might encounter hardware failures. the fourth constraint states that the exhaust air temperature of a node should be the same as or less than the average exhaust air temperature of the cluster (taking into consideration its n nodes). the fifth constraint ensures that a node gets at most one job at a single point in time. the last constraint aims at reducing the completion time of a job to achieve optimal performance.
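the node-level admission logic implied by the constraints above can be sketched as a single predicate. the redline threshold values, field names and readings below are placeholders, not vendor figures or the authors' parameters:

```python
# Hedged sketch of the thermal admission checks implied by the scheduler
# constraints: temperatures below redlines, exhaust at or below the cluster
# average, and at most one job per node at a time.

T_CPU_REDLINE = 85.0  # assumed redline cpu temperature (celsius)
T_MEM_REDLINE = 75.0  # assumed redline memory temperature (celsius)

def node_admissible(node, cluster_avg_exhaust):
    """A node may receive a job only if it satisfies the thermal constraints."""
    return (node["t_cpu"] < T_CPU_REDLINE
            and node["t_mem"] < T_MEM_REDLINE
            and node["t_exhaust"] <= cluster_avg_exhaust
            and node["running_jobs"] == 0)  # at most one job per node at a time

nodes = [
    {"t_cpu": 70.0, "t_mem": 60.0, "t_exhaust": 33.0, "running_jobs": 0},
    {"t_cpu": 88.0, "t_mem": 60.0, "t_exhaust": 30.0, "running_jobs": 0},  # cpu over redline
    {"t_cpu": 70.0, "t_mem": 60.0, "t_exhaust": 40.0, "running_jobs": 1},  # busy, hot exhaust
]
cluster_avg_exhaust = sum(n["t_exhaust"] for n in nodes) / len(nodes)
print([node_admissible(n, cluster_avg_exhaust) for n in nodes])
```

in the full scheduler, jobs failing these checks would be held back or redirected to the coolest admissible node, and scheduling would stop entirely when average temperatures exceed the maximum, as the third constraint requires.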
the following is a description of our algorithm:

    **** matrix of nodes with position r-ow and c-olumn ****
    cluster = matrix[r, c]
    user = getuserfromsubmittedjob_in_lsf
    jobtype = getjobprofile(user)
    **** push the utilisation and temperature values for cpu and memory into the matrix ****
    for (i = 0; i < number_of_node; i++) do
        nodename   = getnodename(i)
        u_i_cpu    = getcpu_utilization(nodename)
        u_i_memory = getmemory_utilization(nodename)
        t_i_cpu    = getcpu_temperature(nodename)
        t_i_memory = getmemory_temperature(nodename)
    end for
    **** if a user is not profiled ****
    if jobtype = null then
        **** try to understand the job type at run time ****
        if (ucpu <= u_threshold_cpu) && (umemory <= u_threshold_memory) then
            jobtype = easyjob
        else if (ucpu > u_threshold_cpu) && (umemory < u_threshold_memory) then
            jobtype = cpuintensivejob
        else if (ucpu <= u_threshold_cpu) && (umemory > u_threshold_memory) then
            jobtype = memoryintensivejob
        else
            jobtype = cpu&memoryintensivejob
        end if
    end if
    **** find the candidate nodes for each type of job ****
    avgtempcluster = avgtemp(cluster)
    mint_nodename  = gettempnodename(mintemp(cluster))
    maxt_nodename  = gettempnodename(maxtemp(cluster))
    **** intervals of temperatures for candidate nodes ****
    bestcpuintensivenode        = getnode(mint_nodename, mint_nodename + 25%)
    bestmemoryintensivenode     = getnode(mint_nodename + 50%, mint_nodename + 75%)
    bestcpu&memoryintensivenode = getnode(mint_nodename + 25%, mint_nodename + 50%)
    besteasyjob                 = getnode(maxt_nodename, maxt_nodename - 25%)
    **** job assignments ****
    if jobtype = cpuintensivejob then
        assignjob(bestcpuintensivenode)
    else if jobtype = memoryintensivejob then
        assignjob(bestmemoryintensivenode)
    else if jobtype = cpu&memoryintensivejob then
        assignjob(bestcpu&memoryintensivenode)
    else
        assignjob(besteasyjob)
    end if

the algorithm feeds the node matrix by considering the physical arrangement of every single node inside the racks.
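one reading of the temperature-interval selection in the pseudocode above is that the cluster's minimum and maximum node temperatures define a range, and each job type is steered to a band of that range (the coolest quarter for cpu-intensive jobs, the hottest quarter for easy jobs). this interpretation, and the node ids and temperatures below, are assumptions for illustration:

```python
# Sketch: map each job type to a band of the cluster's min-max temperature
# range, mirroring the getnode(mint +/- 25% ...) calls in the pseudocode.

def pick_band(temps, job_type):
    """Return node ids whose temperature falls in the band for this job type."""
    lo, hi = min(temps.values()), max(temps.values())
    span = hi - lo
    bands = {
        "cpuintensivejob":        (lo,               lo + 0.25 * span),
        "cpu&memoryintensivejob": (lo + 0.25 * span, lo + 0.50 * span),
        "memoryintensivejob":     (lo + 0.50 * span, lo + 0.75 * span),
        "easyjob":                (hi - 0.25 * span, hi),
    }
    band_lo, band_hi = bands[job_type]
    return sorted(n for n, t in temps.items() if band_lo <= t <= band_hi)

temps = {"n01": 30.0, "n02": 35.0, "n03": 40.0, "n04": 45.0, "n05": 50.0}
print(pick_band(temps, "cpuintensivejob"))  # coolest quarter of the 30-50 range
```

steering the hottest jobs toward the coolest band is what evens out the waste heat over the node matrix and averts hotspots, as the surrounding text explains.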
firstly, the algorithm obtains the profile of the user who submits a resource request, by retrieving the user's profile from a list of stored profiles. the algorithm is then executed over all the nodes to assess the resource utilisation level and temperature profile of each node. if the user profile does not exist, that is, when a user executes a job for the first time, the algorithm calculates a profile on the fly. all the indicated threshold values are operating values calculated for each cluster configuration and are periodically recalculated and revised according to the use of the cluster nodes. subsequently, some temperature calculations are made from the current state of the cluster (through a snapshot of the thermal profile). finally, the last step assigns the job to a node based on the expected type of job. in this way, the algorithm helps avert the emergence of hotspots and cold spots by distributing jobs uniformly across the cluster. in order to support sustainable development goals, energy efficiency ought to be the ultimate goal for a dc with a sizeable high-performance computing facility. to reiterate, this work primarily focuses on two major aspects: it equipment energy productivity, and the thermal characteristics of an it room and its infrastructure. the findings of this research are based on the analysis of available monitored thermal-characteristics data for cresco6, and they feed into recommendations for enhanced thermal design and load management. in this research, clustering performed on big datasets of cresco6 it room temperature measurements has grouped nodes into clusters based on their thermal ranges, followed by identification of the clusters they most frequently fall into during the observation period. additionally, a data mining algorithm has been employed to locate the hotspots: approximately 8% of the nodes have been frequently placed in the hot range category and are thus labelled as hotspots. 
several measures to mitigate the risks associated with hotspots have been recommended: more efficient directional cooling, load management, and continuous monitoring of the it room thermal conditions. this research brings about two positive effects in terms of dc energy efficiency. firstly, as a thermal design pitfall, hotspots pose a risk of local overheating and of thermal degradation of servers due to prolonged exposure to high temperatures. information on hotspot localisation could therefore facilitate better thermal management of the it room, in which waste heat would be evenly distributed; this ought to be the focus of enhanced thermal management in the future. secondly, we discussed ways to avert hotspots through thermal-aware resource allocation (i.e. selecting the coolest node for a new incoming job) and through selection of nodes (for a particular job) that are physically distributed throughout the it room. 
key: cord-322746-28igib4l authors: gosche, john r.; vick, laura title: acute, subacute, and chronic cervical lymphadenitis in children date: 2007-06-06 journal: semin pediatr surg doi: 10.1053/j.sempedsurg.2006.02.007 sha: doc_id: 322746 cord_uid: 28igib4l

lymphadenopathy refers to any disease process involving lymph nodes that are abnormal in size and consistency. lymphadenitis specifically refers to lymphadenopathies that are caused by inflammatory processes. cervical lymphadenopathy is a common problem in the pediatric age group and is largely inflammatory and infectious in etiology. although most patients are treated successfully by their primary care physician, surgical consultation is frequently required for patients who fail to respond to initial therapy or for those in whom there is an index of suspicion for a neoplastic process. this article addresses current approaches to the diagnosis and management of cervical lymphadenitis in children. although lymph nodes are located throughout the lymphatic system, they are concentrated in certain areas of the body, including the head and neck. because infectious processes involving the oropharyngeal structures are common in children, cervical lymphadenitis is also common in this age group. lymphatic drainage follows well-defined patterns. 
as such, the location of the enlarged lymph node is a good indication of the likely site of entry of the inciting organism (figure 1). involvement of superficial or deep cervical lymph nodes is also frequently indicative of the site of entry since superficial nodal enlargement usually reflects invasion through an epithelial surface (eg, buccal mucosa, skin, scalp), whereas deep nodal enlargement results from an infectious process involving more central structures (eg, middle ear, posterior pharynx). lymph nodes contain t- and b-lymphocytes as well as antigen-presenting macrophages (dendritic cells). tissue lymph enters the lymph node via one or more afferent vessels and percolates through a series of reticuloendothelial-lined channels that coalesce and drain through an efferent lymphatic vessel. particulate matter is phagocytosed by macrophages lining the lymphatic channels. once phagocytized, foreign proteins become bound to major histocompatibility (mhc) antigens and are presented on the surface of macrophages. foreign proteins bound to mhc class ii molecules on the surface of dendritic cells, in combination with other cell surface receptors and secreted cellular signals (interleukins), are required for activation of t-helper lymphocytes. these lymphocytes can in turn activate naïve b-lymphocytes. alternatively, memory b-lymphocytes may be directly activated by dendritic cells. once activated, b- and t-lymphocytes proliferate to create a pool of lymphocytes that have the ability to recognize and bind the inciting foreign protein. in addition, activated t-lymphocytes and macrophages release cellular signals (cytokines) that induce leukocyte chemotaxis and increase vascular permeability. the symptoms associated with acute cervical lymphadenitis reflect these pathophysiologic events. nodal enlargement occurs as a result of cellular hyperplasia, leukocyte infiltration, and tissue edema. 
vasodilation and capillary leak in response to locally released cytokines cause erythema and edema of the overlying skin, and tenderness results from distention of the nodal capsule. a thorough history and complete physical examination often suggest the probable cause of cervical lymphadenitis. consideration of whether symptoms and presentation are acute, subacute, or chronic is often helpful in establishing a differential diagnosis. clearly, the definitions of these categories are arbitrary, and many infectious processes are associated with symptom duration that fits into more than one category. in general, however, acute lymphadenitis, which is less than 2 weeks in duration, is due to either a viral or bacterial invasion. chronic lymphadenopathy is more likely to be due to a neoplastic process or invasion by an opportunistic organism. subacute lymphadenitis, which is between 2 and 6 weeks in duration, encompasses a much broader group of potential etiologies. in practice, surgeons seldom are involved in the care of patients with acute lymphadenitis unless the lymph nodes become suppurative. most of these patients improve during a course of antibiotic therapy prescribed by their primary care physician. other important clinical information to obtain includes the location (single or multiple sites) and progress of neck swelling (increasing, stable, or decreasing) and the presence of systemic symptoms (eg, fever, malaise, anorexia, weight loss, or arthralgias). more specific symptoms include skin changes and pain in the region of the nodal swelling, as well as at more distant sites. a history of recent upper respiratory tract symptoms, sore throat, ear pain, toothache, insect bites, superficial lacerations or rashes, and exposure to animals may suggest possible etiologies. in addition, a history of recent travel, exposure to individuals that are ill, and immunization status should be sought. 
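The duration-based triage described above can be expressed as a simple rule. The day counts encode the 2- and 6-week boundaries from the text; the article itself stresses that these cut-offs are arbitrary and that many processes overlap categories.

```python
def triage_by_duration(days: int) -> str:
    """Rough triage of cervical lymphadenitis by symptom duration,
    following the (admittedly arbitrary) cut-offs in the text:
    acute < 2 weeks, subacute 2-6 weeks, chronic > 6 weeks."""
    if days < 14:
        return "acute"      # usually viral or bacterial invasion
    if days <= 42:
        return "subacute"   # much broader group of potential etiologies
    return "chronic"        # consider neoplasm or opportunistic organisms
```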
finally, patient age is another important consideration, since lymphadenopathy in young children is overwhelmingly due to infectious etiologies, whereas adenopathy due to neoplasia increases in the adolescent age group. findings on physical examination may also suggest an etiology. cervical lymph nodes are frequently palpable in children; however, lymph nodes larger than 10 mm in diameter are considered abnormal. as noted previously, the location of involved nodes may indicate a potential site of entry and should prompt a detailed examination of that site. erythema, tenderness, and fluctuance suggest an acute process, most likely attributable to a bacterial invasion. involvement of bilateral cervical lymph nodes suggests a viral origin. the characteristics of the nodes are also important. nodes involved in neoplastic processes frequently are firm and fixed, whereas those due to infectious agents tend to be softer in consistency and often slightly mobile. other physical abnormalities, including respiratory findings, skin lesions, hepatosplenomegaly, and adenopathy in other parts of the body may also suggest an etiology. finally, it is important to keep in mind that not all swellings in the neck represent enlarged lymph nodes and that congenital and acquired cysts and soft tissue lesions also present as neck masses. often the nonnodal nature of these masses is suggested by the history or by the findings on physical examination. in equivocal cases, however, diagnostic imaging almost always reveals whether a particular swelling is due to nodal enlargement or to a cyst or soft tissue mass. laboratory tests are seldom required as part of the workup for acute cervical lymphadenitis. leukocyte counts and markers of inflammation (c-reactive protein and erythrocyte sedimentation rate) are usually abnormal but nonspecific. 
although a left shift (ie, increased percentage of immature white cells) on the leukocyte differential count suggests a bacterial etiology, this etiology frequently is suggested by the clinical presentation alone. any material that has been aspirated due to fluctuance should be sent for culture and sensitivity. these cultures may show an organism that is resistant to prior antibiotic therapy, but occasionally they are negative due to eradication of the infectious agent by a prior course of antibiotics. blood cultures should be obtained in any patient that appears toxic. cultures of other sites that appear to be the primary site of the infection (eg, pharynx) should also be obtained, although results from pharyngeal cultures may not correlate with organisms isolated from a nodal abscess. 1 in contrast, laboratory evaluation plays a crucial role in determining the etiology of subacute, chronic, and generalized lymphadenopathy. serologic tests for bartonella henselae, syphilis (vdrl), toxoplasmosis, cytomegalovirus (cmv), epstein-barr virus (ebv), tularemia, brucellosis, histoplasmosis, and coccidiomycosis may suggest an infectious agent. a strongly positive intradermal tuberculin skin test is consistent with an infection due to mycobacterium tuberculosis, whereas a lesser reaction to tuberculin skin testing is more consistent with a nontuberculous mycobacterial infection. finally, serologic testing for human immunodeficiency virus (hiv) should be considered in any patient with at-risk behaviors, generalized lymphadenitis, and unusual or recurrent infections caused by opportunistic organisms. figure 2 presents a suggested algorithm for the diagnostic evaluation of a child with cervical lymphadenitis. 2 plain radiographs are seldom necessary in patients with acute cervical lymphadenitis, but may occasionally document the primary site of an infection (eg, pneumonia, sinusitis, or dental caries). 
plain radiographs are more valuable in the child with chronic or generalized adenopathy. plain radiographs of the chest may suggest involvement of mediastinal lymph nodes or the lungs and are indicated in all patients with respiratory symptoms. chest radiographs with two views should also be obtained in any patient with either symptomatic or asymptomatic cervical adenopathy. this is done to rule out critical airway compression if a biopsy under general anesthesia is planned. other findings on plain radiographs may include bony lesions consistent with osteomyelitis or tumor involvement, evidence of hepatic and/or splenic enlargement, and/or calcifications involving the liver or spleen, suggesting a chronic granulomatous infection. in routine practice, however, plain radiographs of anatomic regions other than the chest are seldom required. ultrasonography (us) is the most frequently obtained and the most useful diagnostic imaging study. high-resolution us is used to assess nodal morphology, longitudinal and transverse diameter, and internal architecture. doppler us is used to assess the presence of perfusion and its distribution, as well as to obtain measures of vascular resistance. advantages of us are that it is noninvasive and avoids ionizing radiation and can be performed without sedation in almost every patient. additionally, serial us can be performed to follow nodal diameters and architecture over time. one potential drawback of us, however, is its lack of absolute specificity and sensitivity in ruling out neoplastic processes as the cause of nodal enlargement. thus, findings that are interpreted as being consistent with an infectious etiology might result in a false sense of security and delay diagnostic biopsy. us in the acute setting is primarily of value in assessing whether a cervical swelling is nodal in origin or is attributable to an infected cyst or other soft tissue mass. 
also, it may detect an abscess that is not already apparent on physical examination and that requires drainage. in patients with subacute or chronic adenopathy, us is often used in an attempt to determine whether nodal enlargement is neoplastic or infectious in origin. findings on gray-scale us shown to be consistent with reactive lymphadenopathy include a long- to short-axis ratio of greater than 2.0 (ie, oval shape), central irregular hyperechogenicity, blurred margins, and central necrosis. 3 findings on color doppler examination reported to be consistent with reactive lymphadenopathy include hilar vascularity 4 and a low pulsatility index. 5 however, neither of these features, alone or in combination, has been shown to consistently distinguish between benign and malignant etiologies. 6, 7 thus, although suspicious us findings may be useful in indicating the need for biopsy, us should not be considered as a definitive means to rule out neoplasia in patients with persistent lymphadenopathy. cross-sectional diagnostic imaging techniques such as computed tomography (ct) and magnetic resonance imaging (mri) are of little value in managing most patients with cervical lymphadenitis, but may provide a useful roadmap in patients undergoing nodal excision with suspected atypical mycobacterial lymphadenitis. these studies certainly are indicated in patients with a biopsy-verified diagnosis of neoplasia. treatment varies depending on the cause and presentation of cervical lymphadenitis. as such, treatment options will be considered within the framework of specific etiologic agents. most cases of cervical adenitis in children are associated with viral infections. 8 acute viral-associated cervical lymphadenitis typically develops following an upper respiratory tract infection. involved nodes are usually bilateral, multiple, and relatively small, without warmth or erythema of the overlying skin. 
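The gray-scale shape criterion mentioned earlier (long- to short-axis ratio greater than 2.0) reduces to a one-line check. As the text cautions, this is suggestive only and must not be used to rule out malignancy.

```python
def oval_shape_suggests_reactive(long_axis_mm: float,
                                 short_axis_mm: float) -> bool:
    """Long- to short-axis ratio > 2.0 (oval node) is reported as
    consistent with reactive lymphadenopathy. Not definitive: the text
    notes this does not reliably separate benign from malignant nodes."""
    return long_axis_mm / short_axis_mm > 2.0
```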
virally induced adenopathy rarely suppurates and generally resolves spontaneously over a short period of time. many cases of cervical adenopathy associated with viral illnesses are due to reactive hyperplasia. causes of the associated upper respiratory tract infection include rhinovirus, parainfluenza virus, influenza virus, respiratory syncytial virus, coronavirus, adenovirus, and reovirus. 9 other common viral etiologies include cmv and ebv. less frequent etiologies include mumps, measles, rubella, varicella, herpes simplex, human herpesvirus 6 (roseola), and coxsackie viruses. 10 acute viral lymphadenitis is variably associated with fever, conjunctivitis, pharyngitis, and other upper respiratory tract symptoms. rashes and hepatosplenomegaly may also be present, particularly when cmv is the causative organism. in some cases (eg, rubella), lymphadenopathy precedes the onset of a diagnostic rash. both anterior and posterior cervical lymph nodes are frequently involved when associated with pharyngitis or tonsillitis, whereas preauricular adenitis occurs in 90% of patients with adenoviral-associated keratoconjunctivitis. 11 bilateral, acute cervical lymphadenitis associated with a viral upper respiratory tract infection rarely requires additional diagnostic testing or specific therapy. adenopathy typically resolves spontaneously as the viral illness wanes. treatment is directed at relieving symptoms associated with the viral illness. specific antiviral therapy is seldom indicated except in the rare patient with severe respiratory tract or hepatic involvement, or in the immunocompromised patient. large (>2-3 cm) solitary, tender, unilateral cervical lymph nodes that rapidly enlarge in the preschool age child are commonly due to bacterial infection. the most commonly involved lymph nodes in decreasing order of frequency are the submandibular, upper cervical, submental, occipital, and lower cervical nodes. 
10 forty percent to 80% of cases of acute unilateral cervical lymphadenitis in the 1- to 4-year-old child are due to staphylococcus aureus or streptococcus pyogenes. 10 group b streptococcal adenitis may present in the infant with unilateral facial or submandibular swelling, erythema, and tenderness, associated with fever, poor feeding, and irritability. anaerobic bacteria are encountered in the older child with dental caries or periodontal disease. isolated anaerobes include bacteroides sp, peptococcus sp, peptostreptococcus sp, propionibacterium acnes, and fusobacterium nucleatum. 12 less frequent etiologies of acute bacterial lymphadenitis include francisella tularensis, pasteurella multocida, yersinia pestis, and haemophilus influenzae type b, whereas other organisms, such as gram-negative bacilli, streptococcus pneumoniae, group c streptococci, yersinia enterocolitica, staphylococcus epidermidis, and α-hemolytic streptococci, are rarely encountered. 11 patients typically present with a history of fever, sore throat, earache, or cough, and physical findings include pharyngitis, tonsillitis, acute otitis media, or impetigo. lymphadenitis due to s. pyogenes should be suspected if the patient presents with the typical vesicular, pustular, or crusted lesions of impetigo involving the face or scalp. as noted previously, cervical lymphadenitis due to anaerobic infections frequently is associated with dental caries or periodontal disease. acute cervical adenitis due to pasteurella multocida can occur following animal bites or scratches on the head, neck, or upper chest, whereas acute cervical lymphadenitis due to yersinia pestis is associated with flea bites on the head and neck and is most commonly seen in the western united states. initial antibiotic therapy is directed at the most likely organisms. 
because staphylococci and streptococci are the most common pathogens, initial therapy usually includes a β-lactamase resistant antibiotic; this agent is used because of the high incidence of penicillin resistance in isolated staphylococci. very young patients or patients with severe symptoms (eg, cellulitis, high fever, or respiratory distress) may require hospitalization for initiation of parenteral antibiotic therapy and close observation. for older patients with dental or periodontal disease, the antibiotic regimen should include coverage for anaerobic oral flora (ie, penicillin v or clindamycin). therapy is usually administered for 10 days and continued for at least 5 days beyond resolution of acute signs and symptoms. if a primary site is identified, cultures should be obtained and treatment is directed at that site as well. in most cases, symptomatic improvement should be noted after 2 to 3 days of therapy, although complete resolution of nodal enlargement may require several weeks. failure to improve, or worsening of the patient's clinical condition, should prompt further diagnostic evaluation, including aspiration and culture, and consideration of an alternate antibiotic regimen. an etiologic agent can be recovered by needle aspiration of an affected node in 60% to 88% of cases. 2 the largest node with the most direct access is typically the best target for aspiration. the node should be entered through an area of healthy skin. aspirated material should be examined by gram stain and acid-fast stain and cultured for aerobic and anaerobic bacteria and mycobacteria. if no purulent material is aspirated, a small amount of nonbacteriostatic saline can be injected into the node and then aspirated to obtain material for culture. fluctuance develops in 25% of patients with acute bacterial adenitis. in many cases, it can be managed effectively with antibiotics and one or more needle aspirations under local anesthesia, with or without sedation. 
this approach is particularly attractive when treating fluctuant nodes in cosmetically important areas. however, adequate drainage by aspiration may be difficult, if not impossible, in the uncooperative child or when the abscess cavity is loculated. these patients often require operative drainage under general anesthesia. at the time of operative drainage, an attempt should be made to open and drain all loculations. specimens should be sent for gram stain and aerobic and anaerobic cultures and for acid-fast stains and mycobacterial culture. material for koh prep and fungal cultures should be sent if the patient is immunocompromised, and tissue should be sent for histologic examination if there is suspicion of neoplasia. once drained, the abscess cavity is usually packed with a gauze strip to obtain hemostasis and to prevent early skin closure. the gauze packing can usually be removed over a period of several days on an outpatient basis. reports from multiple centers have documented an increasing frequency of community-acquired methicillin-resistant staphylococcus aureus (ca-mrsa) skin and soft tissue infections, including lymphadenitis. [13] [14] [15] [16] at present, the majority of isolates of staphylococcus aureus associated with cervical lymphadenitis in most centers are methicillinsensitive. however, given the documented increasing nasopharyngeal colonization by methicillin-resistant strains of staphylococcus aureus in healthy children, 17, 18 it is possible that the ca-mrsa will become the prevalent organism responsible for cervical lymphadenitis in the pediatric age group in the future. certainly, failure to respond to appropriate first-line antibiotic therapy should prompt consideration of expanding coverage to include methicillin-resistant strains of staphylococcus aureus. failure to resolve or improve despite a 2-to 4-week period of appropriate therapy, or the presence of generalized lymphadenopathy should prompt further diagnostic testing. 
a variety of organisms can result in generalized or persistent lymphadenopathy. a number of the more commonly encountered etiologies are described in the following sections. chronic cervical lymphadenitis may be caused by mycobacterium tuberculosis ("scrofula") or by nontuberculous strains of mycobacteria. in the united states, 70% to 95% of cases of mycobacterial lymphadenitis are due to nontuberculous strains. the most commonly encountered strains of nontuberculous mycobacteria include mycobacterium avium-intracellulare and mycobacterium scrofulaceum. less commonly encountered strains include m. kansasii, m. fortuitum, and m. hemophilum. nontuberculous lymphadenitis is most commonly seen in caucasians, whereas tuberculous lymphadenitis is more commonly encountered in asians, hispanics, and african-americans. it also occurs in immigrants from endemic areas and likely represents reactivation of prior disease. in general, the clinical presentation of tuberculous and nontuberculous lymphadenitis is similar. patients usually present with rapid onset of nodal enlargement, followed by a gradual increase in nodal size over 2 to 3 weeks. most nodes remain less than 3 cm in diameter. constitutional signs are unusual. the skin overlying the node typically develops a pink to lilac-red hue and becomes thin and parchment-like. approximately 50% of patients with nontuberculous lymphadenitis develop fluctuance and spontaneous drainage with sinus tract formation occurs in 10%. 11 epidemiologic and clinical features do not allow differentiation of tuberculous from nontuberculous lymphadenitis; however, fulfillment of two of three criteria has been shown to be associated with 92% sensitivity for the diagnosis of tuberculous lymphadenitis. 19 these criteria include: (1) a positive tuberculin skin test reaction, (2) an abnormal chest radiograph, and (3) contact with a person with infectious tuberculosis. 
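The two-of-three rule cited above (reported 92% sensitivity for tuberculous lymphadenitis) can be written directly as a decision function:

```python
def suggests_tuberculous(positive_tst: bool,
                         abnormal_cxr: bool,
                         tb_contact: bool) -> bool:
    """Fulfillment of at least two of the three criteria -- positive
    tuberculin skin test, abnormal chest radiograph, and contact with a
    person with infectious tuberculosis -- was reported as 92% sensitive
    for tuberculous (vs nontuberculous) mycobacterial lymphadenitis."""
    return sum([positive_tst, abnormal_cxr, tb_contact]) >= 2
```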
ppd skin tests may be positive in patients with nontuberculous infections, but are generally less reactive (<15 mm induration) as compared with the strongly positive reaction associated with m. tuberculosis infections. the treatment of choice for lymphadenitis caused by m. tuberculosis is multi-agent antituberculous antibiotic therapy for 12 to 18 months. nodal regression typically occurs within 3 months. because of the effectiveness of antituberculous agents, the surgical excision of draining nodes and sinuses is infrequently required. in contrast, most strains of nontuberculous mycobacteria respond poorly to antituberculous drugs, and the treatment of choice is surgical excision. in general, all clinically involved nodes, associated sinus tracts, and grossly involved overlying skin should be excised en masse. care should be taken to avoid injury to adjacent structures. in patients in whom complete excision would result in unacceptable cosmetic outcomes or injury to adjacent nerves, thorough curettage may be effective. the role of multi-agent antituberculous drug therapy in patients with nontuberculous lymphadenitis is unclear. however, if chemotherapy is planned, appropriate samples should be sent to a qualified laboratory for drug susceptibility testing. cat scratch disease is a lymphocutaneous syndrome characterized by regional lymphadenitis associated with a characteristic skin lesion at the site of inoculation. cat scratch disease follows inoculation of bartonella henselae through broken skin or mucous membranes. a skin papule typically develops at the site of inoculation, followed by regional adenopathy 5 days to 2 months later. unfortunately, often the primary site of involvement has resolved by the time adenopathy is noted. the most common sites of lymphadenopathy are the axilla (52%) and the neck (28%). patients typically present with a single large (>4 cm) tender node. 
constitutional symptoms are usually mild and include low-grade fever, body aches, malaise, or anorexia. suppuration occurs in 30% to 50% of cases. in most patients, the diagnosis can be confirmed by serologic testing. cat scratch disease is usually self-limited. in most cases, nodal enlargement resolves spontaneously after 1 to 3 months. as such, the benefit of antibiotic therapy is controversial. azithromycin has, nevertheless, been shown to be associated with more rapid resolution of nodal enlargement. 20 aspiration of suppurative nodes may provide symptomatic relief. rarely, systemic involvement develops and may include encephalitis, granulomatous hepatitis, hepatosplenic infection, endocarditis, and osteomyelitis. effective antibiotic options for patients with systemic involvement may include rifampin, ciprofloxacin, gentamicin, trimethoprim and sulfamethoxazole, clarithromycin, or azithromycin. infections caused by nocardia sp. are infrequent in children and usually present as lung disease in immunocompromised hosts. these organisms are found in the soil or decaying vegetable matter, and infection in humans occurs via inhalation or direct skin inoculation. skin inoculation usually results in an associated skin pustule, and the diagnosis can sometimes be established by culture of the pustule or an involved lymph node. sulfonamides are the treatment of choice. actinomyces species are part of the normal oral flora in human beings. local invasion results in cervicofacial actinomycosis, presenting as brawny induration with secondary nodal involvement. diagnosis is usually made by biopsy. the organism may be difficult to isolate, though histologic examination may reveal "sulfur granules." treatment usually requires initial parenteral antibiotic therapy followed by a prolonged course of oral antibiotics for 3 to 12 months. penicillin is the antibiotic of choice. 
approximately 10% of patients with acquired infections due to the intracellular protozoan toxoplasma gondii (toxoplasmosis) present with cervical, suboccipital, supraclavicular, axillary, or inguinal adenopathy. infections associated with cervical adenopathy are usually acquired via the oral route by consumption of meat or milk containing cysts or oocysts. involved nodes are usually discrete and may be tender, but do not suppurate. the diagnosis can be made by isolation of the organism or by serologic testing. patients with lymphadenopathy alone do not require antimicrobial therapy, but patients with severe or persistent symptoms are treated with a combination of pyrimethamine, sulfadiazine, and leucovorin for at least 4 to 6 weeks. histoplasmosis, blastomycosis, and coccidiomycosis are fungal infections caused by histoplasma capsulatum, blastomyces dermatitidis, and coccidioides immitis, respectively. these organisms are soil saprophytes that have the ability to exist in a yeast form in human tissues. the diseases are endemic to certain geographic regions of the united states. most patients present with pulmonary infections, and lymphadenopathy is usually secondary to the primary pulmonary involvement. the diagnosis can be established by serologic or skin testing. most infections resolve spontaneously and do not require treatment. patients with severe respiratory or systemic symptoms may, however, require prolonged courses of antifungal therapy. hiv is a retrovirus that is transmitted by sexual contact, parenteral exposure to blood, or vertical transmission from mother to child. initial symptoms may be subtle and may include lymphadenopathy, hepatosplenomegaly, failure to thrive, and chronic or recurrent diarrhea. the diagnosis is established by serologic testing. in some cases, cervical lymph node involvement is a manifestation of a systemic disease with an inflammatory component. 
the following is a brief, though not inclusive, description of several of these conditions: kikuchi-fujimoto disease (histiocytic necrotizing lymphadenitis) is a rare entity of unknown etiology. it typically presents in older children with bilateral, enlarged, firm, painful, cervical lymph nodes (usually in the posterior cervical triangle). associated findings include skin lesions, fever, nausea, weight loss, night sweats, and splenomegaly. laboratory evaluation often reveals leukopenia with atypical lymphocytosis and an elevated erythrocyte sedimentation rate. perinodal inflammation is common. nodal histology is characteristic, and most cases resolve spontaneously. kawasaki disease is an acute febrile vasculitis of childhood of unknown etiology. lymphadenitis is often one of the earliest manifestations of the disease. involved nodes are usually unilateral, confined to the anterior triangle, greater than 1.5 cm in diameter, and only moderately tender and nonfluctuant. the diagnosis is made clinically, based on the presence of a fever for at least 5 days accompanied by several other characteristic clinical features of the disease. resolution of the cervical lymphadenopathy usually occurs early in the course of the disease. pfapa syndrome (periodic fever, aphthous stomatitis, pharyngitis, and cervical adenitis) usually affects children younger than 5 years of age and is of unknown etiology. it is characterized by cyclic recurrences of this symptom complex every 2 to 9 weeks, with spontaneous resolution after 4 to 5 days. recurrences gradually abate with time; however, systemic corticosteroids may help relieve severe symptoms. rosai-dorfman disease (sinus histiocytosis with massive lymphadenopathy) is a rare disorder that typically manifests in the first decade of life, predominantly in african-americans. cervical lymph nodes are commonly the initial site of involvement and are usually mobile, discrete and asymmetric.
progression leads to massive bilateral cervical nodal enlargement and involvement of other nodal groups or extranodal sites. laboratory evaluation reveals leukocytosis, neutrophilia, an elevated erythrocyte sedimentation rate, and hypergammaglobulinemia. histopathologic analysis shows florid hyperplasia, marked histiocytosis and plasmacytosis. resolution usually occurs after 6 to 9 months. extensive or progressive disease may, however, require treatment with combination chemotherapy [21]. sarcoidosis is a chronic granulomatous disease of unknown etiology. the disease may affect almost any organ in the body, but the lung is most frequently affected. the most common physical finding in children with this disease is peripheral lymphadenopathy. involved cervical nodes are usually bilateral, discrete, firm, and rubbery. supraclavicular nodes become involved in more than 80% of patients. biopsy with histologic examination is the most valuable diagnostic test. treatment is supportive. corticosteroid therapy may suppress acute manifestations. cervical lymphadenopathy in the pediatric age group is largely inflammatory and infectious in etiology, although in some patients it may be related to neoplastic disease. it is important for the surgeon to be aware of the clinical manifestations and specific etiologies of this condition, as well as the diagnostic approaches and therapeutic options currently available. close follow-up is required to monitor the need for either additional diagnostic tests or biopsy should a patient fail to respond to appropriate initial therapy.
references:
1. acute neck infections in children
2. pediatric infectious diseases: principles and practice
3. high-resolution and color doppler ultrasonography of cervical lymphadenopathy in children
4. an overview of neck sonography
5. accuracy of sonographic vascular features in differentiating different causes of cervical lymphadenopathy
6. lymph node hilus: gray scale and power doppler sonography of cervical nodes
7. a scoring system for ultrasonographic differentiation between cervical malignant lymphoma and benign lymphadenitis
8. cervical lymphadenopathy and adenitis
9. childhood cervical lymphadenopathy
10. lymphadenopathy in children
11. cervical lymphadenitis and neck infections
12. cervical lymphadenitis in infants and children
13. epidemic of community-acquired methicillin-resistant staphylococcus aureus infections: a 14-year study at driscoll children's hospital
14. impact of community-associated, methicillin-resistant staphylococcus aureus on management of skin and soft tissue infections in children
15. community-associated methicillin-resistant staphylococcus aureus in pediatric patients
16. clindamycin treatment of invasive infections caused by community-acquired, methicillin-resistant and methicillin-susceptible staphylococcus aureus in children
17. staphylococcus aureus and mrsa nasal carriage in general population
18. increasing rates of nasal carriage of methicillin-resistant staphylococcus aureus in healthy children
19. mycobacterial cervical lymphadenitis in children: clinical and laboratory factors of importance for differential diagnosis
20. prospective randomized double blind placebo-controlled evaluation of azithromycin for treatment of cat-scratch disease
21. rosai-dorfman disease: successful long-term results by combination chemotherapy with prednisone, 6-mercaptopurine, methotrexate and vinblastine: a case report

key: cord-355393-ot7hztyk
authors: yuan, peiyan; tang, shaojie
title: community-based immunization in opportunistic social networks
date: 2015-02-15
journal: physica a: statistical mechanics and its applications
doi:
10.1016/j.physa.2014.10.087
sha: doc_id: 355393 cord_uid: ot7hztyk
abstract: immunizing important nodes has been shown to be an effective solution to suppress epidemic spreading. most studies focus on the globally important nodes in a network, but neglect the locally important nodes in different communities. we claim that, given the temporal community feature of opportunistic social networks (osn), this strategy gives a biased understanding of the epidemic dynamics, leading us to conjecture that it is not “the more central, the better” for the implementation of a control strategy. in this paper, we track the evolution of community structure and study the effect of a community-based immunization strategy on epidemic spreading. we first break the osn traces down into different communities, and find that the community structure helps to delay the outbreak of an epidemic. we then evaluate the local importance of nodes in communities, and show that immunizing nodes with high local importance can remarkably suppress the epidemic. more interestingly, we find that high-local-importance but non-central nodes play a big role in the epidemic spreading process: removing them improves the immunization efficiency by 25% to 150% in different scenarios. with the rapid proliferation of a new generation of smart devices, it is possible and necessary to disseminate content in networks by exploiting human mobility and intermittent device-to-device contacts. networks with such intermittent device-to-device contacts are generally called delay-tolerant [1] or opportunistic [2]. to date, a variety of opportunistic social networks (osn) have been studied, such as pocket switched networks [3], publish/subscribe systems [4, 5], and human contact networks [6].
many interesting phenomena have been observed, including the heavy-tailed distribution of contact times and node degree [7, 8], the small world phenomenon [9], the dynamics of epidemic spreading [10, 11], the high clustering of aggregated social contact statistics [12], etc. the social contact feature makes nodes in osn vulnerable to infection. therefore, when an infectious disease appears in a population (e.g., the severe acute respiratory syndrome (sars) or the h7n9 virus), designing effective immunization strategies becomes very important. to achieve this goal, several immunization strategies have been developed recently, ranging from ring immunization [13] to targeted immunization [14-18]. the targeted strategy has been shown to be more effective than the ring strategy at delaying the outbreak of an epidemic. the basic idea of targeted immunization is to first rank nodes by importance and then remove them, from highest importance to lowest, to observe their impact on the epidemic spreading speed. the importance of a node is generally measured by its degree, closeness or betweenness centrality in the network [19]. to further improve the immunization efficiency, the authors of ref. [20] suggested that node importance should be recalculated after every step of node removal. previous work mainly concentrates on global measures of nodes in a network, while ignoring the local importance of nodes in different groups. a number of recent studies indicate that opportunistic networks are highly clustered [10, 21] and show a temporal community structure [22]. network properties at the level of communities are quite different from properties at the entire-network level [23]. studies that pay attention only to the whole network topology while neglecting the temporal community structure may therefore miss many interesting features.
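the ranking-and-removal loop of targeted immunization described above can be sketched in a few lines. this is a minimal sketch, not the exact procedure of the cited works: the dict-of-sets graph representation, the choice of degree as the centrality measure, and all names are illustrative assumptions.

```python
# targeted immunization sketch: rank nodes by a centrality measure
# (here, plain degree) and remove the highest-ranked nodes first.

def degree_ranking(adj):
    """Return nodes sorted from highest to lowest degree."""
    return sorted(adj, key=lambda u: len(adj[u]), reverse=True)

def immunize_top_k(adj, k):
    """Remove (immunize) the k most-connected nodes from the contact graph."""
    removed = set(degree_ranking(adj)[:k])
    return {u: vs - removed for u, vs in adj.items() if u not in removed}

# toy contact graph: 'hub' touches everyone, the others only touch the hub
contacts = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}
after = immunize_top_k(contacts, 1)
```

removing the single hub disconnects this toy graph entirely, which is the intuition behind removing central nodes first; the paper's point, developed below, is that globally central nodes are not always the right targets.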
for example, in the real world it is found that people have different average numbers of contacts in different social cliques. the same person may be sociable in one clique, having many contacts with others, while in another clique he/she may be more taciturn. such contact behaviors have also been seen in opportunistic social networks, where people contact their family or friends frequently, while they meet strangers only accidentally [24]. if one tried to characterize such a network by the mean number of contacts a person has, one would miss essential features of the network, such as the dynamics of the epidemic, resulting in a biased understanding of the epidemic spreading process. as shown in fig. 1, suppose an infectious disease occurs in a community c1, and, unfortunately, alice is infected. in order to delay the epidemic spreading, bob should be protected first, because of his close contact with alice and other people. in this situation, although carl has a high social status in the whole network, the infectious disease would break out in the community and then spread to the entire network if he were immunized first. to this end, we investigate the evolution of community structure in opportunistic social networks, and analyze the effect of a community-based immunization strategy on epidemic spreading. we make the following contributions. we find that the spreading speed of an epidemic within one community is faster than that across different communities. this result is encouraging, as it indicates that the outbreak of an epidemic could be delayed if one could further break the osn traces down into many small communities by removing some special nodes. we observe that the most efficient immunization strategy against epidemic spreading is to remove nodes with high local importance in communities.
more interestingly, we find that high-local-importance but non-central nodes contribute more to the epidemic spreading process, leading us to conjecture that it is not ''the more central, the better'' for the implementation of a control strategy. we study the role of nodes' local importance in epidemic spreading, experimentally and analytically. we do so not only by excluding nodes with high local importance from the osn traces but also by developing an analytic model which formally characterizes the relationship between nodes' local importance and community cohesion. we find that community cohesion depends heavily on the local importance of nodes. removing locally important nodes sharply lowers the community cohesion, and thus helps to suppress the epidemic. we organize the remainder of this paper as follows. section 2 reviews related work. section 3 introduces the osn traces and the network model. the next two sections present our solutions to evaluate nodes' local importance and to cluster nodes, respectively. we analyze the effect of community-based immunization on epidemic spreading in section 6. finally, in section 7 we conclude the paper. immunization strategies can in general be classified into two categories: the ring strategy [13] and the targeted strategy [14-18]. the targeted immunization strategy, in which nodes playing an important role in the network are removed first, has been shown to be very effective [16]. pastor-satorras et al. [14] first studied the targeted immunization strategy and found that immunizing nodes with strong connectivity is more effective than immunizing randomly selected nodes. holme et al. considered a more challenging scenario, where only the neighborhood information of nodes can be used. they observed that the most efficient strategy is to iteratively immunize the neighbors of nodes with large out-degree [15]. subsequently, zhang et al.
[17] developed a more precise model by taking the immunization willingness of individuals into account. in addition, schneider [18] tested the role of nodes' betweenness centrality in epidemic spreading, and obtained a similar result. recently, the concept of node importance has been introduced to the fields of routing and message diffusion in opportunistic social networks. for example, the authors of refs. [25, 12, 26] exploited node importance to make routing decisions. they found that forwarding messages to nodes with high centrality could increase the message delivery ratio. similarly, the authors of refs. [21, 22] observed that nodes wandering from one community to another (such nodes thus have a high contact frequency with others in the network) contributed more to message diffusion. all of the above works focus on the central nodes in a network; they evaluated node importance either by a logical centrality metric or by the physical contact behavior among nodes. our work supplements these previous results by exploring the role of nodes' local importance; node importance has also been exploited for publish/subscribe [4, 5], multicasting [27] and location of the rumor source [28]. we use real and synthetic traces in this study; both have their own advantages and can be complementary to each other. the former helps to observe real epidemic spreading behavior; the latter provides an opportunity to study the dynamics of epidemic spreading at a large scale. the details of each kind of trace are discussed below. real trace (ncsu). the ncsu trace [29] was collected by twenty students who live in a campus dormitory. every week, these students carried garmin gps 60csx handheld receivers, which are waas (wide area augmentation system) capable with a position accuracy of better than three meters 95% of the time, during their daily regular activities, and altogether thirty-five trajectories were gathered from 2006-08-26 to 2006-11-16. the gps receivers perform a device discovery every 10 s to record their current positions.
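the trace post-processing just described, where sampled positions are turned into pairwise contacts, can be sketched as follows. the function names and the planar coordinate format are assumptions; the distance threshold is a parameter (the text adopts 150 m as a realistic wifi range).

```python
import math
from itertools import combinations

def contacts_at(positions, radius=150.0):
    """Pairs of nodes whose planar distance is below `radius` meters
    at one position snapshot (an assumed wifi-range contact rule)."""
    pairs = []
    for (u, pu), (v, pv) in combinations(sorted(positions.items()), 2):
        if math.dist(pu, pv) < radius:
            pairs.append((u, v))
    return pairs

# one snapshot of three students' positions (meters):
# s1 and s2 are 100 m apart (a contact); s3 is out of range of both
snapshot = {"s1": (0.0, 0.0), "s2": (100.0, 0.0), "s3": (400.0, 0.0)}
```

running `contacts_at` over every snapshot of a trajectory yields the contact series that the graph model of the next section aggregates.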
to reduce gps errors, the authors of ref. [29] calibrate each position every 30 s by averaging three samples over the past thirty-second period. based on the position information in each trajectory, we assume that two nodes have a contact if their distance is less than 150 m, a realistic range for wifi transmissions. synthetic trace (slaw). considering the limited scalability of a real trace, we need a proper mobility model that can characterize the nature of human mobility, so as to investigate the dynamics of epidemic spreading both in width and in depth. although many random mobility models, such as random walk and random waypoint, have been widely used in opportunistic social networks for evaluating routing performance or even the epidemic dynamics [30, 31], they cannot reflect the main features of human mobility, including the truncated power-law flights and pause times, the heterogeneously bounded mobility areas of different nodes, etc. the recent work [32] proposed a new mobility model called slaw (self-similar least action walk) that can produce synthetic traces incorporating various features of human mobility. we therefore use it to produce synthetic human traces for our investigation of epidemic spreading. the simulation area is 2000 m × 2000 m, and the number of nodes varies from 100 to 500 in steps of 100. table 1 summarizes all slaw model parameters (for the detailed meanings of these parameters, please refer to ref. [32]). traditional methods to model an opportunistic social network mainly include the time-expanded graph and the binary graph. the time-expanded graph captures each snapshot of the original network; that is, new edges connecting a node and its copy are added at the next snapshot, with the edges that existed at the last snapshot retained. it hence incurs a scalability issue [33]. on the other hand, the binary graph can alleviate the storage overhead; it however neglects the duration of each contact and its decayed age.
this method only characterizes the time-varying network at a coarse-grained level [12]. considering these facts, we model an osn as a decayed aggregation graph dag = (v, e), where v denotes the set of nodes (|v| = n) and e denotes the set of edges. let w(t) = (w_uv(t))_{n×n} denote its adjacency matrix and n_uv(t) = {(on_i, off_i), i = 1, 2, ..., N} denote the contact series between nodes u and v in the interval [0, t], where the tuple (on_i, off_i) denotes the start moment and end moment of the ith contact, respectively, and N is the number of contacts. we formulate the contact strength between nodes (i.e., the value of w_uv(t)) as a decayed sum problem [34]. the decayed sum problem includes two components: the first one is the weighted function f(i), and the second one is the decayed function g(t − off_i), as shown in the following equation. definition 1 (decayed sum). given the contact series n_uv(t), the goal is to estimate the decayed sum at any current time t: w_uv(t) = Σ_{i=1}^{N} f(i) · g(t − off_i), (1) where f(i) = off_i − on_i denotes the ith contact duration (i.e., the weighted part of the decayed sum problem) and g(t − off_i) denotes the decayed part. we set g(t − off_i) = e^{−(t − off_i)}, as the inter-contact time between nodes generally obeys an exponential decay in osn [35]. hence, eq. (1) can be reformulated as w_uv(t) = Σ_{i=1}^{N} (off_i − on_i) · e^{−(t − off_i)}. (2) we next analyze the space complexity of the dag. exact tracking of w_uv(t) needs θ(N) storage bits. considering the scalability issue (in general, the number of contacts N is much larger than the number of nodes n), we should further reduce the storage overhead while keeping the same calculation precision. from theorem 1, each node only carries a single counter to exactly track the contact strength between itself and any other node, which forms the row vector w_u of the matrix w. we use this matrix to cluster nodes in the next section. this paper mainly concentrates on the local importance of nodes and its effect on epidemic spreading.
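eq. (2) can be evaluated either directly over the whole contact series or, as theorem 1 suggests, with a single exponentially decayed counter per neighbor. the sketch below (class and variable names are ours) decays the running sum to each new contact's end time, which reproduces the direct sum exactly.

```python
import math

def decayed_strength(contacts, t):
    """Direct evaluation of eq. (2): each contact contributes its
    duration, decayed exponentially by the time since it ended."""
    return sum((off - on) * math.exp(-(t - off)) for on, off in contacts)

class DecayedCounter:
    """Single counter per neighbor, in the spirit of theorem 1:
    decaying the running sum to each contact's end keeps it exact."""
    def __init__(self):
        self.w = 0.0         # current decayed sum
        self.last_off = 0.0  # end time of the most recent contact
    def observe(self, on, off):
        # decay the old sum from last_off to off, then add the new duration
        self.w = self.w * math.exp(-(off - self.last_off)) + (off - on)
        self.last_off = off
    def query(self, t):
        # decay from the last contact end up to the query time t
        return self.w * math.exp(-(t - self.last_off))

contacts_uv = [(0.0, 2.0), (5.0, 6.0)]  # (on_i, off_i) pairs
counter = DecayedCounter()
for on, off in contacts_uv:
    counter.observe(on, off)
```

the counter needs O(1) state per neighbor instead of storing the whole contact series, which is the storage reduction the text argues for.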
as discussed in sections 1 and 2, the local importance of a node reflects its social status in a community, rather than in the whole network. traditional solutions for evaluating node importance mainly include node degree, closeness and betweenness. none of them is applicable here, due to the unknown number of neighbors (for the degree measure) and the vulnerable end-to-end paths (for the closeness and betweenness measures) in opportunistic social networks. for example, if we used betweenness to measure node importance, we would first need to collect the shortest paths for each node pair in an offline way, and then count the number of times that each node appears in these shortest paths. furthermore, the three methods are biased towards a global measure of nodes. to deal with these issues, we use principal component analysis (pca) [36] to evaluate a node's local importance, which provides an online method for evaluating node importance. principal component analysis is a powerful tool to extract relevant information from a data set by filtering noise and redundant data. this relevant information reveals the hidden, simplified structures underlying the data set. we summarize the principle of pca as follows. suppose that a node u has built the matrix w (please refer to section 3.2), and that the matrix w has been centralized (i.e., the corresponding mean subtracted from each column). let c_w = w^T w/(n − 1) denote the covariance matrix of w. let us further diagonalize c_w as c_w = p λ p^T, where λ = diag(λ_1, λ_2, ..., λ_n) and p is a normalized orthogonal matrix. let x_i be the eigenvectors of c_w and λ_i the corresponding eigenvalues, with λ_1 ≥ λ_2 ≥ ··· ≥ λ_n. as fig. 2 (the spectral space of w and its vector representation) shows, each node u is represented by the row vector α_u = (α_u1, α_u2, ..., α_un).
table 2. the main notations used in the paper:
w_u: the row vector of matrix w
c_w: the covariance matrix of w
p_{k+1}: the noise components of w
p_k: the principal components of w
p: the eigenvector decomposition of c_w
w_k: the dimensionality reduction matrix of w
α_u: the distribution of node u in the n-dimensional spectral space
α_u^{k+1,n}: the noise distribution of α_u
the row vector α_u denotes the distribution of node u in the n-dimensional spectral space, and the column vector x_i = (α_1i, α_2i, ..., α_ni) denotes the coordinates of all of the nodes in the ith dimension of the spectral space. in addition, once we get the orthogonal matrix p, we generally select the top k-dimensional spectral space (x_1, x_2, ..., x_k) as the principal component of w, since the corresponding top k eigenvalues dominate the spectral graph features [37]. algorithm 1 describes the above computation process and table 2 lists the main notations used in the paper. mathematically, let λ_k denote diag(λ_1, λ_2, ..., λ_k) and let the matrix p_k = (x_1, x_2, ..., x_k). let α_u^+ represent (|α_u1|, |α_u2|, ..., |α_ui|, ..., |α_uk|), where |α_ui| denotes the absolute value of α_ui. we then have lemma 1. for a given decayed aggregation graph dag with k communities, the matrix p_k is the projection matrix, and the elements of the vector α_u^+ are the projected values of node u in such k communities. proof. let w_k denote the dimensionality reduction matrix of w and let c_{w_k} be the covariance matrix of w_k. based on the theory of pca, c_{w_k} should be diagonalized as well. on the other hand, from eq. (4) we can replace λ_k with eq. (5) and c_w with w^T w/(n − 1), respectively; multiplying both sides by (n − 1) and using the substitution of p^T, we conclude that the matrix p_k is the projection matrix. let c_i denote the principal direction of the ith eigenvector x_i.
based on the singular value decomposition (svd) of w_k [38], both c_i and x_i satisfy the svd relations; after some algebra, we obtain x_i = w_k^T c_i / λ_i. proof. from lemma 1, we know that the projection length of node u in community i is |α_ui|, and from spectral graph theory [37] it has been shown that the eigenvalue λ_i indicates the strength of community i in a graph. hence, we get l_u^i = λ_i |α_ui|. obviously, l_u^i is mathematically equivalent to |α_ui| if we ignore the factor λ_i (considering the fact that the local importance of each node in community i has the common factor λ_i). we specifically use the above equation to denote a node's local importance, mainly because the product of the two parts (|α_ui| and λ_i) has a special physical significance: it reflects the contribution of community i to node u's global importance g_u. with the above two equations, we evaluate the local and global importance of each node, and plot their empirical distributions in fig. 3, where the subgraphs correspond to the cases in which the local importance of nodes is measured with respect to the largest community (fig. 3(a)), the second (fig. 3(b)), the third (fig. 3(c)) and the fourth largest community (fig. 3(d)), respectively. we observe that, for most nodes, the global importance shows a strong correlation with the local importance. furthermore, the correlation decreases with decreasing community size, as small communities play a weak role in a graph. another interesting observation is that the two social metrics of nodes have different increasing rates, which results in some central nodes having a relatively low importance in a community, and vice versa. in section 6, we study their effects on epidemic spreading. cutting a graph into small clusters has been studied widely. we use k-means, one of the most well-known clustering algorithms [39], to detect the temporal community structure.
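before turning to clustering, the pca scoring described in this section can be sketched with numpy. the eigendecomposition and the per-community score λ_i · |α_ui| follow the text (theorem 2); the aggregation of these scores into a single global value is our assumption, since the source's exact formula for g_u is not reproduced here.

```python
import numpy as np

def local_importance(W, k):
    """PCA-based scores: center W, eigendecompose its covariance c_w,
    project nodes into the spectral space, and score node u in
    community i as lambda_i * |alpha_ui|."""
    Wc = W - W.mean(axis=0)                 # centralize each column
    C = Wc.T @ Wc / (W.shape[0] - 1)        # covariance matrix c_w
    lam, P = np.linalg.eigh(C)              # eigenpairs (symmetric matrix)
    order = np.argsort(lam)[::-1]           # sort eigenvalues descending
    lam, P = lam[order], P[:, order]
    alpha = Wc @ P                          # node coordinates alpha_u
    local = lam[:k] * np.abs(alpha[:, :k])  # l_u^i = lambda_i * |alpha_ui|
    # assumed aggregate: euclidean norm of the per-community scores
    glob = np.sqrt((local ** 2).sum(axis=1))
    return local, glob

# toy contact-strength matrix with two obvious communities {0,1,2}, {3,4,5}
W = np.zeros((6, 6))
W[:3, :3] = 5.0
W[3:, 3:] = 5.0
np.fill_diagonal(W, 0.0)
local, glob = local_importance(W, 2)
```

the `local` matrix gives each node a score per community, and the assumed `glob` norm is one way to collapse them into a single ranking for comparison with centrality-based schemes.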
the advantage of the k-means algorithm compared to other methods such as cnm [40] and k-clique [41] is that it does not need to know the neighbor relationship between nodes; it only requires the adjacency matrix of a weighted graph such as the dag, while cnm and k-clique are more appropriate for a binary graph. in addition, based on the pca technology discussed above, we can confidently determine the number of communities, the initial elements for each community and the termination condition, three issues that strongly affect the performance of k-means. we next discuss how to detect the community structure based on the refined k-means. (1) determining k, the number of communities: pca provides a roadmap to reduce a confusing data set to a lower dimension that retains the main features of the original data set. the rationale behind this is that the eigenvalues of a network play a big role in many important graph features. it has been shown that the maximum degree, the clique number, and even the randomness of a graph are all related to λ_1. in general, we select the top k eigenvectors to denote the main structures of the graph, where the value of k satisfies (Σ_{i=1}^{k} λ_i) / (Σ_{i=1}^{n} λ_i) ≥ r. in this paper, we set r = 0.85, an empirical value belonging to the default interval [0.7, 0.9] [37]. (2) excluding the noise nodes: in opportunistic social networks, there commonly exist some nodes with few contacts with other nodes. we call them noise nodes in this study. excluding the noise nodes has little effect on the epidemic spreading speed; furthermore, it helps to reduce the problem size and the algorithm complexity. we now discuss how to use pca to identify the noise nodes. pca divides a network into two different parts: (1) the principal components p_k, and (2) the opposite p_{k+1}, where p_{k+1} = (x_{k+1}, x_{k+2}, ..., x_n), as shown in fig. 2. we call the latter the noise components of the network. accordingly, we divide the row vector α_u into α_u^{1,k} = (α_u1, α_u2, ...
, α_uk) and α_u^{k+1,n} = (α_u,k+1, α_u,k+2, ..., α_un), the signal and the noise of node u. from the theory of pca, if node u is dominated by its noise components, that is, if α_u is dominated by α_u^{k+1,n}, we can exclude this node from the graph. we use the signal-to-noise ratio to identify which component dominates a node. definition 2 (signal-to-noise ratio). the snr is the ratio of the signal energy over that of the noise. from theorem 2, we know that node u's local importance relative to community i is λ_i |α_ui|, which is also the amplitude of node u's signal in the ith dimensional spectral space. hence, the signal energy e_u^signal of node u can be presented as e_u^signal = Σ_{i∈[1,k]} (λ_i |α_ui|)^2 = Σ_{i∈[1,k]} (λ_i α_ui)^2, and the noise strength e_u^noise is equal to Σ_{j∈[k+1,n]} (λ_j α_uj)^2. based on definition 2, we call node u a noise node if its snr_u satisfies snr_u < 1. (3) determining the initial elements for each community: after we have ascertained the number of communities and excluded the noise nodes, the next step is to determine the initial centroid m_i (i = 1, 2, ..., k) for each community c_i. we select the node u s.t. max |α_ui| (u = 1, 2, ..., n) for each eigenvector x_i as the initial node of community i, and set m_i = α_u. algorithm 2 describes this procedure (it scans all non-noise nodes and tracks the one with the maximum |α_ui|). (4) updating the centroids: m_i = (Σ_{u∈c_i} α_u)/n_i, (11) where n_i is the number of nodes belonging to c_i. k-means is characterized by minimizing the sum of squared errors; it has been shown that the standard iterative method for k-means suffers seriously from the local-minima problem, because of the greedy nature of the update strategy. fortunately, theorem 3 guarantees that the pca-based k-means is immune to this problem (ref. [42]): minimizing j is equivalent to maximizing trace(p^T c_w p) (please refer to eq. (19) of ref. [42]), and max trace(p^T c_w p) = λ_1 + λ_2 + ··· + λ_k.
in other words, the pca-based k-means reaches the optimal performance once we have clustered all of the non-noise nodes for the first time. (5) clustering nodes: for any node u, we compute the distance between itself and the centroid m_i, dist(α_u, m_i), and select i s.t. min dist(α_u, m_i) (i = 1, 2, ..., k) as the community node u belongs to, where dist(α_u, m_i) = θ(u, i) = arccos(α_u m_i^T / (∥α_u∥_2 ∥m_i∥_2)) and θ(u, i) denotes the angle between α_u and m_i. after node u joins community i, we update the centroid m_i by eq. (11) so as to select the next node. algorithm 3 describes the clustering procedure. 6. results and analysis. we define the temporal community structure as a series of snapshots of the communities underlying the traces. we take a snapshot of the communities every 120 s. compared to the five-hour duration of the experiment, the snapshot interval is chosen to be relatively small, so as to obtain a detailed view of the community evolution and to arrive at as unbiased an understanding of epidemic spreading as possible. fig. 4 plots the number of communities hidden behind the traces at different snapshots, where the term ''s(number)'' denotes the slaw trace with the given number of nodes. we observe that the topology is volatile over time. at ncsu, the number of communities varies from 4 to 10 with a mean of 7.3. the number of communities at s(100) varies between 8 and 16 with a mean of 11.7. table 4 summarizes the statistics of the community structure for all the traces. fig. 5 shows the sizes of the top 4 communities in the slaw trace with 500 nodes. we find that each community is not stable over time either. the size of the largest community varies between 13% and 35% during the experiment, and 11%-20% of the nodes belong to the second largest community. the third and fourth largest communities are much closer, varying from 9% to 17% and from 8% to 15%, respectively. in summary, the top 4 communities cover almost 60% of the nodes.
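the community snapshots analyzed here come from the pca-seeded k-means of section 5. a minimal single-pass sketch of its seeding and angular-assignment steps follows; the centroid re-update after each join is omitted for brevity, and all names are ours.

```python
import numpy as np

def angular_dist(a, b):
    """dist(alpha_u, m_i): the angle between two spectral vectors."""
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def pca_kmeans_assign(alpha):
    """Seed centroid i with the node maximizing |alpha_ui| (algorithm 2),
    then give every node to the angularly closest centroid (algorithm 3).
    A single pass is shown, per the optimality argument of theorem 3."""
    n, k = alpha.shape
    centroids = [alpha[np.argmax(np.abs(alpha[:, i]))] for i in range(k)]
    return [min(range(k), key=lambda i: angular_dist(alpha[u], centroids[i]))
            for u in range(n)]

# four nodes in a 2-community spectral space: two point mostly along
# dimension 1 and two along dimension 2
alpha = np.array([[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8]])
labels = pca_kmeans_assign(alpha)
```

the angular distance makes the assignment depend on the direction of a node's spectral coordinates rather than their magnitude, matching the arccos-based dist of step (5).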
other, smaller communities share the rest of the nodes. these results suggest that nodes in opportunistic social networks do not belong to a single, stable community. instead, the network is made of many temporal clusters. we analyze their effects on epidemic spreading in the next section. having shown that the temporal communities are built on nodes' social contacts, it is important to understand the role of these communities in epidemic spreading. to this end, we record the epidemic spreading time within one community and that across different communities. table 3 summarizes the results. we find that the spreading speed of an epidemic within one community is faster than that from one community to another. at ncsu, the mean spreading time is 19 min within one community and 45 min between communities. the slaw traces show similar phenomena, especially at s(300), where the inter-community spreading time is almost four times that of the intra-community spreading time. this is mainly because there exist many small communities in this scenario (as shown in table 4, the s(300) trace has the largest number of communities and its community structure changes dramatically). small communities are more effective at restraining the epidemic spreading than big ones. this result is encouraging, as it indicates that the outbreak of an epidemic could be delayed if one could break the osn traces down into many small communities by removing some special nodes. previous work has suggested that removing central nodes is an effective way to delay the epidemic outbreak. this conclusion is somewhat inconsistent with our aforementioned results, since some central nodes have relatively low importance in communities, and removing them does little damage to the community structure. therefore, we conjecture that it may be inefficient to suppress the epidemic spreading by removing central nodes. to validate this, we classify nodes into different categories according to their global and local importance.
specifically, we select as high local importance nodes (hot) the top 10% of nodes with the highest local importance with respect to a community, and call the rest low local importance nodes (lot). in addition, we define a central node as a node that belongs to the top 10% of nodes with the highest importance in the whole network; the remaining nodes are called non-central nodes (a similar selection/definition has been used in ref. [43]). our basic idea here is to understand the role of each kind of node in epidemic spreading. we first calculate the epidemic spreading time for each trace including all nodes. we then repeat the same experiment removing each of the four node categories in turn (we remove the same number of nodes for each category in order to make a fair comparison). fig. 6 presents the results. the y-axis denotes the immunization efficiency compared to the base case (the case including all nodes), i.e., a higher bar means higher efficiency. the first and counter-intuitive phenomenon is that hot nodes, instead of the central ones, play a big role in epidemic spreading (as shown in fig. 6(a)). compared to removing the central nodes, the immunization efficiency increases by 50% on average when the hot nodes are removed. more interestingly, we observe that non-central hot nodes are responsible for most of the epidemic spreading in opportunistic social networks (fig. 6(b) and (c)). removing them improves the immunization efficiency by 25%-150% on all traces. in contrast, removing central lot nodes shows a more limited improvement. this phenomenon experimentally validates that ''the more central, the better'' does not hold for the implementation of a control strategy. we next explore the reason behind this phenomenon. section 6.2 indicates that the spreading speed of an epidemic heavily depends on the temporal communities.
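the hot/lot and central/non-central split described above is a simple top-10% selection by an importance score; a minimal sketch (the helper name `split_top_fraction` is hypothetical):

```python
def split_top_fraction(importance, frac=0.10):
    """split node ids into the top `frac` by importance and the rest.
    used both for hot/lot (local importance within a community) and for
    central/non-central (global importance in the whole network)."""
    order = sorted(importance, key=importance.get, reverse=True)
    k = max(1, int(len(order) * frac))
    return set(order[:k]), set(order[k:])
```

applying it once with local scores and once with global scores yields the four node categories (hot/lot crossed with central/non-central) used in the removal experiment.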
as a result, even though hot nodes have on average lower global importance than the central ones, they do great damage to the community structure when removed, and thus help to suppress the epidemic. the goal of this section is to formally characterize the relationship between community cohesion and node importance. we generally use the community density to denote its cohesion. let d(c_i) represent the density of community i; we have the following theorem (see the appendix for the proof). we use this equation to evaluate the impact of removing nodes on community density, and plot the change of community density in fig. 7, where the y-axis denotes the ratio of the community density after some nodes are removed from a community to the initial density of that community. we find that community density depends heavily on node importance. however, there is a large difference between the impact of the local and the global importance of nodes. removing the nodes of a community in rank order, from the most locally important to the least, leads to a faster decline in community cohesion. in contrast, removing the central nodes first also shrinks the community, but at a slower speed. taken together, hot nodes appear to be crucial for implementing the immunization strategy, because of their large impact on the community structure when removed. in this paper, we improve our understanding of immunization strategies against epidemic spreading in opportunistic social networks. we observe that the temporal community structure helps to control the epidemic spreading. this phenomenon is encouraging, as it indicates that the outbreak of an epidemic could be delayed if we could further break the osn traces down into many small communities by removing some special nodes. motivated by this observation, we separate nodes into different behavioral classes from a community viewpoint. we show that hot nodes can remarkably suppress the epidemic spreading when removed.
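the density measure d(c_i) = (edges inside c_i)/(nodes in c_i) and the removal experiment of fig. 7 can be sketched as follows (the helper names are ours):

```python
def community_density(edges, nodes):
    # d(c_i) = number of edges inside c_i / number of nodes in c_i
    inside = [e for e in edges if e[0] in nodes and e[1] in nodes]
    return len(inside) / len(nodes)

def density_ratio_after_removal(edges, nodes, removed):
    # ratio plotted on the y-axis of fig. 7: density after removal
    # over the initial density of the community
    kept = nodes - removed
    return community_density(edges, kept) / community_density(edges, nodes)
```

removing a locally important (high-degree-within-community) node deletes many inside edges at once, which is why the ratio drops faster than when removing globally central but locally peripheral nodes.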
more interestingly, we find that non-central hot nodes are responsible for most of the epidemic spreading. these results reveal a counter-intuitive conclusion: ''the more central, the better'' does not hold for the implementation of a control strategy. for any t ∈ t_2, since t ≠ off_i (note that off_i ∈ t_1 and t_1 ∩ t_2 ≡ ∅), we have h(t) = 0. hence, w_uv(t) = h(t) + e^(−1) w_uv(t − 1). we first give the following lemma. proof. from the clustering process mentioned above, we know that the centroid m_i = (m_1i, m_2i, . . . , m_ni) can approximately represent the line formed by the nodes within the ith community (please refer to eq. (11), p. 7). on the other hand, the virtual centroid vector m̄_i should be close to the eigenvector x_i. this is mainly because m_i ≈ m̄_i = ( Σ_{u∈c_i} α_ui ) / n_i, as α_ui is the dominant part of α_u. hence, m̄_i lies on the line formed by the eigenvector x_i. we get the conclusion as different eigenvectors are linearly independent. we now prove eq. (9). let the variable e_u denote the event of measuring the global importance of node u (i.e., g_u) in the whole network. let the variable e_u^i denote the event of measuring node u's local importance in a community i. we have p(e_u) = p(measuring g_u in the whole network) = p(measuring g_u across all communities) = Σ_i p(e_u^i) (from lemma 2). that is, measuring node u's global importance is equal to first measuring its local importance in the different communities, and then putting all the components together. proof of theorem 4. consider the division of an unweighted graph u into k non-overlapping communities c_1, c_2, . . . , c_k. let v_i = (v_1i, v_2i, . . . , v_ni) be the index vector of community c_i, where v_ui is equal to 1 if node u belongs to c_i and 0 otherwise.
for community c_i, its density can be expressed as d(c_i) = (number of edges in c_i) / (number of nodes in c_i) = v_i^t

references
• dtn: an architectural retrospective
• opportunities in opportunistic computing
• pocket switched networks and human mobility in conference environments
• socially aware routing for publish-subscribe in delay-tolerant mobile ad hoc networks
• supporting cooperative caching in disruption tolerant networks
• data delivery properties of human contact networks
• impact of human mobility on opportunistic forwarding algorithms
• exploiting social interactions in mobile systems
• small-world behavior in time-varying graphs
• a reaction-diffusion model for epidemic routing in sparsely connected manets
• networks of strong ties
• bubble rap: social-based forwarding in delay-tolerant networks
• ring vaccination
• immunization of complex networks
• efficient local strategies for vaccination and network attack
• finding a better immunization strategy
• hub nodes inhibit the outbreak of epidemic under voluntary vaccination
• suppressing epidemics with a limited amount of immunization units
• centrality in social networks: conceptual clarification
• tearing down the internet
• overlapping communities in dynamic networks: their detection and mobile applications
• proceedings of the thirteenth acm international symposium on mobile ad hoc networking and computing
• finding community structure in networks using the eigenvectors of matrices
• impact of strangers on opportunistic routing performance
• social network analysis for routing in disconnected delay-tolerant manets
• 2010 proceedings ieee infocom
• multicasting in delay tolerant networks: a social network perspective
• finding rumor sources on random graphs
• on the levy-walk nature of human mobility
• efficient routing in intermittently connected mobile networks: the multiple-copy case
• performance modeling of epidemic routing
• slaw: self-similar least-action human walk
• time-aggregated graphs for modeling spatio-temporal networks
• maintaining time-decaying stream aggregates
• power law and exponential decay of intercontact times between mobile devices
• principal component analysis
• spectral graph theory
• community detection in graphs
• finding community structure in very large networks
• uncovering the overlapping community structure of complex networks in nature and society
• k-means clustering via principal component analysis
• inferring social ties across heterogeneous networks
• matrix perturbation theory

we acknowledge the support of the national natural science foundation of china under grant nos. u1404602, u1304607, and the science and technology foundation of henan educational committee under grant nos. 14a520031, 14a520068. we also wish to thank the crawdad archive project for making the dtn traces available to the research community.

proof of theorem 1. let us split the interval [0, t] into two disjoint parts t_1 and t_2, where

key: cord-196353-p05a8zjy
authors: backhausz, ágnes; bognár, edit
title: virus spread and voter model on random graphs with multiple type nodes
date: 2020-02-17
journal: nan
doi: nan
sha:
doc_id: 196353
cord_uid: p05a8zjy

when modelling epidemics or the spread of information on online social networks, it is crucial to include not just the density of the connections through which infections can be transmitted, but also the variability of susceptibility. different people have different chances of being infected by a disease (due to age or general health conditions), or, in the case of opinions, some are easier to convince than others, or stronger at sharing their opinions. the goal of this work is to examine the effect of multiple types of nodes on various random graphs such as erdős-rényi random graphs, preferential attachment random graphs and geometric random graphs. we used two models for the dynamics: a seir model with vaccination, and a version of the voter model for exchanging opinions.
in the first case, among others, various vaccination strategies are compared to each other, while in the second case we studied several initial configurations to find the key positions where the most effective nodes should be placed to disseminate opinions. freedom in choosing the position of these vertices. one of our main interests is epidemic spread. the accurate modelling, regulation or prevention of a possible epidemic is still a difficult problem of the 21st century. (as of the time of writing, a novel strain of coronavirus has spread to at least 16 other countries from china, although authorities have been taking serious actions to prevent a worldwide outbreak.) as for mathematical modelling, there are several approaches to model these processes, for example, using differential equations, the theory of random graphs or other probabilistic tools [10, 12, 15]. as it is widely studied, the structure of the underlying graph can have an important impact on the course of the epidemic. in particular, structural properties such as the degree distribution and clustering are essential to understand the dynamics and to find the optimal vaccination strategies [7, 11]. from the point of view of random graphs, in the case of preferential attachment graphs, it is also known that the initial set of infected vertices can have a huge impact on the outcome of the process [3]: a small proportion of infected vertices is enough for a large outbreak if the positions are chosen appropriately. on the other hand, varying susceptibility of the vertices also has an impact, for example on the minimal proportion of vaccinated people needed to prevent the outbreak [6, 4]. in the current work, by computer simulations, we study various cases when these effects are combined in a seir model with vaccination: we have a multitype random graph, and the vaccination strategies may depend on the structure of the graph and the types of the vertices as well.
the other family of models which we studied is a variant of the voter model. the voter model is also a common model of interacting particle systems and population dynamics; see e.g. the book of liggett [16]. this model is related to epidemics as well: durrett and neuhauser [9] applied the voter model to study virus spread. the two processes can be connected by the following idea: we can see virus spread as a special case of the voter model with two different opinions (healthy and infected), but only one of the opinions (infected) can be transmitted, while any individual with the infected opinion switches to the healthy opinion after a period of time. also, the virus can spread only through direct contacts of individuals (edges of the graph), while in the voter model it is possible for the particles to influence one another without being neighbours in the graph. similarly to the case of epidemics, the structure of the underlying graph has an important impact on the dynamics of the process [2, 8]. here we study a version of this model with various underlying random graphs and multiple types of nodes. we examined the virus spread with vaccination and the voter model on random graphs of different structures, where in some cases the nodes of the graph, corresponding to the individuals of the network, are divided into groups representing significantly distinct properties for the process. we studied the possible differences between the processes on different graphs, regarding the nature and magnitude of the distinct results, and tried both to find the reasons for them and to understand how the structure of an underlying network can affect the outcomes. the outline of the paper is as follows. in the second section we give a description of the virus spread in continuous time, and the discretized model. parameters are chosen such that they match the real-world data from [14].
we compare the outcomes on different random graphs with the numerical solutions of the differential equations originating from the continuous-time counterpart of the process. we also study different possible choices of the reproduction number r_0, corresponding to the seriousness of the disease. we examine different vaccination strategies (beginning at the start of the disease or a few days before), and a model with weights on the edges is also mentioned. in the third section we study the discretized voter model on erdős-rényi and barabási-albert graphs, firstly without, then with multiple types of nodes. later we run the process on random graphs with a geometric structure on the plane. the dynamics of virus spread can be described by differential equations, therefore they are usually studied from this approach. however, differential equations use only transmission rates calculated from the number of contacts in the underlying network, while the structure of the whole graph and other properties are not taken into account. motivated by the paper "modelling the strategies for age specific vaccination scheduling during influenza pandemic outbreaks" of diána h. knipl and gergely röst [14], we modelled the process on random graphs of different kinds. in this section we use the same notions and settings for most of the parameters. ideas for the vaccination strategies are also derived from there. we examined a model in which individuals experience an incubation period, delaying the process. the dynamics are also affected by a vaccination campaign started at the outbreak of the virus, or a vaccination campaign starting a few days before the outbreak. in the classical seir model, each individual is in exactly one of the following compartments during the virus spread: • susceptible: individuals are healthy, but can be infected. • exposed: individuals are infected but not yet infectious. • infectious: individuals are infected and infectious.
• recovered: individuals are not infectious anymore, and are immune (cannot be infected again). individuals can move through the compartments only in the order defined above (it is not possible to skip one in the sequence). the rate at which individuals leave the compartments is described by probabilities (transmission rates) and the parameters of the model (incubation rate, recovery rate). individuals in r are immune, so r is a terminal point. seir with vaccination: we combine the model with a vaccination campaign. the campaign lasts for 90 days, and we vaccinate individuals according to some strategy (described later) so that at the end of the campaign 60% of the population is vaccinated (if it is possible). we vaccinate individuals only in s, but the vaccination ensures immunity only with probability q, and only after 14 days. we vaccinate individuals at most once, irrespective of the success of the vaccination. however, vaccinated individuals can be infected within the first 14 days; in this case, nothing differs from the process without vaccination. to describe the underlying network, we use real-life data. we distinguish individuals according to their age. in particular, we consider 5 age groups, since they have different social contact profiles. to describe the social relationships of the different age groups, we used the contact matrix c obtained in [1], where the elements c_{i,j} represent the average number of contacts an individual in age group i has with individuals in age group j. in the sequel, the number of individuals in a given group is denoted by the label of the group according to figure 1. the model is specified by the following family of parameters. • r_0 = 1.4: basic reproduction number. it characterizes the intensity of the epidemic. its value is the average number of infections an infectious individual causes during its infectious period in a population of only susceptible individuals (without vaccination).
later we also study less severe cases with r_0 = 1.0 − 1.4.
• β_{i,j}: transmission rates. they control the rate of infection between a susceptible individual in age group i and an infectious individual in age group j. they can be derived from r_0 and the contact matrix c. according to [1] we used β_{i,j} = β · c_{i,j} / n_j, where β = 0.0334 for r_0 = 1.4.
• 1/ν_e = 1.25: latent period. ν_e is the rate at which exposed individuals become infectious.
• 1/ν_i = 3: infectious period. each individual spends an average of 1/ν_i days in i.
• 1/ν_w = 14: time to develop antibodies after vaccination.
• q_i = 0.8 for i = 1, . . . , 4 and q_5 = 0.6: vaccine efficacy. the probability that a vaccinated individual develops antibodies and becomes immune.
• δ = 0.75: reduction in infectiousness. the rate by which the infectiousness of unsuccessfully vaccinated individuals is reduced.
• λ_i = Σ_{j=1}^{5} β_{j,i} · (i_j + δ · i_j^v): the total rate at which individuals of group i get infected and become exposed.
• v_i: vaccination rate functions determined by a strategy. this describes the rate of vaccination in group i.
the dynamics of the virus spread and the vaccination campaign can be described by 50 differential equations (10 for each age group), according to [14]. we would like to create an underlying network and examine the outcome of the virus spread on this given graph. we generated random graphs of different structures with n = 10000 nodes, such that each node has a type corresponding to the age of the individual. the age distribution and the numbers of contacts in the graph between age groups comply with the statistical properties detailed above. since the contact matrix c describes only the average numbers of contacts, the variances can differ. • erdős-rényi graphs: we create 10000 nodes and their types are defined immediately, such that the numbers of each type comply exactly with the age distribution.
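the rate λ_i above can be computed directly from the compartment counts; a minimal sketch, assuming β is given as a matrix indexed as beta[j][i] and the infectious counts i_j, i_j^v are given as lists:

```python
def total_infection_rate(beta, I, I_v, delta=0.75):
    """lambda_i = sum over j of beta[j][i] * (I[j] + delta * I_v[j]),
    where I[j] counts infectious and I_v[j] unsuccessfully vaccinated
    infectious individuals in age group j."""
    n = len(I)
    return [sum(beta[j][i] * (I[j] + delta * I_v[j]) for j in range(n))
            for i in range(n)]
```

note the transposed indexing β_{j,i}: the force of infection on group i sums over the infectious groups j.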
the relationships within each age group and the connections between different age groups are both modelled with erdős-rényi graphs in the following sense: we create an edge between every node in age group i and every node in age group j independently with probability p_{i,j}, derived from the contact matrix c.
• preferential attachment graphs: initially we start from an erdős-rényi graph of size 100, then we keep adding nodes to the graph sequentially. every new node chooses its type randomly, with probabilities given by the observed age distribution. after that we create edges between the new node and the old ones with preferential attachment: if the new node is of type i, then we connect it by an edge independently to an old node v of type j with probability proportional to d(v)/d, where d(v) denotes the current degree of v, and d is the sum of the degrees of the nodes of type j. thus the new node is more likely to attach to nodes with a high degree, resulting in a few enormous degrees in each age group. on the other hand, the connection matrix c is used to ensure that the density of edges between different age groups is different.
• preferential attachment mixed with erdős-rényi graphs: we create the 10000 nodes again with their types exactly according to the age distribution. first we create five preferential attachment graphs, the ith of size n_i, so that every node has an average of c_{i,i} neighbours. in particular, the endpoints of the new edges are chosen independently, and the attachment probabilities are proportional to the degrees of the old vertices. then we attach nodes in different age groups independently with the corresponding p_{i,j} probabilities defined above.
• random graphs of minimal degree variance with the configuration model: we prescribe not only the degree sequence of the nodes, but the degree of each node broken down into 5 parts according to the age groups, in such a way that the expectations comply with the contact matrix c while the degrees have a small variance.
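the first construction (the multitype erdős-rényi graph) can be sketched as follows; representing the graph as an edge list and the function name are our choices:

```python
import random

def multitype_er(group_of, p):
    """group_of: list mapping node id -> age group index;
    p[i][j]: probability of an edge between a node of group i and a
    node of group j (assumed symmetric). returns the edge list."""
    n = len(group_of)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if random.random() < p[group_of[u]][group_of[v]]:
                edges.append((u, v))
    return edges
```

every pair of nodes is considered exactly once, so the densities within and between groups are controlled independently by the matrix p.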
the distribution is chosen such that the variance is minimal among distributions supported on the integers, given the expectation. for example, in the case of c_{1,4} = 2.4847, every node in age group 1 has exactly 2 or 3 neighbours in age group 4, and the average number is 2.4847. our configuration model creates a random graph with the given degree sequence. according to [5], the expected value of the number of loops and multiple edges divided by the number of nodes tends to zero, thus for n = 10000 it is acceptable to neglect them and to represent a network of social contacts with this model. in this section we detail how we implemented the discretization of the process on the generated random graphs. most of the parameters remained the same as in the differential equations; however, to add more realism, we endowed them with a little variance. as for the transmission rates, we needed different numbers to describe the probability of infection, since β was derived from the contact matrix c and the basic reproduction number r_0. since c is built into the structure of our graphs, using the different parameters β_{i,j} would add the same effect of the contacts twice to the process. therefore, instead of β_{i,j}, we determined a universal β̃ according to the definition of r_0. we set the disease transmission probability to β̃ = r_0 / (3 · 12.8113), under the assumption that the contact profile of the age groups is totally implemented in the graph structure: only the average density of the graph (without age groups), the severity of the disease and the average time spent in the infectious period affect this parameter. the parameters ν_w, δ, q_i remained exactly the same, while 1/ν_e = 1.25 and 1/ν_i = 3 hold only in expectation. the exact distributions are given as follows: we built the reduction in infectiousness δ into the process in such a way that an unsuccessfully vaccinated individual spends an average of 3 · 0.75 = 2.25 days in i, instead of modifying β̃.
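sampling a degree with minimal variance given a non-integer mean, as described above, amounts to mixing only the two neighbouring integers; a small sketch:

```python
import math
import random

def min_variance_integer(mu):
    """sample from the integer-supported distribution with mean mu and
    minimal variance: only floor(mu) and floor(mu)+1 can occur, with
    P(floor(mu)+1) = mu - floor(mu)."""
    lo = math.floor(mu)
    return lo + (1 if random.random() < mu - lo else 0)
```

for mu = 2.4847 this yields 2 or 3 with the right mixture, matching the example with c_{1,4} in the text.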
in the discretized process, we start with 10 infectious nodes chosen randomly and independently from the age groups. we observe a 90-day period with vaccination plus 10 days without it. (in the basic scenarios we start vaccination at day 1; however, we later examine the process with vaccination starting a few days before the outbreak.) at each time step, firstly the infectious nodes can transmit the disease to their neighbours. only nodes in s can be infected, and they cannot be infected ever again. when a node becomes infected, its position is set immediately to e, and the number of days it spends in e is also generated. secondly, we check whether a node has reached the end of its latent/infectious period, and set its position to i or r. (as soon as a node becomes infectious, the number of days it spends in i is also calculated.) then, at the end of each iteration, we vaccinate 0.67% of the whole population according to some strategy (if it is possible). only nodes in s get vaccinated (at most once); it is generated immediately whether the vaccination is successful (with probability q_i, according to the node's type). in case of success, the day it could become immune without any infection is also noted; if that 14th day is reached and the node is still in s, its position is set to immune. the first question is whether the structure of the underlying graph can affect the process in the case when the edge densities are described by the same contact matrix c. we can ask how it can affect the overall outcome and other properties, and how we can explain and interpret these differences with regard to the structure of the graph. we compare the results on different graphs with each other, and also with the numerical solution of the differential equations describing the process. in this section we study the basic scenario: our vaccination starts at day 1, and we vaccinate by the uniform strategy. this strategy does not distinguish age groups; every day we vaccinate 0.67% of each age group randomly (as long as it is possible).
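one day of the discretized process (infection phase, then compartment updates) might look as follows; the exponential holding times are an assumption, since the exact distributions of the latent and infectious periods are not reproduced in the text, and the vaccination bookkeeping is omitted for brevity:

```python
import random

def seir_step(adj, state, days_left, beta, mean_e=1.25, mean_i=3.0):
    """adj: node -> list of neighbours; state[u] in {'S','E','I','R'};
    days_left[u]: remaining days in E or I. mutates state/days_left."""
    # phase 1: infectious nodes try to infect susceptible neighbours
    newly = set()
    for u in [v for v in adj if state[v] == 'I']:
        for w in adj[u]:
            if state[w] == 'S' and random.random() < beta:
                state[w] = 'E'
                days_left[w] = max(1, round(random.expovariate(1 / mean_e)))
                newly.add(w)
    # phase 2: nodes reaching the end of their latent/infectious period move on
    for u in adj:
        if u in newly or state[u] not in ('E', 'I'):
            continue
        days_left[u] -= 1
        if days_left[u] == 0:
            if state[u] == 'E':
                state[u] = 'I'
                days_left[u] = max(1, round(random.expovariate(1 / mean_i)))
            else:
                state[u] = 'R'
```

nodes infected in phase 1 are skipped in phase 2 of the same day, mirroring the order "transmit first, then check period ends" described above.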
we set r_0 = 1.4. compared to the numerical solution of the differential equation system from [14], giving structure to the underlying social network boosted these numbers in every case; however, the differences between random graphs of different properties are still significant. to study the result in the discretized model, we generated 5 random graphs with n = 10000 nodes for each graph structure, and ran the process 20 times on each random graph with independent initial values. (in the case of most of the structures we can get rather different outcomes on the same graph with different initial values concerning the peak of the virus; therefore using the same graphs several times is acceptable.) as we can see in figure 3 (compared to figure 2), random graphs from the configuration model were the closest to the numerical solution of the differential equations. however, the difference in outcomes can be clearly seen from every perspective: almost 20% of the population (5.7% more) was infected by the virus by the end of the time period, the infection peaked almost 10 days sooner (at day 41), and the number of infectious cases at the peak is almost twice as large. we got similar, but more severe results on erdős-rényi graphs; however, still at most 0.021% of the population was infected at the same time. the outcome in the case of graphs with a (partial) preferential attachment structure shows that the degree distribution does matter in this process. (this observation initially gave us the idea to model a graph with minimal degree deviation with the help of the configuration model; we were curious whether we could get results closer to the differential equations on such a graph.) on preferential attachment graphs 47.66% of the individuals went through the disease. what is more, 1% of the population was infected at the same time at the peak of the virus, already at day 21. however, after day 40 the infection was substantially over.
with a preferential attachment structure it is very likely that a node with a huge degree gets infected in the early days of the process, irrespective of the choice of the initially infectious individuals, resulting in an epidemic really fast. however, after the dense part of the graph has passed through the virus around day 40, even though 40% of the population is still in s, the number of infectious cases is really low. the process on preferential attachment mixed with erdős-rényi graphs reflects something in between, yet the preferential properties dominate. it was possible to reach the 60% vaccination rate during the process, except in the case of preferential attachment graphs. by the end of the 100th day, a proportion of 0.4 − 0.45 of the individuals acquired immunity after vaccination. the basic reproduction number is a representative measure of the seriousness of a disease. generally, diseases with a reproduction number greater than 1 should be taken seriously; however, the number is a measure of potential transmissibility, and does not actually tell how fast a disease will spread. seasonal flu has an r_0 of about 1.3, while hiv and sars are around 2 − 5, according to [18]. in this section we investigate how different strategies of vaccination can affect the attack rates. we study three very different strategies based on age groups or other properties of the graph. in each strategy 0.67% of the population is vaccinated at each time step (sometimes exactly, sometimes only in expected value). after a 90-day vaccination campaign 60% of the population should be vaccinated in each age group (if it is possible). we still start our vaccination campaign at day 1, and we vaccinate individuals at most once, irrespective of the success of the vaccination. • uniform strategy: this strategy does not distinguish age groups; every day we vaccinate randomly 0.67% of the individuals of each age group.
• contacts strategy: we prioritize age groups with larger contact numbers, corresponding to the denser parts of the graph (concerning the 5 groups). we vaccinate the second age group for 11 days, then the third age group for 26 days, the first age group for 10 days, the fourth group for 29 days, and finally age group 5, with the smallest number of contacts, for 15 days. this strategy turned out to be the best in the case without any graph structure [14]. however, in conventional vaccination strategies, health care personnel, among others, are vaccinated in the first days of the campaign, which certainly makes sense, and can also be interpreted as vaccinating nodes of the graph that not only have a high degree, but also a high probability of getting infected. the effect of vaccination by degrees can also be noticed in the shape of the curves of infected individuals in the age groups over time (see figure 6): not only did the magnitude decrease, but the vaccination also increased the skewness, especially for age group 2. vaccination by contacts totally distorted the curve of age group 2, while the others did not change much. we examine whether vaccination before the outbreak of a virus (only a few, 5-10 days before) could influence the epidemic spread significantly. the delay in the development of immunity after vaccination is one of the key factors of the model, thus pre-vaccination could counterbalance this effect. the edges of the graph have so far represented only the existence of a social contact; however, relationships between individuals can be of different quality. it is also a natural idea to make a connection between the types of the nodes (age groups of individuals) and the features of the edge between them. for example, we can generally assume that children of age 0-9 (age group one) are more likely to catch or transmit a disease to any other individual regardless of age, since the nature of contacts with children is usually more intimate.
so, on the one hand, creating weights on the edges of the graph can be strongly connected to the types of the given nodes. on the other hand, regardless of age groups, individuals tend to have a few relationships that are more significant from the aspect of a virus spread (individuals sharing a household), while many social contacts are less relevant. for the reasons above, we upgrade our random graphs with a weighting on the edges, taking into account the age groups of the individuals. regardless of age, relationships are divided into two types: close and distant. only 20% of the contacts of an individual can be close; transmission rates on these edges are much higher, while on distant edges they are reduced. we examine a model in which age groups do not affect the weights of the edges: we double the probabilities of transmitting the disease on edges representing close contacts, and decrease the probabilities on the other edges by a factor of 0.75. in expectation, the total r_0 of the disease has not changed; however, results on the graphs can be different from the unweighted cases. with the basic scenario, we experience the biggest difference on erdős-rényi graphs; however, the models with edge weights give attack rates bigger by only 0.01. only on the configuration model do we get a less severe virus spread with weighted edges. in this section we study the discretized voter model, in which particles exchange opinions from time to time, depending on the relationships between them. we create a simplified process to be able to examine the outcome on larger graphs. firstly, we examine this simplified process on erdős-rényi and barabási-albert graphs, then multiple types of nodes are introduced. with a possible interpretation of the different types of nodes in the graphs, we generalize the voter model. later we examine the "influencer" model, in which our aim is, in contrast to the seir model, to spread one of the opinions.
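the close/distant weighting can be sketched as follows; marking 20% of each node's contacts as close, with a close label from either endpoint upgrading the edge, is our interpretation of the text:

```python
import random

def weight_edges(adj, close_frac=0.20, up=2.0, down=0.75):
    """adj: node -> list of neighbours. returns a dict mapping each
    undirected edge (u, v), u < v, to a transmission multiplier:
    `up` for close contacts, `down` for distant ones."""
    weight = {}
    for u, nbrs in adj.items():
        nbrs = sorted(nbrs)
        # each node marks ~20% (at least one) of its contacts as close
        k = max(1, int(len(nbrs) * close_frac))
        close = set(random.sample(nbrs, k))
        for v in nbrs:
            e = (min(u, v), max(u, v))
            if v in close:
                weight[e] = up          # close label wins
            else:
                weight.setdefault(e, down)
    return weight
```

the per-edge multiplier then scales β̃ in the transmission step, so that the expected overall transmissibility stays comparable to the unweighted case.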
the process in continuous time can be modelled with a family of independent poisson processes. for each pair of vertices (x, y) we have a poisson process of rate q(x, y), which describes the moments of x convincing y. the rate q(x, y) increases as the distance d(x, y) decreases. in this case, every time a vertex is influenced by another one, it changes its opinion immediately. in our discretized voter process, there are two phases at each time step. first, nodes try to share their opinions and influence each other, which is successful with probabilities depending on the distance of the two vertices. more precisely, vertices that are closer to each other have a higher chance that their opinion "reaches" the other one. still, every vertex can "hear" different opinions from many other vertices. in the second phase, if a node v receives the message of m 0 nodes with opinion 0, and m 1 nodes with opinion 1, then v will represent opinion 0 with probability m 0 /(m 0 + m 1 ) during the next step, and opinion 1 otherwise. if a node v does not receive any opinions from others at a time step, then its opinion remains the same. this way, the order of the influencing messages in the first phase can be arbitrary, and it is also possible that two nodes exchange opinions. now we specify the probability that a vertex x manages to share its opinion with vertex y in the first phase. we transform graph distances d(x, y) into a matrix of transmission probabilities with the choice q(x, y) = e^(−c·d(x,y)), where c is a constant. this is not a direct analogue of the continuous case, but it is still a natural choice of a decreasing function of d. (usually we use c = 2, however later we also investigate the cases c ∈ {0.5, 1, 2, 3}. decreasing c escalates the process.)
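the two-phase update described above can be sketched in a few lines of python (a minimal sketch; the function name, the dense distance matrix and the rng handling are our own illustrative choices, not part of the original description):

```python
import math
import random

def voter_step(dist, opinions, c=2.0, rng=None):
    """One step of the discretized voter model.
    dist[x][y] is the graph distance between x and y; opinions[x] in {0, 1}.
    Phase 1: x's opinion reaches y with probability q(x, y) = exp(-c * d(x, y)).
    Phase 2: a node that heard m0 zeros and m1 ones adopts opinion 0 with
    probability m0 / (m0 + m1), and opinion 1 otherwise; a node that heard
    nothing keeps its opinion."""
    rng = rng or random.Random()
    n = len(opinions)
    heard = [[0, 0] for _ in range(n)]        # heard[y] = [m0, m1]
    for x in range(n):
        for y in range(n):
            if x != y and rng.random() < math.exp(-c * dist[x][y]):
                heard[y][opinions[x]] += 1
    new = list(opinions)
    for y in range(n):
        m0, m1 = heard[y]
        if m0 + m1 > 0:
            new[y] = 0 if rng.random() < m0 / (m0 + m1) else 1
    return new
```

with distance 0 everywhere every message arrives, so a unanimous population stays unanimous; with very large distances no message arrives and opinions are unchanged.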
in the model above, on a graph on n nodes, at every time step our algorithm consists of o(n^2) steps, which can be problematic for bigger graphs if our aim is to take samples with viter = 100 or 200 iterations of the voter model (in the sequel, viter denotes the number of steps of the voter model). however, with c = 2 a node x convinces a vertex y with d(x, y) = 3 only with a probability of e^(−6) ≈ 0.0025. thus we used the following simplified model: when we created a graph, we stored the list of edges and also calculated for each node the neighbours of distance 2. the simplified voter model spreads opinions only on this reduced number of edges with the proper probabilities. we were able to run the original discretized model only on graphs with n = 100 nodes, while the simplified version can deal with n = 1000 nodes. we made the assumption that neglecting those tiny probabilities cannot significantly change the outcome of the process. from now on we only model the simplified version of the process. firstly we study the voter model on erdős-rényi(n, p) and barabási-albert(n, m) random graphs. • er(n, p): we create n nodes, and connect every possible pair x, y ∈ v independently with probability p. • ba(n, m): initially we start with a graph g 0 . at every time step we add a new node v to the graph and attach it with exactly m edges to the old nodes with preferential attachment probabilities. let d denote the sum of degrees in the graph before adding the new node; then we attach each edge independently to node u with probability d(u)/d. we generated graphs starting from a g 0 = er(50, m/(50−1)) graph of complying density. multiple edges can be created by the algorithm, however loops cannot occur. attachment probabilities are not updated during a time step. multiple edges do matter in the voter model, since they somehow represent a stronger relationship between individuals: an opinion on a k-multiple edge transmits with a k-times bigger probability.
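the growth step of this ba(n, m) variant (each of the m endpoints chosen independently with probability d(u)/d, probabilities not updated within the step, so multiple edges may occur but loops cannot) can be sketched as follows; the function name and the degree-list representation are illustrative assumptions:

```python
import random

def ba_step(degrees, m, rng):
    """One growth step of the ba(n, m) model with multiple edges.
    degrees[u] is the current degree of old node u. Each of the m endpoints
    is chosen independently with probability degrees[u] / d, where d is the
    degree sum BEFORE the new node is added; the new node itself is not a
    candidate, so loops cannot occur, but repeated targets (multi-edges) can."""
    targets = rng.choices(range(len(degrees)), weights=degrees, k=m)
    degrees.append(m)          # the new node arrives with degree m
    for u in targets:
        degrees[u] += 1        # each old endpoint gains one degree per edge
    return targets
```

iterating this step from a small starting graph g 0 grows the whole graph; `random.choices` with `weights=` implements exactly the "probabilities fixed within a step" behaviour, since all m draws use the same degree vector.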
firstly, we examine the voter model on graphs without any nodes of multiple types to understand the pure differences of the process resulting from the structure. we compare graphs with the same density, ba(1000, m) graphs with m ∈ {4, 5, . . . , 10} and er(1000, p), where p ∈ [0.004, 0.01]. the initial probability of opinion 1 is set to 0.05 in both graphs. we compare the probability of opinion 1 disappearing within viter = 50 iterations of the voter model. we generated 10 different graphs from each structure and ran the voter model on each 20 times with independent initial opinions; altogether the results of 200 trials were averaged. figure 8 shows the results. before the phase transition of erdős-rényi graphs, that is, with p < ln n/n ≈ 0.007 for n = 1000 nodes (ba graphs of the same density belong to m ≤ 7), the graph consists of several components. as mentioned before, in this sequel we investigate extreme outcomes of the process caused by one of the most important properties of barabási-albert graphs. since nodes do not play a symmetrical role in barabási-albert graphs, fixing the proportion of nodes representing opinion 1 (we usually use v = 0.05, so 50 nodes represent opinion 1 in expected value), but changing the position of these nodes in the graph can lead to different results. we examined the following three ways of setting the initial opinions: • randomly: each individual chooses opinion 1 with probability v. • "oldest nodes": we deterministically set the first 50 nodes of the graph to represent opinion 1. these nodes usually have the largest degrees, thus they play a crucial part in the process. not only do they have large degrees, but they are also very likely to be connected to each other (this is the densest part of the graph). • "newest nodes": we deterministically set the last 50 nodes of the graph to represent opinion 1. these nodes usually have only m edges, and with a high probability they are not connected to each other.
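the three initial opinion settings above can be sketched as follows (assuming, as in the text, that nodes of a ba graph are indexed in order of creation; the function name and defaults are illustrative):

```python
import random

def init_opinions(n, mode, k=50, v=0.05, rng=None):
    """Build the initial opinion vector l0 on a graph of n nodes.
    'random': each node gets opinion 1 independently with probability v;
    'oldest': the first k nodes (the hubs of a ba graph) get opinion 1;
    'newest': the last k nodes (late arrivals with only m edges) get it."""
    rng = rng or random.Random()
    if mode == "random":
        return [1 if rng.random() < v else 0 for _ in range(n)]
    ops = [0] * n
    idx = range(k) if mode == "oldest" else range(n - k, n)
    for i in idx:
        ops[i] = 1
    return ops
```

with n = 1000 and v = 0.05 all three modes place 50 nodes of opinion 1 in expected value; only their position in the graph differs.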
the histogram in figure 9 shows the distribution of nodes with opinion 1 for the three different choices of l 0 vectors after viter = 50 iterations of the voter model on ba(1000, 5) graphs. we experience differences in terms of the probability of opinion 1 disappearing: with the random opinion distribution it was 11%, with l new almost one third of the cases resulted in the extinction of opinion 1, while for l old this probability was negligible (0.005%). actually, for l old after only one iteration of the voter model it is impossible to see any structure in the distribution of individuals with opinion 1: the vector of opinions became totally random, but with a proportion of 0.12. indeed, in only one step of the voter model the individuals with opinion 1 could double in number, however opinion 1 cannot take advantage of any special positions in the graph anymore. all in all, giving a certain opinion to individuals who are more likely to be connected in the graph reduces the probability of disappearing, since they can keep their opinion with a high probability, while with opinion 1 scattered across the graph (in the case of l new as well as l rand ) with a dynamic parameter setting of c the number of individuals with opinion 1 can reduce drastically even in a few time steps. it is a natural idea to divide the nodes of a network into separate groups according to some aspect, where the properties of the different groups can affect processes on the graph. there are various ways to classify nodes into different types. we examined a simple and another widely used method. in the following section we only have nodes with two types, however the definitions still hold for multiple-type cases. from now on, for purposes of discussion we only refer to the types as red and blue. we consider two different ways to assign types to the nodes: • each node, independently of the others, chooses to be red with probability p r , and blue with probability 1 − p r . (here the index r corresponds to random.)
• since preferential attachment graphs are dynamic models, this enables another very natural and logical way of choosing types: after a new node has connected to the graph with some edges, informally the node chooses its type with probabilities corresponding to the proportions among its neighbours' types (see also [1, 13, 17] ). this way nodes with the same type tend to connect to each other with a higher probability, forming a "cluster" in the graph. we only examined linear models. according to [2] , a few properties of the initial graph g 0 and the initial types of nodes can determine the asymptotic behaviour of the proportion of types. let g n denote the graph when n nodes have been added to the initial graph g 0 . let a n and b n denote the number of red and blue nodes in g n , and consider the proportions of red and blue nodes, a n /(a n + b n ) and b n /(a n + b n ). let x n and y n respectively denote the sum of the degrees of red and blue nodes in g n , and assume that x 0 , y 0 ≥ 0. then the proportion of red nodes a n /(a n + b n ) converges almost surely as n → ∞. furthermore, the limiting distribution of a := lim n→∞ a n /(a n + b n ) has full support on the interval [0, 1], has no atoms, and depends only on x 0 , y 0 , and m. this property has great significance, since we would like to compare graphs with the same proportion of red and blue nodes, and the theorem ensures the existence of such a limiting proportion. what is more, with the generation of barabási-albert graphs with multiple edges we can examine the speed of convergence. we set the types of nodes in the initial graph g 0 in such a way that not necessarily half of the nodes will be blue, but approximately the sum of degrees of the nodes with type blue will be half of the whole sum of degrees. (of course, in the case of an initial erdős-rényi graph these will be the same in expected value. however, we can get a more stable proportion of types with the second method.
in this case by stable we mean that the proportions can be closer to 1/2.) in the voter model we can see nodes with multiple types, defined in the last section, with the following interpretation. each node (individual) has two types according to two different aspects: its ability to convince (good/bad reasoner) and the stability of its opinion (stable/unstable). so each node chooses a type from both of the aspects, and the choices of types according to the different aspects are independent. (since four combinations of these are possible, we could say that each node chooses one type from the 4 possible pairs.) during the voter model, the interaction of nodes with different types influences the process in the following way: complying with the names of the types, we expect that good reasoner nodes can convince any node with a higher probability than bad reasoner nodes. also, any node should convince a node of unstable type with a higher probability than a node of stable type. in a step of the voter model, when node x influences a node y, the probability of success should only depend on node x's ability to convince (good/bad reasoner type) and node y's stability (stable/unstable type). we investigated the model with a symmetric parameter set: the probability of a good reasoner node convincing a stable one is equal to the probability of a bad reasoner node convincing an unstable one. we also made the assumption that a bad reasoner node can convince a stable node with probability 0. the voter model was examined with different sets of parameters c(1) ≥ c(2), and different possible choices of types in the graph. in this sequel we examine a special case of the voter model with multiple type nodes, in which the aim is to spread an initially underrepresented opinion. this problem might be related to finding good marketing strategies on online social networks, when the "opinion" might be about a commercial product or a certain political conviction. we investigate the following "influencer" model: the types of a node according to the different aspects are not independent, nor is the l 0 vector of initial opinions.
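the symmetric parameter set above can be sketched as a small success-factor table; the exact mapping of c(1) and c(2) to the four type pairs below is our own assumption, chosen only to satisfy the stated constraints (c(1) ≥ c(2), symmetry between "good vs stable" and "bad vs unstable", and probability 0 for a bad reasoner convincing a stable node):

```python
def convince_factor(x_is_good, y_is_stable, c1, c2):
    """Assumed success factor for node x convincing node y:
    good reasoner -> unstable node: c1 (the easiest case),
    good reasoner -> stable node:   c2 (symmetric pair, part 1),
    bad reasoner  -> unstable node: c2 (symmetric pair, part 2),
    bad reasoner  -> stable node:   0  (impossible by assumption)."""
    if x_is_good:
        return c1 if not y_is_stable else c2
    return 0.0 if y_is_stable else c2
```

in a run of the voter model this factor would modulate the per-edge transmission probability; the symmetry shows up as `convince_factor(True, True, c1, c2) == convince_factor(False, False, c1, c2)`.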
the nodes of the graph are divided into two groups, influencers and non-influencers. influencers usually form a smaller population; they represent opinion 1, which we want to spread across the graph. they are good reasoners and also stable, while non-influencers have the bad reasoner type according to the ability to convince, and they can be stable as well as unstable. according to the definitions of the c values, it is impossible for a bad reasoner node to convince a stable one, resulting in the influencers representing opinion 1 for the whole process. firstly, we study a case in which the nodes of a ba graph get a type randomly or deterministically, not according to preferential attachment. we study the equivalent of the case in subsection 3.2.1 with multiple type nodes. in each graph 10% of the individuals (100 nodes) are influencers. in ba graphs the influencers are situated randomly, on the "oldest nodes" or on the "newest nodes" of the graph. in er graphs the influencers are situated randomly (however, since the role of nodes is symmetric, they can be situated anywhere, with no difference in the outcome). we would like to examine the differences in opinion spread. we are also interested in whether it is possible to convince all the nodes of the graph to opinion 1, and in case it is, we calculate the average time needed to do so. we observed differences in the outcome for 100 runs (on 5 different random graphs), with the c = [2, 1] and m = 8 parameter set (see figure 10 ). we wanted to exclude cases in which the proportion of one of the types is negligibly small. for these reasons, we created er and ba graphs with multiple node types, where the proportion of good reasoners is 1/2, and, according to preferential attachment (in ba graphs), we set these nodes to be the influencers (they are stable, while non-influencer individuals can be stable or unstable with probability 1/2).
so in expected value half of the nodes are influencers, but in the case of ba graphs we can experience greater deviance (in expected value half of the nodes have the good reasoner type according to the ability to convince, and 3/4 of the nodes have the stable type). for er graphs the only meaningful possibility to create types is the random choice, but the same proportions also hold (the proportions in ba graphs are still a bit greater). however, in terms of disappearing opinions the results are rather different. on er graphs in none of the cases could opinion 0 disappear, while on ba graphs it strongly depended on the exact initial proportion of influencers in the graph: on the same graph (and hence with the same proportion of influencers) opinion 0 either disappears within the first 50 iterations of the voter model, or opinion 1 holds a high proportion yet will never be able to reach all nodes. this main difference resulted from the fact that we cannot exactly set the proportion of types in ba graphs, thus the co-existence of opinions is rather sensitive to changes in the number of influencers in ba graphs. (in er graphs only 20% of the examined runs resulted in the disappearance of opinion 0, even with 600 influencers.) in this section we examine the voter model on a random graph which has a geometric structure on the plane. since the graph model is not dynamic, nodes can only choose their type randomly (or according to some deterministic strategy related to the position of the nodes in the plane). however, firstly we study the model without multiple types, with constant c = 0.5, 0.1. the voter model is rather time-consuming, and even in the case of parameter c = 0.5 the probability of conviction q(x, y) = e^(−c·d(x,y)) for d(x, y) = 10 is already only ≈ 0.0067. thus we create a reduced graph from rp (n) by erasing the edges with d(x, y) > 10.
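the construction of the reduced graph described above can be sketched as follows (a naive o(n^2) sketch; the function name and the edge representation are illustrative choices):

```python
import math

def reduced_geometric_graph(points, cutoff=10.0):
    """Build the reduced graph from points in the plane: keep an edge (x, y)
    only if the euclidean distance d(x, y) <= cutoff. With c = 0.5 the
    conviction probability e^(-c*d) beyond d = 10 is at most
    e^(-5) ≈ 0.0067 and is treated as negligible."""
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d <= cutoff:
                edges.append((i, j, d))   # keep the distance for q(x, y)
    return edges
```

storing the distance with each kept edge lets the voter model compute q(x, y) = e^(−c·d(x,y)) without re-measuring distances at every step.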
we can assume that the results on the reduced graph approximate the outcome on the original one, since the transmission of opinions on the erased edges is negligible. the average degree in the reduced graph is still 27.85. modifying the voter model to spread opinions only on these edges makes the algorithm less time-consuming and manageable to run on graphs with many (n = 1000) nodes. firstly, we would like to understand the behaviour of the process without multiple node types. in this section we take advantage of the geometric structure of the graph, and examine different deterministic and random choices for the initial opinions l 0 . we study how these alternative options can influence the outcome (the probability of the disappearance of an opinion, the expected time needed for extinction). another interesting question is whether after a given number of iterations t of the voter model we can still observe any nice shape in the positions of the opinions. in each of the following four choices for the initial opinions, in expected value 10% of the individuals are given opinion 1, and the rest of them represent opinion 0. the discretized voter model with the different initial opinion vectors l 0 was performed on 400 different graphs for viter = 100 steps. with n = 1000 nodes, c = 0.5 and only 10% of the population representing opinion 1, opinion 1 disappeared only in a few (negligible) cases for any examined l 0 . according to figure 11, after 100 time steps the deterministic position of the initial opinions is still clearly recognizable (even after viter = 200 steps).
we can generally state that clustering individuals with the same opinion in a group makes the proportion of opinions more stable in the process: from the different runs we observed that the proportion of opinion 1 (from the initial 0.1) stayed within [0.8, 1.2] times its initial value with a probability of more than 0.4 in the case of opinion 1 situated in a corner of the graph, while in any other case this was significantly lower (less than 0.3). with this placement of opinion 1, the average distances among individuals with opinion 1 were the smallest, however the average distances between the different groups of opinions were the largest among the examined cases, resulting in a moderate change of opinions. the number of individuals representing opinion 1 decreased below 50 only with probability 0.08, while placing opinion 1 in the center this probability is 0.1325. opinion 1 is the most likely to disappear (with probability 0.195), or to reduce to an insignificant amount, with random placement of opinion 1. however, the inverse extreme cases are also more likely to occur, since the probability of the proportion of opinion 1 exceeding 0.2 is outstandingly high in this scenario. moreover, despite the high probability of extinction, in expected value we get the highest proportion of opinion 1 after viter = 100 iterations of the voter model with the random initial configuration. we also examined random graphs on a plane with a random or deterministic type choice of the nodes corresponding to two different aspects as before. we set the type pairs to form the influencer model defined before. due to the fact that the average distances in this random graph model are significantly larger than in er and ba graphs of small-world property, a small proportion of influencers cannot convince the whole graph in most of the cases; not even setting all non-influencer individuals to the unstable type helps the problem. even with random influencer positions, the calculation of the average time needed to convince all nodes of the graph is challenging due to its time cost.
after increasing the number of influencers to 300, opinion 1 was able to reach all nodes of the graph within 400 time steps in half of the runs, however sometimes in only a relatively small number of iterations, suggesting that the exact position on the plane of the randomly chosen individuals does affect the process significantly.

references:
coexistence in preferential attachment networks
evolving voter model on dense random graphs
on the spread of viruses on the internet
a trust model for spreading gossip in social networks
random graphs. second edition
on critical vaccination coverage in multitype epidemics
graphs with specified degree distributions, simple epidemics, and local vaccination strategies
the noisy voter model on complex networks
coexistence results for some competition models
random graph dynamics
sir epidemics and vaccination on random graphs with clustering
random graphs and complex networks
preferential attachment graphs with co-existing types of different fitnesses
gergely röst: modelling the strategies for age specific vaccination scheduling during influenza pandemic outbreaks
interacting particle systems
daihai he: preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak

key: cord-024437-r5wnz7rq authors: wang, yubin; zhang, zhenyu; liu, tingwen; guo, li title: slgat: soft labels guided graph attention networks date: 2020-04-17 journal: advances in knowledge discovery and data mining doi: 10.1007/978-3-030-47426-3_40 sha: doc_id: 24437 cord_uid: r5wnz7rq

graph convolutional neural networks have been widely studied for semi-supervised classification on graph-structured data in recent years. they usually learn node representations by transforming, propagating and aggregating node features and minimizing the prediction loss on labeled nodes. however, the pseudo labels generated on unlabeled nodes are usually overlooked during the learning process.
in this paper, we propose a soft labels guided graph attention network (slgat) to improve the performance of node representation learning by leveraging generated pseudo labels. unlike the prior graph attention networks, our slgat uses soft labels as guidance to learn different weights for neighboring nodes, which allows slgat to pay more attention to the features closely related to the central node labels during the feature aggregation process. we further propose a self-training based optimization method to train slgat on both labeled and pseudo labeled nodes. specifically, we first pre-train slgat on labeled nodes and generate pseudo labels for unlabeled nodes. next, for each iteration, we train slgat on the combination of labeled and pseudo labeled nodes, and then generate new pseudo labels for further training. experimental results on semi-supervised node classification show that slgat achieves state-of-the-art performance. in recent years, graph convolutional neural networks (gcns) [26] , which can learn from graph-structured data, have attracted much attention. the general approach with gcns is to learn node representations by passing, transforming, and aggregating node features across the graph. the generated node representations can then be used as input to a prediction layer for various downstream tasks, such as node classification [12] , graph classification [30] , link prediction [17] and social recommendation [19] . graph attention networks (gat) [23] , which is one of the most representative gcns, learns the weights for neighborhood aggregation via a self-attention mechanism [22] and achieves promising performance on the semi-supervised node classification problem. the model is expected to learn to pay more attention to the important neighbors. it calculates importance scores between connected nodes based solely on the node representations. however, the label information of nodes is usually overlooked.
besides, the cluster assumption [3] for semi-supervised learning states that the decision boundary should lie in regions of low density. it means that aggregating the features from nodes with different classes could reduce the generalization performance of the model. this motivates us to introduce label information to improve the performance of node classification in the following two aspects: (1) we introduce soft labels to guide the feature aggregation for generating discriminative node embeddings for classification. (2) we use slgat to predict pseudo labels for unlabeled nodes and further train slgat on the composition of labeled and pseudo labeled nodes. in this way, slgat can benefit from unlabeled data. in this paper, we propose soft labels guided attention networks (slgat) for semi-supervised node representation learning. the learning process consists of two main steps. first, slgat aggregates the features of neighbors using convolutional networks and predicts soft labels for each node based on the learned embeddings. then, it uses the soft labels to guide the feature aggregation via an attention mechanism. unlike the prior graph attention networks, slgat allows paying more attention to the features closely related to the central node labels. the weights for neighborhood aggregation are learned by a feedforward neural network based on both the label information of central nodes and the features of neighboring nodes, which can lead to learning more discriminative node representations for classification. we further propose a self-training based optimization method to improve the generalization performance of slgat using unlabeled data. specifically, we first pre-train slgat on labeled nodes using the standard cross-entropy loss. then we generate pseudo labels for unlabeled nodes using slgat. next, for each iteration, we train slgat using a combined cross-entropy loss on both labeled nodes and pseudo labeled nodes, and then generate new pseudo labels for further training.
in this way, slgat can benefit from unlabeled data by minimizing the entropy of predictions on unlabeled nodes. we conduct extensive experiments on semi-supervised node classification to evaluate our proposed model, and experimental results on several datasets show that slgat achieves state-of-the-art performance. the source code of this paper can be obtained from https://github.com/jadbin/slgat. graph-based semi-supervised learning. a large number of methods for semi-supervised learning using graph representations have been proposed in recent years, most of which can be divided into two categories: graph regularization-based methods and graph embedding-based methods. different graph regularization-based approaches can have different variants of the regularization term, and the graph laplacian regularizer is most commonly used in previous studies, including label propagation [32] , local and global consistency regularization [31] , manifold regularization [1] and deep semi-supervised embedding [25] . recently, graph embedding-based methods inspired by the skip-gram model [14] have attracted much attention. deepwalk [16] samples node sequences via uniform random walks on the network, and then learns embeddings via the prediction of the local neighborhood of nodes. afterward, a large number of works including line [21] and node2vec [8] extended deepwalk with more sophisticated random walk schemes. for such embedding-based methods, a two-step pipeline including embedding learning and semi-supervised training is required, where each step has to be optimized separately. planetoid [29] alleviates this by incorporating label information into the process of learning embeddings. graph convolutional neural networks. recently, graph convolutional neural networks (gcns) [26] have been successfully applied in many applications. existing gcns are often categorized as spectral methods and non-spectral methods. spectral methods define graph convolution based on the spectral graph theory.
the early studies [2, 10] developed convolution operations based on the graph fourier transformation. defferrard et al. [4] used polynomial spectral filters to reduce the computational cost. kipf & welling [12] then simplified the previous method by using a linear filter to operate on one-hop neighboring nodes. wu et al. [27] used graph wavelets to implement localized convolution. xu et al. [27] used a heat kernel to enhance low-frequency filters and enforce smoothness in the signal variation on the graph. along with spectral graph convolution, defining the graph convolution in the spatial domain was also investigated by many researchers. graphsage [9] performs various aggregators such as mean-pooling over a fixed-size neighborhood of each node. monti et al. [15] provided a unified framework that generalized various gcns. graphsgan [5] generates fake samples and trains generator-classifier networks in the adversarial learning setting. instead of fixed weights for aggregation, graph attention networks (gat) [23] adopts attention mechanisms to learn the relative weights between two connected nodes. wang et al. [24] generalized gat to learn representations of heterogeneous networks using meta-paths. the shortest path graph attention network (spagan) [28] explores high-order path-based attentions. our method is based on spatial graph convolution. unlike the existing graph attention networks, we introduce soft labels to guide the feature aggregation of neighboring nodes, and experiments show that this can further improve the semi-supervised classification performance. in this paper, we focus on the problem of semi-supervised node classification. many other applications can be reformulated into this fundamental problem. let g = (v, e) be a graph, in which v is a set of nodes and e is a set of edges. each node u ∈ v has an attribute vector x u .
given a few labeled nodes v l ⊂ v , where each node u ∈ v l is associated with a label y u ∈ y , the goal is to predict the labels for the remaining unlabeled nodes. in this section, we will give more details of slgat. the overall structure of slgat is shown in fig. 1 . the learning process of our method consists of two main steps. we first use a multi-layer graph convolution network to generate soft labels for each node based on the node features. we then leverage the soft labels to guide the feature aggregation via an attention mechanism to learn better representations of nodes. furthermore, we develop a self-training based optimization method to train slgat on the combination of labeled nodes and pseudo labeled nodes. this enforces that slgat can further benefit from the unlabeled data under the semi-supervised learning setting. in the initial phase, we need to first predict the pseudo labels for each node based on the node features x. the pseudo labels can be soft (a continuous distribution) or hard (a one-hot distribution). in practice, we observe that soft labels are usually more stable than hard labels, especially when the model has low prediction accuracy. since the labels predicted by the model are not absolutely correct, the error from hard labels may propagate to the inference on other labels and hurt the performance, while using soft labels can alleviate this problem. we use a multi-layer graph convolutional network [12] to aggregate the features of neighboring nodes. the layer-wise propagation rule of feature convolution is as follows:

f^(l+1) = σ( d^(−1/2) ã d^(−1/2) f^(l) w^(l) )    (1)

here, ã = a + i is the adjacency matrix with added self-connections, i is the identity matrix, d is the diagonal degree matrix of ã, and w^(l) is a layer-specific trainable transformation matrix. σ(·) denotes an activation function such as relu. f^(l) ∈ r^(|v|×d^(l)) denotes the hidden representations of nodes in the l-th layer. the representations of nodes f^(l+1) are obtained by aggregating information from the features of their neighborhoods f^(l). initially, f^(0) = x.
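the feature-convolution rule above can be sketched in a few lines of numpy (a minimal dense sketch of the standard rule from [12], not the authors' implementation; the function name is illustrative):

```python
import numpy as np

def gcn_layer(a, f, w, activation=lambda x: np.maximum(0, x)):
    """One feature-convolution layer:
    f_next = sigma( D^(-1/2) (A + I) D^(-1/2) f W ),
    where A is the adjacency matrix, f the node features for this layer,
    and W the layer's trainable transformation matrix."""
    a_hat = a + np.eye(a.shape[0])                    # add self-connections
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return activation(d_inv_sqrt @ a_hat @ d_inv_sqrt @ f @ w)
```

stacking l such layers and applying a row-wise softmax to the final output yields the soft labels of eq. 2.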
after going through l layers of feature convolution, we predict the soft labels for each node u based on the output embeddings of the nodes:

ŷ_u = softmax( f_u^(l) )    (2)

now we will present how to leverage the previously generated soft labels for each node to guide the feature aggregation via an attention mechanism. the attention network consists of several stacked layers. in each layer, we first aggregate the label information of neighboring nodes. then we learn the weights for neighborhood aggregation based on both the aggregated label information of central nodes and the feature embeddings of neighboring nodes. we use a label convolution unit to aggregate the label information of neighboring nodes, and the layer-wise propagation rule is as follows:

g^(l+1) = σ( d^(−1/2) ã d^(−1/2) g^(l) w_g^(l) )    (3)

where w_g is a layer-specific trainable transformation matrix, and g^(l) ∈ r^(|v|×d_g^(l)) denotes the hidden representations of the label information of nodes. the label information g^(l+1) is obtained by aggregating from the label information g^(l) of neighboring nodes. initially, g^(0) = softmax(f^(l)) according to eq. 2. then we use the aggregated label information to guide the feature aggregation via the attention mechanism. unlike the prior graph attention networks [23, 28] , we use label information as guidance to learn the weights of neighboring nodes for feature aggregation. we enforce the model to pay more attention to the features closely related to the labels of the central nodes. a single-layer feedforward neural network is applied to calculate the attention scores between connected nodes based on the central node label information g^(l+1) and the neighboring node features h^(l):

s_ij = a⊤ ( w_1 g_i^(l+1) || w_2 h_j^(l) )    (4)

where a is a trainable attention vector, w_1 and w_2 are layer-specific trainable transformation matrices, h^(l) ∈ r^(|v|×d_h^(l)) denotes the hidden representations of node features, (·)⊤ represents transposition and || is the concatenation operation. then we obtain the attention weights by normalizing the attention scores with the softmax function:

α_ij = softmax_j(s_ij) = exp(s_ij) / Σ_{k∈n_i} exp(s_ik)    (5)

where n_i is the neighborhood of node i in the graph.
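the label-guided attention weights can be sketched as follows (a dense numpy sketch: the weight names and the all-pairs scoring are our own simplifications, since in the model the scores are only computed for connected node pairs and normalized over each node's neighborhood n_i):

```python
import numpy as np

def label_guided_attention(g, h, wa, w1, w2):
    """Sketch of the label-guided attention weights: the score for a pair
    (i, j) combines the central node's aggregated label information g_i with
    the neighbour's features h_j via wa^T (w1 g_i || w2 h_j), then the
    scores are normalized with a row-wise softmax."""
    zg, zh = g @ w1.T, h @ w2.T                 # project labels and features
    n = g.shape[0]
    scores = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            scores[i, j] = wa @ np.concatenate([zg[i], zh[j]])
    e = np.exp(scores - scores.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)     # each row sums to 1
```

row i of the result holds the coefficients α_ij used to aggregate the projected neighbour features in the next layer.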
then, the embedding of node i can be aggregated from the projected features of its neighbors with the corresponding attention coefficients as follows: h_i^(l+1) = σ(Σ_{j∈n_i} α_ij w_2 h_j^(l)). finally, we can achieve better predictions for the labels of each node u by replacing eq. 2 as follows: ŷ_u = softmax(f_u^(l) ⊕ h_u^(l)), where ⊕ is the mean-pooling aggregator. grandvalet & bengio [7] argued that adding an extra loss to minimize the entropy of predictions on unlabeled data can further improve the generalization performance for semi-supervised learning. thus we estimate pseudo labels for unlabeled nodes based on the learned node representations, and develop a self-training based optimization method to train slgat on both labeled and pseudo-labeled nodes. in this way, slgat can further benefit from the unlabeled data. for semi-supervised node classification, we can minimize the cross-entropy loss over all labeled nodes between the ground truth and the prediction: l_labeled = −Σ_{u∈v_l} Σ_{c=1}^{c} y_{uc} ln ŷ_{uc}, where c is the number of classes. to achieve training on the composition of labeled and unlabeled nodes, we first estimate the labels of unlabeled nodes using the learned node embeddings as follows: ỹ_u = softmax((f_u^(l) ⊕ h_u^(l)) / τ), where τ is an annealing parameter. we can set τ to a small value (e.g. 0.1) to further reduce the entropy of the pseudo labels. then the loss for minimizing the entropy of predictions on unlabeled data can be defined as (eq. 10): l_unlabeled = −Σ_{u∈v\v_l} Σ_{c=1}^{c} ỹ_{uc} ln ŷ_{uc}. the joint objective function is defined as a weighted linear combination of the loss on labeled nodes and unlabeled nodes (eq. 11): l = l_labeled + λ l_unlabeled, where λ is a weight balance factor. we give a self-training based method to train slgat, which is listed in algorithm 1. the inputs to the algorithm are both labeled and unlabeled nodes. we first use the labeled nodes to pre-train the model using the cross-entropy loss. then we use the model to generate pseudo labels on unlabeled nodes. afterward, we train the model by minimizing the combined cross-entropy loss on both labeled and unlabeled nodes. finally, we iteratively generate new pseudo labels and further train the model.
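the pseudo-labeling and joint loss described above can be sketched as follows; this is a toy numpy version under our own naming (the real model computes `logits` from the graph, and the exact form of slgat's released code may differ):

```python
import numpy as np

def softmax(z, tau=1.0):
    """row-wise softmax with an annealing temperature tau; a small tau
    sharpens the distribution, lowering the entropy of pseudo labels."""
    z = z / tau
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def joint_loss(logits, y_onehot, labeled_mask, tau=0.1, lam=1.0):
    """weighted combination of the labeled cross-entropy loss and the
    pseudo-label loss on unlabeled nodes (lam plays the role of lambda)."""
    p = softmax(logits)
    # cross-entropy on labeled nodes against ground-truth labels
    l_sup = -np.sum(y_onehot[labeled_mask] * np.log(p[labeled_mask] + 1e-12))
    # sharpened pseudo labels on unlabeled nodes (temperature tau)
    pseudo = softmax(logits[~labeled_mask], tau=tau)
    l_unsup = -np.sum(pseudo * np.log(p[~labeled_mask] + 1e-12))
    return l_sup + lam * l_unsup

np.random.seed(0)
logits = np.random.randn(6, 3)                   # 6 nodes, 3 classes
y = np.eye(3)[[0, 1, 2, 0, 0, 0]]                # one-hot labels
mask = np.array([True, True, True, False, False, False])
loss = joint_loss(logits, y, mask)
print(loss > 0)  # True
```

iterating "generate pseudo labels, then minimize this joint loss" reproduces the self-training loop of algorithm 1 in spirit.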
in this section, we evaluate our proposed slgat on the semi-supervised node classification task using several standard benchmarks. we also conduct an ablation study on slgat to investigate the contribution of various components to performance improvements. we follow existing studies [12, 23, 29] and use three standard citation network benchmark datasets for evaluation: cora, citeseer, and pubmed. in all these datasets, the nodes represent documents and edges are citation links. node features correspond to elements of a bag-of-words representation of a document. class labels correspond to research areas and each node has a class label. in each dataset, 20 nodes from each class are treated as labeled data. the statistics of the datasets are summarized in table 1. we compare against several traditional graph-based semi-supervised classification methods, including manifold regularization (manireg) [1], semi-supervised embedding (semiemb) [25], label propagation (lp) [32], graph embeddings (deepwalk) [16], iterative classification algorithm (ica) [13] and planetoid [29].

table 1. statistics of the datasets.

dataset    nodes   edges   features  classes  training  validation  test
cora       2,708   5,429   1,433     7        140       500         1,000
citeseer   3,327   4,732   3,703     6        120       500         1,000
pubmed     19,717  44,338  500       3        60        500         1,000

furthermore, since graph neural networks have proved effective for semi-supervised classification, we also compare with several state-of-the-art graph neural networks including chebynet [4], monet [15], graph convolutional networks (gcn) [12], graph attention networks (gat) [23], graph wavelet neural network (gwnn) [27], shortest path graph attention network (spagan) [28] and graph convolutional networks using heat kernel (graphheat) [27]. we train a two-layer slgat model for semi-supervised node classification and evaluate the performance using prediction accuracy. the partition of datasets is the same as in previous studies [12, 23, 29], with an additional validation set of 500 labeled samples to determine hyper-parameters.
weights are initialized following glorot and bengio [6]. we adopt the adam optimizer [11] for parameter optimization with an initial learning rate of 0.05 and weight decay of 0.0005. we set the hidden layer size of features to 32 for cora and citeseer and 16 for pubmed. we set the hidden layer size of soft labels to 16 for cora and citeseer and 8 for pubmed. we apply dropout [20] with p = 0.5 to both layers' inputs, as well as to the normalized attention coefficients. the proper setting of λ in eq. 11 affects the semi-supervised classification performance. if λ is too large, it disturbs training for labeled nodes, whereas if λ is too small, we cannot benefit from unlabeled data. in our experiments, we set λ = 1. we anticipate that the results can be further improved by using sophisticated scheduling strategies such as deterministic annealing [7], and we leave this as future work. furthermore, inspired by dropout [20], we ignore the loss in eq. 10 with p = 0.5 during training to prevent overfitting on pseudo-labeled nodes. we now validate the effectiveness of slgat on the semi-supervised node classification task. following the previous studies [12, 23, 29], we use the classification accuracy metric for quantitative evaluation. experimental results are summarized in table 2. we present the mean classification accuracy (with standard deviation) of our method over 100 runs, and we reuse the results already reported in [5, 12, 23, 27, 28] for the baselines. we can observe that our slgat achieves consistently better performance than all baselines. when directly compared to gat, slgat gains 1.0%, 2.3% and 3.2% improvements on cora, citeseer, and pubmed respectively. the performance gain comes from two aspects. first, slgat uses soft labels to guide the feature aggregation of neighboring nodes. this indeed leads to more discriminative node representations. second, slgat is trained on both labeled and pseudo-labeled nodes using our proposed self-training based optimization method.
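the loss-dropping trick described above (randomly ignoring the unlabeled loss term of eq. 10 with probability p = 0.5) can be sketched as follows; the function and constants are our own illustration, not the authors' released code.

```python
import random

def combined_loss(l_sup, l_unsup, lam=1.0, p_drop=0.5):
    """dropout-inspired masking of the unlabeled loss term: with
    probability p_drop the pseudo-label loss is ignored for this step,
    which helps prevent overfitting on pseudo-labeled nodes."""
    keep = 0.0 if random.random() < p_drop else 1.0
    return l_sup + lam * keep * l_unsup

random.seed(0)
losses = [combined_loss(1.0, 0.5) for _ in range(1000)]
# roughly half the steps should use only the supervised loss
frac_dropped = sum(1 for l in losses if l == 1.0) / len(losses)
print(0.35 < frac_dropped < 0.65)  # True
```

in a real training loop the mask would be resampled once per optimization step, before backpropagation.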
slgat benefits from unlabeled data by minimizing the entropy of predictions on unlabeled nodes. following shchur et al. [18], we also further validate the effectiveness and robustness of slgat on random data splits. we created 10 random splits of cora, citeseer, and pubmed with the same sizes of training, validation, and test sets as the standard split from yang et al. [29]. we compare slgat with the most closely related competitive baselines, gcn [12] and gat [23], on those random data splits. 1 we run each method with 10 random seeds on each data split and report the overall mean accuracy in table 3. we can observe that slgat consistently outperforms gcn and gat on all datasets. this proves the effectiveness and robustness of slgat. in this section, we conduct an ablation study to investigate the effectiveness of our proposed soft-label-guided attention mechanism and the self-training based optimization method for slgat. we compare several variants of slgat on node classification, and the results are reported in table 4. we observe that slgat has better performance than the methods without soft-label-guided attention in most cases. this demonstrates that using soft labels to guide the aggregation over neighboring nodes is effective for generating better node embeddings. note that the attention mechanism seems to have little contribution to performance on pubmed when using self-training. the reason behind this phenomenon is still under investigation; we presume that it is due to the label sparsity of pubmed. 2 a similar phenomenon is reported in [23], where gat shows little improvement on pubmed compared to gcn. we also observe that slgat significantly outperforms all the methods without self-training. this indicates that our proposed self-training based optimization method is effective in improving the generalization performance of the model for semi-supervised classification. in this work, we propose slgat for semi-supervised node representation learning.
slgat uses soft labels to guide the feature aggregation of neighboring nodes for generating discriminative node representations. a self-training based optimization method is proposed to train slgat on both labeled data and pseudo-labeled data, which is effective in improving the generalization performance of slgat. experimental results demonstrate that our slgat achieves state-of-the-art performance on several semi-supervised node classification benchmarks. one direction for future work is to make slgat deeper to capture the features of long-range neighbors, which may help improve performance on datasets with sparse labels.

references
manifold regularization: a geometric framework for learning from labeled and unlabeled examples
spectral networks and locally connected networks on graphs
cluster kernels for semi-supervised learning
convolutional neural networks on graphs with fast localized spectral filtering
semi-supervised learning on graphs with generative adversarial nets
understanding the difficulty of training deep feedforward neural networks
semi-supervised learning by entropy minimization
node2vec: scalable feature learning for networks
inductive representation learning on large graphs
deep convolutional networks on graph-structured data
adam: a method for stochastic optimization
semi-supervised classification with graph convolutional networks
link-based classification
efficient estimation of word representations in vector space
geometric deep learning on graphs and manifolds using mixture model cnns
deepwalk: online learning of social representations
schnet: a continuous-filter convolutional neural network for modeling quantum interactions
pitfalls of graph neural network evaluation
deep collaborative filtering with multi-aspect information in heterogeneous networks
dropout: a simple way to prevent neural networks from overfitting
line: large-scale information network embedding
attention is all you need
graph attention networks
heterogeneous graph attention network
deep learning via semi-supervised embedding
a comprehensive survey on graph neural networks
graph wavelet neural network. in: international conference on learning representations (iclr)
spagan: shortest path graph attention network
revisiting semi-supervised learning with graph embeddings
an end-to-end deep learning architecture for graph classification
learning with local and global consistency
semi-supervised learning using gaussian fields and harmonic functions

key: cord-322890-w78tftva authors: suran, jantra ngosuwan; wyre, nicole rene title: imaging findings in 14 domestic ferrets (mustela putorius furo) with lymphoma date: 2013-06-06 journal: vet radiol ultrasound doi: 10.1111/vru.12068 sha: doc_id: 322890 cord_uid: w78tftva

lymphoma is the most common malignant neoplasia in domestic ferrets, mustela putorius furo. however, imaging findings in ferrets with lymphoma have primarily been described in single case reports. the purpose of this retrospective study was to describe imaging findings in a group of ferrets with confirmed lymphoma. medical records were searched between 2002 and 2012. a total of 14 ferrets were included. radiographs (n = 12), ultrasound (n = 14), computed tomography (ct; n = 1), and magnetic resonance imaging (mri; n = 1) images were available for review. median age at the time of diagnosis was 5.2 years (range 3.25-7.6 years). clinical signs were predominantly nonspecific (8/14). the time between the first imaging study and lymphoma diagnosis was 1 day or less in most ferrets (12). imaging lesions were predominantly detected in the abdomen, and most frequently included intra-abdominal lymphadenopathy (12/14), splenomegaly (8/14), and peritoneal effusion (11/14). lymphadenopathy and mass lesions were typically hypoechoic on ultrasound. mild peritoneal effusion was the only detected abnormality in two ferrets.
mild pleural effusion was the most common thoracic abnormality (3/12). expansile lytic lesions were present in the vertebrae of two ferrets with t3-l3 myelopathy and the femur in a ferret with lameness. hyperattenuating, enhancing masses with secondary spinal cord compression were associated with vertebral lysis in ct images of one ferret. the mri study in one ferret with myelopathy was inconclusive. findings indicated that imaging characteristics of lymphoma in ferrets are similar to those previously reported in dogs, cats, and humans. lymphoma is the most common malignant neoplasia in domestic ferrets, mustela putorius furo. following insulinoma and adrenocortical neoplasia, it is the third most common neoplasia of domestic ferrets overall. [1] [2] [3] [4] in ferrets, lymphoma can be classified based on tissue involvement, including multicentric, mediastinal, gastrointestinal, cutaneous, and extranodal. [1] [2] [3] the presentation and organ distribution of lymphoma have been associated with the age of onset. 5, 6 mediastinal lymphoma is more prevalent in young ferrets, particularly those less than 1 year of age. these ferrets tend to have an acute presentation and may present with dyspnea. ferrets with mediastinal lymphoma may also have multicentric involvement. 2, 5, 6 ferrets 3 years of age and greater have a variable presentation, with multicentric disease being more prevalent. clinical signs in older ferrets may be chronic and nonspecific depending on organ involvement. some ferrets may have intermittent signs over several months, while others may be asymptomatic, with lymphoma being diagnosed incidentally, detected either during routine physical examination or during evaluation of comorbidities. [1] [2] [3] despite lymphoma being common in domestic ferrets and the use of radiography and ultrasonography being touted as part of the minimum database in the diagnosis of lymphoma, 1, 7 imaging findings in ferrets with lymphoma have been limited to a few case reports.
1, 2, 5, 6, [8] [9] [10] [11] the goal of this retrospective study was to describe radiography, ultrasonography, computed tomography (ct), and magnetic resonance imaging (mri) findings in a series of ferrets with a confirmed diagnosis of lymphoma. medical records at the matthew j ryan veterinary hospital of the university of pennsylvania were searched for domestic ferrets with a diagnosis of lymphoma confirmed with cytology or histopathology that had radiography, ultrasonography, ct, or mri performed between january 2002 and april 2012. signalment, clinical signs, laboratory findings, and any prior or concurrent disease processes were recorded.

vet radiol ultrasound, vol. 54, no. 5, 2013, pp 522-531.

radiographs, ct, mri, static ultrasound images, and, when available, ultrasound cine loops were retrospectively evaluated, and abnormal findings were recorded by j.n.s. the imaging reports generated by a board-certified veterinary radiologist at the time the study was performed were also reviewed. ferrets were excluded if the imaging studies were unavailable for review. any imaging studies obtained after the reported start of clinical signs until a diagnosis was achieved were included. in addition to the reported ultrasonographic findings, the maximal splenic and lymph node thicknesses were measured from the available images if they were not reported. as splenomegaly is common in ferrets, most frequently due to extramedullary hematopoiesis, 2 splenomegaly was subjectively graded as within incidental variation ("incidental splenomegaly") and larger than expected for "incidental splenomegaly." those with subjectively normal spleens or "incidental splenomegaly" were referred to as normal for the purposes of this paper, unless otherwise specified. retrieved ct images were reconstructed with a high-frequency high-resolution algorithm (bone algorithm with edge enhancement) in 1.25 mm slice thickness. the maximal lymph node thickness was measured on precontrast images.
fourteen ferrets met the inclusion criteria. lymphoma was diagnosed from ultrasound-guided aspirates, surgical biopsies, and/or necropsy; three ferrets had two diagnostic procedures performed. ultrasound-guided aspirate cytology was performed in nine ferrets, surgical biopsy in three, and necropsy in five. both aspirates and biopsy were performed in two ferrets, and both aspirates and necropsy in one ferret. one ferret in this study was previously described. 8 the median age at the time of lymphoma diagnosis was 5.2 years (range 3.25-7.6 years). eight of the ferrets were neutered males, and six were spayed females. prior disease histories included adrenal disease (n = 5), cardiovascular disease (5), cutaneous mast cell tumors (4), diarrhea (3), insulinoma (2), cataracts (2), and one each of granulomatous lymphadenitis secondary to mycobacteriosis, renal insufficiency, and a chronic pelvic limb abscess. cardiovascular disease included second degree atrioventricular block (3), systemic hypertension (1), hypertrophic obstructive cardiomyopathy (1), and in one individual both aortic insufficiency and arteriosclerosis. for most ferrets, the time between the first imaging study and a diagnosis of lymphoma was 1 day or less (12/14). in one ferret each, the time between the initial imaging study and final diagnosis of lymphoma was 7 days and 6.9 months. the duration of clinical signs prior to reaching the diagnosis of lymphoma ranged from less than 1 day to 8 months, with a mode of less than 1 day and a median of 6 days. clinical signs included lethargy (n = 9), diarrhea (8), inappetence (7), weight loss (6), ataxia (4), lameness (1), and vomiting (1). diarrhea was chronic in three ferrets and consistent with melena in two. two ferrets did not have overt clinical signs.
physical exam findings included palpable abdominal masses (8), generalized splenomegaly (5), palpable splenic nodules or splenic masses (3), dehydration (5), paraparesis and ataxia (4), abdominal pain (4), hypotension (3), lumbar pain (2), abdominal effusion (1), inguinal and popliteal lymphadenopathy (1), pyrexia (1), urinary and fecal incontinence (1), and a right femoral mass (1). paraparesis and ataxia were attributed to t3-l3 myelopathy in three ferrets and hypoglycemia in one. one ferret also presented with ptyalism and tremors, which resolved with dextrose administration. blood analyses, including a complete blood count and chemistry profile, were performed in 12 of the 14 ferrets. blood glucose evaluation alone was performed in one ferret. abnormalities included azotemia (5), elevated liver enzymes (5), nonregenerative anemia (4), hypoglycemia (4), hypoalbuminemia (3), lymphocytosis (2), hyperglobulinemia (2), hypercalcemia (1), and elevated total bilirubin (1). two of the four ferrets with hypoglycemia had a previous diagnosis of insulinoma. radiographs were performed in 12/14 ferrets. each of these 12 studies included the thorax and abdomen: six studies included a right or left lateral projection and a ventrodorsal projection, and six included left lateral, right lateral, and ventrodorsal projections. one ferret's radiographs were in an analog format (screen-film), and the remaining 11 were digital (rapidstudy, eklin medical systems inc., santa clara, ca). ultrasound was performed in 14/14 ferrets using a 13-15 mhz linear transducer (ge medical logiq 9 ultrasound imaging system, general electric medical systems, milwaukee, wi). abdominal ultrasonography was performed in 13/14 ferrets. ultrasonography of a rib mass and a femoral mass was performed in one ferret each. abdominal ultrasonography was performed twice in one ferret after the start of the reported clinical signs, but prior to the diagnosis of lymphoma.
ultrasounds were performed by a board-certified radiologist or a radiology resident under direct supervision from a board-certified radiologist. in the one ferret with two ultrasounds performed prior to diagnosis of lymphoma, the later scan was used for measurements. computed tomography of the thorax and abdomen was performed in one ferret. the ferret was scanned under general anesthesia in dorsal recumbency using a 16-slice multidetector ct unit (ge brightspeed, general electric company, milwaukee, wi) in medium-frequency soft tissue algorithms (2.5 mm slice thickness, 1.375 pitch) before and immediately after iv administration of nonionic iodinated contrast (iohexol, 350 mgi/ml, dosage 770 mgi/kg, [omnipaque, ge healthcare, inc., princeton, nj]). contrast was manually injected through an iv catheter preplaced in a cephalic vein. one ferret underwent mri evaluation of the lumbar and sacral spine. magnetic resonance imaging was performed using a 1.5 t mri unit (ge medical system, milwaukee, wi) with the patient in dorsal recumbency under general anesthesia. image sequences included t2 weighted (t2w) image series in a sagittal and transverse plane, t2w fat-saturated images in a sagittal and transverse plane, and a sagittal single-shot fast spin echo. additional sequences including t1 weighted (t1w) images and administration of gadolinium were not performed due to anesthetic concerns for the patient. radiographic and ultrasonographic findings are summarized in table 1. radiographic abnormalities were predominantly noted in the abdomen. decreased abdominal serosal detail was present in 10 of the 12 ferrets with radiographs. this was interpreted to be potentially due to a poor body condition in two ferrets. abdominal serosal detail was additionally mottled in seven ferrets. sonographically, peritoneal effusion was detected in 11/13 ferrets, and was considered mild in 7, moderate in 4, anechoic in 8, and echogenic in 3.
the spleen was considered enlarged in 7 out of 12 ferrets radiographically and in 8 out of 13 ferrets in which abdominal ultrasonography was performed. the remaining five ferrets were considered to have "incidental splenomegaly" on ultrasound and were radiographically considered normal (n = 4) and enlarged (1). of the eight spleens sonographically considered abnormal, multifocal hypoechoic splenic nodules were present in six ferrets (fig. 1); one of these six ferrets was considered to have a radiographically normal spleen. an isoechoic to hypoechoic mass with central hypoechoic nodules was present in one ferret. the spleen had a mottled echotexture in one ferret. three ferrets were sedated with butorphanol (torbugesic, fort dodge animal health, fort dodge, ia) and midazolam (hospira inc., lake forest, il) prior to abdominal ultrasonography; sedation was not performed in any ferret prior to radiographs. all three sedated ferrets had enlarged spleens, with one spleen having a mottled echotexture and the other two spleens having multifocal hypoechoic nodules. splenic cytology or histopathology was not performed in any of the ferrets that received sedation for ultrasound. cytology or histopathology was available in seven ferrets: aspirates were performed in 2/7, necropsy in 4/7, and both aspirates and necropsy in 1/7. lymphoma was confirmed in three out of six ferrets with splenic nodules, the one ferret with a splenic mass, and one out of five ferrets with "incidental splenomegaly." in three of the five ferrets with "incidental splenomegaly," marked splenic extramedullary hematopoiesis with splenic congestion (1) and without concurrent splenic congestion (1) was diagnosed. the other seven ferrets did not have cytologic evaluation of the spleen.
splenic thickness in ferrets where the spleen was considered within incidental variation ranged from 8.0 to 16.2 mm (median 15.0 mm, mean 13.4 mm, standard deviation ±3.5 mm; n = 5), while in ferrets with splenomegaly splenic thickness ranged from 14.2 to 33.1 mm (median 19.9 mm, mean 20.9 mm, standard deviation ±6.4 mm; n = 8). single or multiple, round to oblong, soft-tissue opaque abdominal masses consistent with enlarged lymph nodes were visible radiographically in 7/12 ferrets (fig. 2). one of these ferrets had a large cranial abdominal mass, which was subsequently confirmed to be a markedly enlarged pancreatic lymph node. one ferret had splenic lymphadenopathy detected on the radiographs retrospectively after evaluation of the sonographic findings. lymphadenopathy was reported in 11 of the 13 ferrets with abdominal ultrasonography and the one ferret in which whole body ct was performed. sonographically, abnormal lymph nodes were hypoechoic, rounded, variably enlarged, and surrounded by a hyperechoic rim (fig. 3). some lymph nodes had patchy hyperechoic regions within or a reticular, nodular appearance.

fig. 3. ultrasound image of an enlarged hepatic lymph node in the same ferret as fig. 2. the hepatic lymph node is markedly enlarged and lobular. although the node is predominantly hypoechoic, there are patchy hyperechoic regions and smaller, round, hypoechoic, nodule-like regions within (arrow). the surrounding fat is hyperechoic, producing a halo around the lymph node (arrow heads). scale at the top of the image is 10 mm between the major ticks.

abdominal lymph nodes involved included the mesenteric (n = 10), hepatic (7), sublumbar (5), splenic (4), gastric (3), gastroduodenal (3), colonic (3), pancreatic (2), ileocolic (1), renal (1), and inguinal (1) lymph nodes. in one ferret, other lymph nodes were reported to be involved in addition to the mesenteric and sublumbar nodes, but were not specifically identified.
nine of the 11 ferrets with abdominal ultrasound and the one ferret with ct had involvement of 2 or more lymph nodes reported; 2/11 ferrets had only one reportedly abnormal lymph node (1 splenic lymph node, 1 mesenteric lymph node). lymph nodes measured from 4.8 to 29.5 mm thick (median 8.5 mm, mean 11.6 mm, standard deviation ±8.1 mm). the lymph node thickness in ferrets with radiographically evident lymphadenopathy ranged from 6.2 mm to 29.5 mm, with a median of 9.7 mm, a mean of 14.5 mm, and a standard deviation of ±9.7 mm (n = 7). in ferrets in which lymphadenopathy was detected sonographically but not radiographically, lymph nodes measured 4.8 mm, 5.2 mm, and 11.7 mm thick (n = 3). lymph node cytology or histopathology was available in seven ferrets: aspirates were performed in 3/7 ferrets, both aspirates and surgical biopsy in 1/7, and necropsy in 3/7. it was not clear if lymph nodes were histopathologically assessed in 2/5 ferrets in which a necropsy was performed. lymphoma was confirmed in 6/11 ferrets with sonographically abnormal lymph nodes (range 5.2-29.5 mm thick, median 14.8 mm, mean ± standard deviation 17.2 ± 9.3 mm). in 1/11 ferrets, lymphoid hyperplasia was identified postmortem (6.2 mm thick). cytologic evaluation of lymph nodes was not performed in the remaining ferrets. in ferrets with lymphadenopathy, eight had concurrent splenomegaly with splenic nodules (6), a splenic mass (1), or a mottled splenic echotexture (1).

fig. 4. ultrasound image of the gastric antrum. the wall of the gastric antrum, especially the muscularis layer, is circumferentially thickened (arrows). a small amount of fluid and gas (*) is present in the lumen.

lymphoma was identified in the liver of 2/14 ferrets by surgical biopsy (1) and at necropsy (1). mild hepatomegaly with a normal echotexture was noted on ultrasound in the ferret with lymphoma diagnosed by biopsy. additional findings in this ferret included mild peritoneal effusion and lymphadenopathy.
the ferret with hepatic lymphoma identified at necropsy had no radiographic or reported sonographic liver abnormalities. this ferret also had hepatic lipidosis. imaging findings in this ferret included an aggressive vertebral lesion, splenomegaly with splenic nodules, and lymphadenopathy. lymphoma was confirmed in each of these organs, as well as in the pancreas, which was reportedly normal on ultrasound. in addition to these two ferrets, hepatic histopathology from necropsy was available in four other ferrets. hepatic lipidosis was identified in each of these four ferrets; one ferret also had extramedullary hematopoiesis. none of these four ferrets had radiographic or reported sonographic abnormalities. cytology or histopathology was not available in the remaining 8/14 ferrets. of these, 1/8 had no radiographic or sonographic abnormalities noted. mild hepatomegaly was noted radiographically in 5/8 ferrets; however, sonographic hepatic changes were not noted in these ferrets and cytologic evaluation was not performed. on ultrasound, one ferret had moderate hepatomegaly with a hypoechoic, mottled echotexture (radiography was not performed in this ferret). additionally, one ferret had two hypoechoic cystic masses, one mass of which had central mineralization, detected with ultrasound (radiography was not performed in this ferret). gastrointestinal lymphoma was confirmed at necropsy in two ferrets. in one ferret, thickening of the gastric antrum up to 4.5 mm and blurring of wall layering was identified sonographically (fig. 4). (to the authors' knowledge, the normal gross or sonographic wall thickness of the gastrointestinal tract in ferrets has not been previously reported.) no abnormalities were noted in the small or large intestines.

fig. 5. ultrasound image of the right kidney. there is a round, hypoechoic nodule in the parenchyma, which bulges from the renal contour. the calipers (+) denote the renal length (1) and the margins of the nodule (2, 3).
at necropsy, lymphoma was identified in the stomach and small intestines, in addition to a chronic ulcerative gastroenteritis. in the second ferret, aside from poor abdominal serosal detail attributable to poor body condition and mild anechoic peritoneal effusion, there were no other radiographic or sonographic abnormalities. in this ferret, lymphoma was identified postmortem in the descending colon. the colon was discolored, but there was no reported gross colonic wall thickening. renal masses were present in two ferrets. in each ferret, a single renal mass was detected with ultrasound and was well defined, hypoechoic, centered on the cortex, and protruding from the kidney. the mass was right-sided, rounded, and measured 8.6 mm in diameter in one of these two ferrets. in the second ferret, the mass was left-sided, lobular, and measured up to 17.9 mm in diameter. cytology of the right renal mass in the first ferret was not performed; however, the patient received chemotherapy for the treatment of lymphoma, confirmed from aspiration and biopsy of an enlarged lymph node, and the mass was seen to decrease in size during follow-up studies (fig. 5). although the masses in both ferrets had similar ultrasound characteristics, the renal mass in the second ferret was diagnosed as a spindle cell sarcoma. additionally, that mass had been identified sonographically 1 year prior to the diagnosis of lymphoma, was progressively increasing in size, and did not decrease in size following administration of chemotherapy for the treatment of lymphoma. concurrent sonographic changes in both ferrets included lymphadenopathy and peritoneal effusion. a large lobular retroperitoneal mass was present in one ferret. the mass was on midline, extending into the right and left sides of the retroperitoneal space, laterally displacing the right kidney. the left kidney was not visualized radiographically.
in the right cranial retroperitoneal space, cranial to the right kidney, there was a cluster of heterogeneous mineral opacities in an adjacent second, smaller mass. sonographically, the large retroperitoneal mass was heterogeneous, hypoechoic with patchy hyperechoic regions, and laterally displaced both kidneys. the smaller mineralized mass was confirmed to be an enlarged right adrenal gland sonographically. at necropsy, the retroperitoneal mass was confirmed to be lymphoma; however, a specific tissue of origin was not determined. as a normal left adrenal gland could not be identified sonographically or at postmortem, an adrenal origin for this mass was considered most likely, although adrenal tissue was not identified histopathologically within the mass. alternatively, the mass may have arisen from a retroperitoneal lymph node or retroperitoneal adnexa. concurrent abdominal imaging findings considered incidental to the diagnosed lymphoma included renal cysts (8), cystic lymph nodes (7), adrenomegaly in ferrets with diagnosed adrenal disease (5), and pancreatic nodules in ferrets diagnosed with insulinoma (2). on thoracic radiographs, pleural fissure lines, consistent with a small volume of pleural effusion, were present in 3/12 ferrets. pericardial and pleural effusions were noticed during abdominal ultrasonography in one ferret. possible sternal (2) and tracheobronchial lymphadenopathy (1) were seen radiographically. in one ferret, sternal and cranial mediastinal lymphadenopathy were detected with ct. an interstitial pulmonary pattern was present in two ferrets, but was potentially attributable to the radiographic projections being relatively expiratory. aggressive osseous lesions were detected radiographically in three ferrets. the one ferret with a history of lameness had a soft-tissue mass involving the entire right femur with marked, multifocal areas of geographic to moth-eaten, expansile lysis throughout.
smooth to mildly irregular periosteal reaction was present along the femoral diaphysis and greater trochanter. the adjacent acetabulum and ilium were questionably involved based on the radiographs. sonographically the soft-tissue components of the mass were homogeneously hypoechoic. cortical irregularities and disruption, consistent with lysis, were also present. histopathology of the mass following limb amputation was consistent with plasmablastic lymphoma. this ferret was previously described. 8 vertebral lysis was apparent radiographically in two of the three ferrets with t3-l3 myelopathy. in one of these two ferrets, there was geographic lysis of l1 involving the majority of the vertebral body and a pathologic fracture of the cranial end plate (fig. 6). other radiographic changes present in this ferret included splenomegaly and decreased abdominal serosal detail likely due to poor body condition. on ultrasound, peritoneal effusion, splenomegaly with splenic nodules, and lymphadenopathy were detected. at necropsy, intramedullary lymphoma was found in the l1 vertebra with epidural extension of the tumor. lymphoma was also found affecting the spleen, liver, pancreas, and mesenteric lymph nodes.

fig. 6 right lateral radiograph cropped and centered on l1. at l1 there is geographic lysis, including cortical thinning or loss, of the cranial two-thirds of the vertebral body and the cranial aspect of the pedicles (arrows). the cranial end plate of l1 has a concave indentation, presumptively secondary to a pathologic fracture (arrow head).

in the second ferret with vertebral lysis, there was geographic lysis of the cranial two-thirds of the body and pedicles of t14. there was also possible lysis of the cranial body and pedicles of l4. the dorsal half of the left 9th rib was lytic and no longer visible. associated with this rib, there was a large, ill-defined, soft-tissue mass, which extended into the thoracic cavity.
the adjacent ribs and vertebra were not appreciably involved. additional radiographic findings in this ferret included splenomegaly, hepatomegaly, and abdominal masses consistent with enlarged lymph nodes. with ct, expansile lysis of the left t14 vertebral body and pedicle was seen associated with a hyperattenuating (to muscle), strongly enhancing mass (fig. 7) . the mass occupied the ventral two-thirds of the spinal canal and resulted in severe spinal cord compression. a possible pathologic fracture was present in the cranial endplate. at l4, there was lysis of the left pedicle and body associated with a mildly compressive, hyperattenuating, contrast-enhancing mass. an additional, similar mass lesion was seen at t4, with lysis of the midvertebral body and mild spinal cord compression. the rib mass was isoattenuating (to muscle), heterogeneous, mildly enhancing, and resulted in severe, expansile lysis. cytology of the rib mass obtained by ultrasound-guided fine-needle aspiration was diagnostic for lymphoma; cytological assessment of the other lesions was not performed in this ferret. one ferret with t3-l2 myelopathy did not have gross skeletal pathology. radiographic changes included splenomegaly and poor, mottled serosal detail. sonographically, a mild peritoneal effusion was present, and the spleen was considered within incidental variation. an mri of the lumbar spine revealed an ill-defined area of suspect intramedullary t2w hyperintensity within the spinal cord at the level of l3. differential diagnoses for this lesion considered at the time included an artifact, prior infarct, gliosis, edema, myelitis, neoplasia, and hydromyelia. at the postmortem examination performed 5 months after the mri, lymphoma was detected in the brain, meninges, choroid plexus, spinal cord, and extracapsular accessory adrenal tissue. additionally there was multifocal spinal cord malacia and hemorrhage. 
the lesions in the spinal cord were identified in histopathologic samples obtained at intervals from the cervical spine at c1 through to the lumbar spine at l5, including at the level of l3. specific correlation between the suspect mri lesion and histopathologic findings was not performed. splenic changes were consistent with congestion and extramedullary hematopoiesis. multicentric lymphoma was the most common presentation in this study. this is consistent with prior reports in which multicentric lymphoma is the most common presentation in ferrets older than 3 years of age. 1, 5, 6, 12 the most common imaging findings in this study were intraabdominal lymphadenopathy and splenomegaly with mildto-moderate peritoneal effusion. lymphadenopathy consisted of multiple enlarged, predominantly intra-abdominal lymph nodes, particularly including the mesenteric lymph node. only one ferret had peripheral lymphadenopathy, consisting of enlargement of the inguinal and popliteal lymph nodes, in addition to abdominal lymphadenopathy. lymph nodes greater than 6.2 mm thick sonographically were generally appreciable radiographically as round to oblong, soft-tissue nodules or masses in their respective locations. of the three ferrets in which sonographically detected lymphadenopathy was not appreciable radiographically, only one had a lymph node thickness greater than 6.2 mm. that ferret also had a large, retroperitoneal mass that likely accounted for a lack of visualization of the enlarged mesenteric lymph node due to silhouetting and displacement. previous studies in normal ferrets using ultrasound have reported the normal thickness of mesenteric lymph nodes as 5.3 ± 1.39 mm and 7.6 ± 2.0 mm. 13, 14 given that some radiographically visible lymph nodes measured as small as 6.2 mm (which is within the reported normal ranges for mesenteric lymph nodes) it is possible that normal lymph nodes may be radiographically appreciable. 
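the size comparison above — whether a 6.2 mm lymph node falls within the published normal sonographic ranges for the mesenteric lymph node — can be made explicit. this is an illustrative sketch only, not part of the study's methods; the helper name and the use of one standard deviation as the window are assumptions, while the mean and SD values are the reported ones.

```python
# reported normal mesenteric lymph node thicknesses (mm): (mean, SD)
NORMAL_RANGES_MM = [(5.3, 1.39), (7.6, 2.0)]

def within_normal(thickness_mm, n_sd=1.0):
    """Return True if the measured thickness falls within mean +/- n_sd * SD
    of at least one of the reported normal ranges (hypothetical helper)."""
    return any(mean - n_sd * sd <= thickness_mm <= mean + n_sd * sd
               for mean, sd in NORMAL_RANGES_MM)

# a 6.2 mm node lies inside 5.3 +/- 1.39 mm, i.e. within a reported normal range
print(within_normal(6.2))
```

this is why a radiographically visible 6.2 mm node cannot, on size alone, be assumed pathologic.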
in the authors' experiences, however, visualization of normal, small abdominal lymph nodes on radiographs of ferrets is uncommon. normal abdominal lymph nodes are not radiographically distinguishable in dogs and cats. 15, 16 although some lymph nodes in this study that were considered abnormal measured within the reported normal ranges, there were other changes to those nodes to suggest pathology, such as hypoechogenicity. in dogs and cats, sonographic changes that have been associated with malignancy include an increase in maximal short and long axis diameter (enlarged), an increase in short-to-long axis length ratio (more rounded appearance), hypoechogenicity, hyperechoic perinodal fat with an irregular nodal contour, and heterogeneity. [17] [18] [19] similar to previous reports, the spleen was the most common extranodal site of neoplastic infiltration with lymphoma in the current study. 5 in a prior study splenomegaly was attributable to neoplastic infiltration in 67% of ferrets with lymphoma and extramedullary hematopoiesis in 33%. 5 in general, splenomegaly secondary to extramedullary hematopoiesis is common in ferrets. 2 to the authors' knowledge, there is no reference for normal splenic size in ferrets using ultrasound. grossly the normal spleen has been reported to measure 5.1 cm in length, 1.8 cm in width, and 0.8 cm thick. 20 given that these are gross measurements, however, they were likely obtained postmortem. splenic size is variable and decreases postmortem, so these measurements may not be translatable to antemortem studies with sonographic measurements. 21 the smallest splenic thickness in this study was 0.8 cm; using the gross measurement guidelines all spleens in this study would be considered enlarged. the degree of splenomegaly was therefore subjectively characterized as within incidental variation and larger than expected for "incidental splenomegaly" based on the authors' experiences. 
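the argument in the preceding paragraph — that applying the gross reference thickness of 0.8 cm would label every spleen in this cohort enlarged — can be stated as a one-line criterion. this is an illustrative sketch of that reasoning, not a validated guideline; the function name and the choice of an inclusive threshold are assumptions.

```python
GROSS_REFERENCE_THICKNESS_CM = 0.8  # reported normal gross splenic thickness

def enlarged_by_gross_guideline(thickness_cm):
    # by the gross guideline, any spleen at or above the reference thickness
    # would be called enlarged antemortem (likely overinclusive, as discussed)
    return thickness_cm >= GROSS_REFERENCE_THICKNESS_CM

# the smallest splenic thickness in this study was 0.8 cm, so every spleen
# in the cohort meets this criterion
print(all(enlarged_by_gross_guideline(t) for t in [0.8, 1.2, 1.5]))
```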
of the seven ferrets in which splenic cytology was available, lymphoma was confirmed in the four ferrets in which the spleen was considered abnormal and cytology was performed. of the three ferrets in which the spleen was considered within incidental variation and cytology was performed, lymphoma was identified in one, and extramedullary hematopoiesis was confirmed in the other two. potential differential diagnoses for multicentric lymphadenopathy with splenomegaly in ferrets include reactive lymphadenopathy secondary to gastrointestinal disease with splenic extramedullary hematopoiesis, systemic mycobacteriosis, granulomatous inflammatory syndrome, and aleutian disease. 1, [22] [23] [24] [25] with systemic mycobacteriosis, ferrets can have other lymph nodes affected in addition to the abdominal lymph nodes, with the retropharyngeal lymph nodes being affected as commonly as the mesenteric lymph nodes. 25 as with lymphoma, clinical signs of mycobacteriosis in the ferret depend on the organs that are affected and can include lethargy, anorexia, vomiting, and diarrhea, but as with other infectious diseases, changes in white blood cell counts can be seen. mycobacteriosis can be diagnosed on cytology and biopsy of the affected lymph node or organ. 25 granulomatous inflammatory syndrome is a newly recognized systemic disease associated with coronavirus that causes inflammation in the spleen and lymph nodes. 23 this syndrome results in a severe granulomatous disease that can affect the gastrointestinal tract, mesenteric lymph nodes, liver, and spleen. unlike lymphoma, it is usually seen in younger ferrets, but like lymphoma, clinical signs are nonspecific and depend on the organ that is affected. patients with this syndrome usually have polyclonal gammopathy that can also be seen with aleutian's disease virus and lymphoma. definitive diagnosis requires cytology or biopsy of the affected organs.
23 aleutian's disease is a parvovirus that can cause lymphadenopathy and splenomegaly. as with lymphoma and granulomatous inflammatory syndrome, aleutian's disease can cause a polyclonal gammopathy. ferrets with this virus usually present with generalized signs of illness (lethargy, weight loss) as well as neurologic signs such as paresis or tremors. 24 aspirates and biopsy samples of lymph nodes and the spleen can be difficult to interpret as the disease causes lymphoplasmacytic inflammation that can be easily confused with other diseases such as small cell lymphoma and epizootic catarrhal enteritis. 24 in the one ferret with colonic lymphoma, there were minimal imaging findings including poor abdominal serosal detail and mild peritoneal effusion. at postmortem, lymphoma with mucosal erosions was detected in the colon. segmental lymphoplasmacytic enteritis was identified in the small intestines. this ferret presented cachectic, hypotensive, and anemic, had melena, and died within 24 h of presentation. given that melena is referable to upper gastrointestinal bleeding, the clinical findings in this ferret could have been attributable to both helicobacter mustelae gastritis and lymphoma. it is also possible that lymphoma was present in other portions of the gastrointestinal tract, but was not detected postmortem. after the spleen, the next most common extranodal sites of neoplastic involvement with lymphoma in ferrets have been reported to be the liver, kidneys, and lungs. 5 in this study, two ferrets had confirmed hepatic infiltration: one of which had mild hepatomegaly on ultrasound (subjectively normal on radiographs) and the other of which had no reported hepatic abnormalities. hepatic lipidosis, identified in four ferrets, was not associated with radiographic or sonographic changes and may have been due to inappetence. 26 given these findings, ultrasound does not appear to be sensitive for the detection of hepatic lymphoma in ferrets.
sensitivity of ultrasound for hepatic lymphoma has also been reported to be low in dogs, cats, and humans. 27, 28 one of two ferrets with renal masses had probable renal lymphoma based on the response to treatment. the second renal mass, a confirmed renal sarcoma, was not sonographically differentiable from the presumptive renal lymphoma. pulmonary involvement was not identified in this study. the small number of individuals in this study precludes extensive comparisons of the affected organ distribution to prior studies. the most common thoracic finding in this study was mild pleural effusion, which was present in four ferrets. there were no ferrets with a mediastinal mass in this study. mediastinal involvement, in general, is more prevalent in ferrets less than 3 years of age, and has been reported to be the more common presentation of lymphoma in that age group with or without concurrent multicentric involvement. [1] [2] [3] ferrets with mediastinal lymphoma may present for tachypnea or dyspnea secondary to the space-occupying effect of a large mediastinal mass, as well as concurrent pleural effusion. no ferrets in this study were less than 3 years old, which may have accounted for the lack of mediastinal involvement in this cohort. additionally, although this institution also provides primary care to nontraditional small mammal species, it is also a tertiary care facility and the population of ferret patients may not have been representative of the general domestic ferret population. there may have been a selection bias for ferrets with more insidious signs, which tend to occur in ferrets greater than 3 years of age, as opposed to younger ferrets, which may have a more acute and more rapidly progressive presentation. aggressive osseous lesions were present in three ferrets with skeletal lymphoma involvement. to the authors' knowledge, only three other ferrets with osseous involvement have been described previously. 
2, 9 in those ferrets, lytic lesions were present in the tibia, in the lumbar spine, and in the lumbosacral spine. 2, 9 based on those ferrets and the ferrets in this study, it is possible that the lumbar spine is a predilection site for vertebral lymphoma; however, this remains speculative. an alternate possibility is that lysis may be relatively easier to detect in the lumbar spine where there is less superimposition of structures over the vertebrae, compared to the thoracic vertebrae where the ribs proximally are superimposed on the vertebrae. in humans with primary bone lymphoma, three radiographic patterns are described: the lytic-destructive pattern, which is predominantly lytic with or without a lamellated or interrupted periosteal reaction or cortical lysis; the blastic-sclerotic pattern in which there are mixed lytic and sclerotic regions; and "near-normal" findings in which there are only subtle radiographic changes and additional imaging (scintigraphic bone scans or mri) is required. 29 osseous lesions seen in the ferrets of this study and the prior reports are similar to the lytic-destructive pattern, and had cortical disruption. this is also the pattern most typically seen in canines and felines with osseous involvement from lymphoma or other round cell neoplasms. diffuse central nervous system infiltration with lymphoma was present in one ferret. as lymphoma outside of the central nervous system was only detected in accessory adrenal tissues, this ferret presumably had a primary central nervous system lymphoma. primary central nervous system lymphoma in dogs and cats has not been reported to have an extraparenchymal vs. an intraparenchymal predilection. 30 in humans, lesions with primary central nervous system lymphoma are most frequently intraparenchymal, and metastatic central nervous system lymphoma is more frequently extraparenchymal. 30, 31 in this ferret, the meninges and choroid plexus were involved in addition to the brain and spinal cord.
protracted clinical signs in that ferret consisted of variable paraparesis and lumbar pain over 8 months from the start of clinical signs to the final diagnosis of lymphoma at necropsy. gradual progression of signs and the protracted clinical signs suggest a relatively slow-growing process. prednisone, administered for palliative treatment, was started approximately 3 months after the initial clinical signs. radiographs and ultrasound, performed prior to starting prednisone, had minimal, nonspecific findings. magnetic resonance imaging, performed 1 month after initiation of the prednisone regimen and 5 months prior to the diagnosis of lymphoma, was inconclusive. administration of prednisone prior to the mri may have resulted in partial regression of lymphoma, therefore making it more difficult to identify; however, the ferret did not demonstrate improvement of the clinical signs, so whether prednisone affected detection of neoplastic infiltration or not is speculative. additionally, the mri was limited in that only t2w images were obtained. perhaps if additional sequences were performed, particularly t1w postcontrast images, or if a follow-up mri was performed at a later date, meningeal or parenchymal abnormalities may have been detected. also, because the necropsy was performed 5 months following mri, it is likely that the extent of the lesions seen postmortem had progressed compared to the time of imaging. magnetic resonance imaging lesions in dogs and cats with primary central nervous system lymphoma (compared to white matter) have been reported to be predominantly t2w hyperintense with indistinct margins, t1w hypointense, and contrast enhancing, with perilesional hyperintensity on flair consistent with perilesional edema and a mass effect. 30 in humans, lesions have similar signal characteristics (compared to white matter), being t2w hyperintense, t1w iso- to hypointense, and contrast enhancing.
30, 31 these findings are considered nonspecific in dogs, cats, and humans, and lesions may not be detected with mri at the onset of clinical signs. 30, 31 two ferrets had no clinical signs referable to lymphoma. in one ferret, the owner palpated a markedly enlarged abdominal lymph node. radiographic and sonographic findings consisted of multicentric lymphadenopathy, peritoneal effusion, a renal mass, a hyperechoic liver, and "incidental splenomegaly." in the other ferret, progressive lymphocytosis was detected during routine treatment and monitoring of adrenocortical disease. lymphoma was identified in the peritoneal effusion of that ferret. additional sonographic findings included a multicentric lymphadenopathy, splenomegaly with splenic nodules, a cystic hepatic mass, and a renal mass (sarcoma). adrenal disease was a common comorbidity seen with lymphoma, as found in other studies. 32 this is likely because adrenal disease is common in older ferrets in general. 1, 4, 33 other relatively common comorbidities found in this study were cardiovascular disease and cutaneous mast cell tumors, both of which also commonly occur in older ferrets. 1, 33 one ferret had a history of granulomatous lymphadenitis suspected to be secondary to mycobacteriosis. although this study describes the imaging findings in a small number of ferrets with lymphoma, it provides an important source of information for practicing clinicians. the small number of ferrets able to be included during the time frame of the study likely reflects that imaging is not performed in every ferret with suspected or confirmed lymphoma, and that a definitive diagnosis was not always attained prior to treatment in individuals with suggestive clinical and imaging findings. ultrasound-guided aspirates of lymph nodes, spleens, and aggressive osseous lesions performed in ferrets of this study were each diagnostic for or strongly suggestive of lymphoma. 
although aspirates are often the initial tissue sampling procedure, previous reports have cautioned against the use of lymph node aspirates in the diagnosis of lymphoma as inflammatory and reactive changes may be misinterpreted as lymphoma. 1, 3 this is particularly true of the gastric lymph node in ferrets with gastrointestinal signs. a false positive diagnosis of lymphoma is considered unlikely to have occurred in the ferrets included in this study. lack of a definitive diagnosis (i.e., false negative results) from aspirate samples likely resulted in exclusion of some individuals from this study. analysis of the frequency of misdiagnosis and nonconfirmatory aspirate samples in patients with lymphoma was not performed. this study was also limited in that histopathology was not performed on all organs in each individual, and therefore, whether or not the changes seen were each attributable to lymphoma cannot be confirmed. additionally, because ultrasound findings were based on the reports and images obtained, some structures were unable to be reassessed. this is particularly the case in which multiple lymph nodes were affected. images of each lymph node may not have been obtained, the imaging report may not have been complete in describing which nodes were affected, and measurement performed retrospectively on the available static images may not have reflected the actual maximal nodal thickness in that individual. in conclusion, findings from the current study indicated that imaging characteristics of lymphoma in ferrets are similar to those previously reported for dogs, cats, and humans. lymphoma may most commonly be multicentric in ferrets. imaging findings frequently included intra-abdominal lymphadenopathy, splenomegaly, and peritoneal effusion. lymphadenopathy and mass lesions were typically hypoechoic on ultrasound. osseous lesions, when present, were predominantly lytic. lack of imaging abnormalities did not preclude the diagnosis of lymphoma.
ferrets, rabbits, and rodents. saint louis: w.b. saunders
hematopoietic diseases
ferret lymphoma: the old and the new
neoplastic diseases in ferrets: 574 cases (1968-1997)
clinical and pathologic findings in ferrets with lymphoma: 60 cases
malignant lymphoma in ferrets: clinical and pathological findings in 19 cases
ferrets: examination and standards of care
diagnosis and treatment of myelo-osteolytic plasmablastic lymphoma of the femur in a domestic ferret
t cell lymphoma in the lumbar spine of a domestic ferret (mustela putorius furo)
t-cell lymphoma in a ferret (mustela putorius furo)
malignant b-cell lymphoma with mott cell differentiation in a ferret (mustela putorius furo)
cytomorphological and immunohistochemical features of lymphoma in ferrets
anatomia ultrassonográfica dos linfonodos abdominais de furões europeus hígidos
ultrasonography and fine needle aspirate cytology of the mesenteric lymph node in normal domestic ferrets (mustela putorius furo)
bsava manual of canine and feline abdominal imaging. gloucester: british small animal veterinary association
the peritoneal space
characterization of normal and abnormal canine superficial lymph nodes using gray-scale b-mode, color flow mapping, power, and spectral doppler ultrasonography: a multivariate study
observations upon the size of the spleen
splenomegaly in the ferret. gainesville: eastern states veterinary association
ferret coronavirus-associated diseases
aleutian disease in the ferret
mycobacterial infection in the ferret
gastrointestinal diseases
diagnostic accuracy of gray-scale ultrasonography for the detection of hepatic and splenic lymphoma in dogs
ultrasonographic findings in hepatic and splenic lymphosarcoma in dogs and cats
primary bone lymphoma: radiographic-mr imaging correlation
mri features of cns lymphoma in dogs and cats
primary cns lymphoma in the spinal cord: clinical manifestations may precede mri detectability
bienzle d.
laboratory findings, histopathology, and immunophenotype of lymphoma in domestic ferrets
the senior ferret (mustela putorius furo)

key: cord-005090-l676wo9t
authors: gao, chao; liu, jiming; zhong, ning
title: network immunization and virus propagation in email networks: experimental evaluation and analysis
date: 2010-07-14
journal: knowl inf syst
doi: 10.1007/s10115-010-0321-0
sha:
doc_id: 5090
cord_uid: l676wo9t

network immunization strategies have emerged as possible solutions to the challenges of virus propagation. in this paper, an existing interactive model is introduced and then improved in order to better characterize the way a virus spreads in email networks with different topologies. the model is used to demonstrate the effects of a number of key factors, notably nodes' degree and betweenness. experiments are then performed to examine how the structure of a network and human dynamics affect virus propagation. the experimental results have revealed that a virus spreads in two distinct phases and shown that the most efficient immunization strategy is the node-betweenness strategy. moreover, those results also explain, from the perspective of human dynamics, why old viruses can still survive in today's networks.

many real-world systems can be modeled as networks, such as the internet, the scientific collaboration network, and the social network [15, 32]. in these networks, nodes denote individuals (e.g. computers, web pages, email-boxes, people, or species) and edges represent the connections between individuals (e.g. network links, hyperlinks, relationships between two people or species) [26]. there are many research topics related to network-like environments [23, 34, 46]. one interesting and challenging subject is how to control virus propagation in physical networks (e.g. trojan viruses) and virtual networks (e.g. email worms) [26, 30, 37]. currently, one of the most popular methods is network immunization, where some nodes in a network are immunized (protected) so that they cannot be infected by a virus or a worm.
when the same percentage of nodes in a network is immunized, the best strategy minimizes the final number of infected nodes. valid propagation models can be used in complex networks to predict potential weaknesses of a global network infrastructure against worm attacks [40] and to help researchers understand the mechanisms of new virus attacks and/or new modes of spreading. at the same time, reliable models provide test-beds for developing or evaluating new and/or improved security strategies for restraining virus propagation [48]. researchers can use reliable models to design effective immunization strategies which can prevent and control virus propagation not only in computer networks (e.g. worms) but also in social networks (e.g. sars, h1n1, and rumors). today, more and more researchers from statistical physics, mathematics, computer science, and epidemiology are studying virus propagation and immunization strategies. for example, computer scientists focus on algorithms and the computational complexities of strategies, i.e. how to quickly find a short path from one "seed" node to a targeted node based only on local information, and then effectively and efficiently restrain virus propagation [42]. epidemiologists focus on the combined effects of local clustering and global contacts on virus propagation [5]. generally speaking, there are two major issues concerning virus propagation:

1. how to efficiently restrain virus propagation?
2. how to accurately model the process of virus propagation in complex networks?

in order to address these problems, the main work in this paper is to (1) systematically compare and analyze representative network immunization strategies in an interactive email propagation model, (2) uncover what the dominant factors are in virus propagation and immunization strategies, and (3) improve the predictive accuracy of propagation models through using research from human dynamics. the remainder of this paper is organized as follows: sect.
2 surveys some well-known network immunization strategies and existing propagation models. section 3 presents the key research problems in this paper. section 4 describes the experiments which are performed to compare different immunization strategies with the measurements of the immunization efficiency, the cost and the robustness in both synthetic networks (including a synthetic community-based network) and two real email networks (the enron and a university email network), and to analyze the effects of network structures and human dynamics on virus propagation. section 5 concludes the paper. in this section, several popular immunization strategies and typical propagation models are reviewed. an interactive email propagation model is then formulated in order to evaluate different immunization strategies and analyze the factors that influence virus propagation. network immunization is one of the well-known methods to effectively and efficiently restrain virus propagation. it cuts epidemic paths through immunizing (injecting vaccines or patching programs) a set of nodes from a network following some well-defined rules. in most published research, the immunized nodes are selected based on node degrees, which reflect, to a certain extent, the importance of a node in a network. in this paper, the influence of other properties of a node (i.e. betweenness) on immunization strategies will be observed. pastor-satorras and vespignani have studied the critical values in both random and targeted immunization [39]. the random immunization strategy treats all nodes equally. in a large scale-free network, the immunization critical value is g_c → 1. simulation results show that 80% of nodes need to be immunized in order to recover the epidemic threshold. dezso and barabasi have proposed a new immunization strategy, named the targeted immunization [9], which takes the actual topology of a real-world network into consideration.
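the targeted strategy just introduced ranks nodes by degree and immunizes the most connected first. the selection step can be sketched as follows; this is an illustrative implementation on an adjacency-list graph (the function name and the toy star graph are assumptions, not the authors' code).

```python
def targeted_immunization_set(graph, fraction):
    """Return the set of nodes the targeted strategy immunizes:
    the top `fraction` of nodes ranked by degree (requires global topology).
    `graph` is a dict mapping node -> set of neighbors."""
    k = max(1, round(fraction * len(graph)))
    ranked = sorted(graph, key=lambda v: len(graph[v]), reverse=True)
    return set(ranked[:k])

# toy star network: immunizing the single hub cuts every epidemic path
star = {'hub': {'a', 'b', 'c', 'd'},
        'a': {'hub'}, 'b': {'hub'}, 'c': {'hub'}, 'd': {'hub'}}
print(targeted_immunization_set(star, 0.2))  # the hub is selected first
```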
the distributions of node degrees in scale-free networks are extremely heterogeneous. a few nodes have high degrees, while many nodes have low degrees. the targeted immunization strategy aims to immunize the most connected nodes in order to cut epidemic paths through which most susceptible nodes may be infected. for a ba network [2], the critical value of the targeted immunization strategy is g_c ∼ e^(−2/(mλ)). this formula shows that it is always possible to obtain a small critical value g_c even if the spreading rate λ changes drastically. however, one of the limitations of the targeted immunization strategy is that it requires knowledge of the global topology; in particular, the ranking of the nodes must be clearly defined. this is impractical and uneconomical for handling large-scale and dynamically evolving networks, such as p2p networks or email networks. in order to overcome this shortcoming, a local strategy, namely the acquaintance immunization [8, 16], has been developed. the motivation for the acquaintance immunization is to work without any global information. in this strategy, p% of nodes are first selected as "seeds" from a network, and then one or more of their direct acquaintances are immunized. because a node with a higher degree has more links in a scale-free network, it will be selected as a "seed" with a higher probability. thus, the acquaintance immunization strategy is more efficient than the random immunization strategy, but less efficient than the targeted immunization strategy. moreover, there is another issue which limits the effectiveness of the acquaintance immunization: it does not differentiate between nodes, i.e. it randomly selects "seed" nodes and their direct neighbors [17]. another effective distributed strategy is the d-steps immunization [12, 17]. this strategy views decentralized immunization as a graph covering problem. that is, for a node v_i, it looks for a node to be immunized that has the maximal degree within d steps of v_i.
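the d-steps rule at the end of the paragraph amounts to a bounded breadth-first search followed by a degree comparison. the sketch below illustrates this on an adjacency-list graph; the function name and the toy graph (modeled loosely on fig. 1, with v_5 as the hub) are assumptions for illustration, not the authors' implementation.

```python
from collections import deque

def d_steps_target(graph, seed, d):
    """Return the highest-degree node within d hops of `seed`
    (the node the d-steps strategy would immunize).
    `graph` is a dict mapping node -> set of neighbors."""
    seen = {seed}
    frontier = deque([(seed, 0)])
    candidates = [seed]
    while frontier:
        node, dist = frontier.popleft()
        if dist == d:
            continue  # do not expand past the d-hop horizon
        for nb in graph[node]:
            if nb not in seen:
                seen.add(nb)
                candidates.append(nb)
                frontier.append((nb, dist + 1))
    return max(candidates, key=lambda n: len(graph[n]))

# toy graph: v7 - v6 - v5, where v5 is a hub with three extra neighbors
toy = {
    'v7': {'v6'},
    'v6': {'v7', 'v5'},
    'v5': {'v6', 'v4', 'v3', 'v2'},
    'v4': {'v5'}, 'v3': {'v5'}, 'v2': {'v5'},
}
print(d_steps_target(toy, 'v7', 1))  # 'v6' (the maximal-acquaintance choice)
print(d_steps_target(toy, 'v7', 2))  # 'v5' (d = 2 reaches the hub)
```

with d = 1 this reduces to the maximal acquaintance strategy; larger d trades extra local probing for better targets.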
this method only uses the local topological information within a certain range (e.g. the degree information of nodes within d steps). thus, the maximal acquaintance strategy can be seen as a 1-step immunization. however, it does not take into account domain-specific heuristic information, nor is it able to decide what the value of d should be in different networks. the immunization strategies described in the previous section are all based on node degrees. the way different immunized nodes are selected is illustrated in fig. 1.

fig. 1 an illustration of different strategies. the targeted immunization will directly select v_5 as an immunized node based on the degrees of nodes. suppose that v_7 is a "seed" node. v_6 will be immunized based on the maximal acquaintance immunization strategy, and v_5 will be indirectly selected as an immunized node based on the d-steps immunization strategy, where d = 2.

fig. 2 an illustration of betweenness-based strategies. if we select one immunized node, the targeted immunization strategy will directly select the highest-degree node, v_6. the node-betweenness strategy will select v_5 as it has the highest node betweenness. the edge-betweenness strategy will select one of v_3, v_4 and v_5 because the edges, l_1 and l_2, have the highest edge betweenness.

besides selecting the highest-degree nodes from a network, many approaches cut epidemic paths by means of increasing the average path length of a network, for example by partitioning large-scale networks based on betweenness [4, 36]. for a network, node (edge) betweenness refers to the number of the shortest paths that pass through a node (edge). a higher value of betweenness means that the node (edge) links more adjacent communities and will be frequently used in network communications.
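the node betweenness just defined — the number of shortest paths passing through a node — can be computed with Brandes' algorithm. the sketch below is illustrative (in practice one would use a library such as networkx); the toy "barbell" graph is an assumption chosen so that the bridging node has the highest score.

```python
from collections import deque

def node_betweenness(graph):
    """Brandes' betweenness centrality for an unweighted, undirected graph
    given as a dict: node -> set of neighbors."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # BFS from s, counting the number of shortest paths (sigma)
        stack, preds = [], {v: [] for v in graph}
        sigma = {v: 0 for v in graph}; sigma[s] = 1
        dist = {v: -1 for v in graph}; dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # accumulate path dependencies in reverse BFS order
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # each undirected path is counted from both endpoints, so halve
    return {v: b / 2 for v, b in bc.items()}

# small "barbell": two triangles joined through node c
g = {'a': {'b', 'c'}, 'b': {'a', 'c'},
     'c': {'a', 'b', 'd', 'e'},
     'd': {'c', 'e'}, 'e': {'c', 'd'}}
scores = node_betweenness(g)
print(max(scores, key=scores.get))  # 'c': it bridges the two triangles
```

a node-betweenness immunization strategy would immunize nodes in descending order of these scores.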
although [19] have analyzed the robustness of a network against degree-based and betweenness-based attacks, the spread of a virus in a propagation model was not considered, so the effects of different measurements on virus propagation are not clear. is it possible to restrain virus propagation, especially from one community to another, by immunizing nodes or edges which have higher betweenness? in this paper, two types of betweenness-based immunization strategies are presented, i.e. the node-betweenness strategy and the edge-betweenness strategy. that is, the immunized nodes are selected in descending order of node- and edge-betweenness, in an attempt to better understand the effects of the degree and betweenness centralities on virus propagation. figure 2 shows that if v_4 is immunized, the virus will not propagate from one part of the network to another. the node-betweenness strategy will select v_5 as an immunized node, since it has the highest node betweenness, i.e. 41. the edge-betweenness strategy will select the terminal nodes of l_1 or l_2 (i.e. v_3, v_4 or v_4, v_5) as they have the highest edge betweenness. as with the targeted immunization, the betweenness-based strategies also require information about the global betweenness of a network. the experiments presented in this paper aim to find a new measurement that can be used to design a highly efficient immunization strategy. the efficiency of these strategies is compared both in synthetic networks and in real-world networks, such as the enron email network described by [4]. in order to compare different immunization strategies, a propagation model is required to act as a test-bed to simulate virus propagation. currently, there are two typical models: (1) the epidemic model based on population simulation and (2) an interactive email model which utilizes individual-based simulation. 
lloyd and may have proposed an epidemic propagation model to characterize virus propagation, a typical mathematical model based on differential equations [26]. some specific epidemic models, such as si [37, 38], sir [1, 30], sis [14], and seir [11, 28], have been developed and applied in order to simulate virus propagation and study the dynamic characteristics of whole systems. however, these models are all based on mean-field theory, i.e. differential equations. this type of black-box modeling approach only provides a macroscopic understanding of virus propagation; it does not give much insight into microscopic interactive behavior. more importantly, some assumptions, such as fully mixed contacts (i.e. the individuals in contact with a susceptible individual are randomly chosen from the whole population) [33] and equiprobable contacts (i.e. all nodes transmit the disease with the same probability, taking no account of the different connections between individuals), may not be valid in the real world. for example, in email networks and instant message (im) networks, communication and/or the spread of information tend to be strongly clustered in groups or communities with closer relationships, rather than being equiprobable across the whole network. these models may also overestimate the speed of propagation [49]. in order to overcome the above-mentioned shortcomings, [49] have built an interactive email model to study worm propagation, in which viruses are triggered by human behavior, not by contact probabilities. that is to say, a node will be infected only if a user has checked his/her email-box and clicked an email with a virus attachment. thus, virus propagation in an email network is mainly determined by two behavioral factors: email-checking time intervals (t_i) and email-clicking probabilities (p_i), where i ∈ [1, n] and n is the total number of users in the network. 
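for contrast with the individual-based model used later, the sis mean-field equation di/dt = λi(1 − i) − μi (homogeneous mixing; the parameter values below are illustrative) can be integrated in a few lines:

```python
def sis_mean_field(lam, mu, i0, dt=0.01, steps=5000):
    """forward-euler integration of the sis mean-field ode
    di/dt = lam*i*(1-i) - mu*i, where i is the infected fraction."""
    i = i0
    for _ in range(steps):
        i += dt * (lam * i * (1.0 - i) - mu * i)
    return i
```

for lam > mu the infected fraction converges to the endemic equilibrium 1 − mu/lam; for lam < mu it dies out — precisely the aggregate behavior that, as the text notes, says nothing about which individuals spread the virus.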
t_i is determined by a user's own habits; p_i is determined both by user security awareness and by the efficiency of the firewall. however, the authors do not provide much information about how to restrain worm propagation. in this paper, an interactive email model is used as a test-bed to study the characteristics of virus propagation and the efficiency of different immunization strategies. with this model it is easy to observe the microscopic process of worm propagation, and to uncover the effects of different factors (e.g. the power-law exponent, human dynamics and the average path length of the network) on virus propagation and immunization strategies. unlike other models, this paper mainly focuses on comparing the performance of degree-based strategies and betweenness-based strategies, rather than on the critical value of epidemics in a network. a detailed analysis of the propagation model is given in the following section. an email network can be viewed as a typical social network in which a connection between two nodes (individuals) indicates that they have communicated with each other before [35, 49]. generally speaking, a network can be denoted as e = (v, l), where v = {v_1, v_2, . . . , v_n} is a set of nodes and l = {(v_i, v_j) | 1 ≤ i, j ≤ n} is a set of undirected links (if v_i is in the hit-list of v_j, there is a link between v_i and v_j). a virus can propagate along links and infect more nodes in a network. in order to give a general definition, each node is represented as a tuple with the following fields. -id: the node identifier, v_i.id = i. -state: the node state: healthy if the node has no virus; danger = 1 if the node has received a virus but is not infected; infected = 2 if the node has been infected; immunized = 3 if the node has been immunized. -nodelink: the information about its hit-list or adjacent neighbors, i.e. v_i.nodelink = {(i, j) | (i, j) ∈ l}. -p_behavior: the probability that a node will perform a particular behavior. -b_action: the different behaviors. 
-virusnum: the total number of new unchecked viruses before the next operation. -newvirus: the number of new viruses a node receives from its neighbors at each step. in addition, two interactive behaviors are simulated according to [49]: the email-checking time intervals and the email-clicking probabilities both follow gaussian distributions across users (as the sample size goes to infinity), while, for the same user i, the email-checking interval t_i(t) in [49] is modeled by a poisson process, i.e. t_i(t) ∼ λe^{−λt}. thus, the formula for p_behavior in the tuple can be written as p1_behavior = clickprob and p2_behavior = checktime, where -clickprob is the probability of a user clicking a suspected email, -checkrate is the probability of a user checking an email, -checktime is the next time the email-box will be checked, v_i.p2_behavior = v_i.checktime = expgenerator(v_i.checkrate). b_action can be specified as b1_action = receive_email, b2_action = send_email, and b3_action = update_email. if a user receives a virus-infected email, the corresponding node will update its state, i.e. v_i.state ← danger. if a user opens an email that has a virus-infected attachment, the node will adjust its state, i.e. v_i.state ← infected, and send the virus email to all its friends according to its hit-list. if a user is immunized, the node will update its state to v_i.state ← immunized. in order to better characterize virus propagation, some assumptions are made in the interactive email model: -if a user opens an infected email, the node is infected and will send viruses to all the friends on its hit-list; -when checking his/her mailbox, if a user does not click virus emails, it is assumed that the user deletes the suspected emails; -if nodes are immunized, they will never send virus emails even if a user clicks an attachment. the most important measurement of the effectiveness of an immunization strategy is the total number of infected nodes after virus propagation. 
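the node tuple and its sampled behavior parameters can be sketched as a data structure; the field names, the clipping of the gaussian click probability to [0, 1], and the use of `expovariate` to stand in for expgenerator are illustrative assumptions:

```python
import random
from dataclasses import dataclass

HEALTHY, DANGER, INFECTED, IMMUNIZED = 0, 1, 2, 3

@dataclass
class Node:
    id: int
    neighbors: list         # nodelink: the hit-list / adjacent neighbors
    clickprob: float        # p1_behavior: probability of clicking a suspect email
    checkrate: float        # rate of the mailbox-checking process
    checktime: float = 0.0  # next time the mailbox will be checked
    state: int = HEALTHY
    virusnum: int = 0       # unchecked virus emails before the next operation

def make_node(i, neighbors, rng, mu_p=0.5, sigma_p=0.3, mu_t=40.0):
    """sample a node's behavior: gaussian click probability (clipped to
    [0, 1], an assumption) and an exponential next checking time, as in
    the poisson checking process of [49]."""
    p = min(1.0, max(0.0, rng.gauss(mu_p, sigma_p)))
    node = Node(i, list(neighbors), p, 1.0 / mu_t)
    node.checktime = rng.expovariate(node.checkrate)
    return node
```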
the best strategy is the one that most effectively restrains virus propagation, i.e. keeps the total number of infected nodes to a minimum. in order to evaluate the efficiency of different immunization strategies and find the relationship between local behaviors and global dynamics, two statistics are of particular interest: 1. sid: the sum of the degrees of the immunized nodes, which reflects the importance of these nodes in a network. 2. apl: the average path length of a network, a measurement of the connectivity and transmission capacity of a network, where d_ij is the length of the shortest path between i and j. if there is no path between i and j, d_ij → ∞; in order to facilitate the computation, the reciprocal of d_ij is used to reflect the connectivity of a network, with d_ij^{−1} = 0 if there is no path between i and j. based on these definitions, the interactive email model given in sect. 2.3 can be used as a test-bed to compare different immunization strategies and uncover the effects of different factors on virus propagation. the specific research questions addressed in this paper can be summarized as follows: 1. how can network immunization strategies be evaluated? how can the performance of a particular strategy be determined, i.e. in terms of its efficiency, cost and robustness? what is the best immunization strategy, and what are the key factors that affect the efficiency of a strategy? 2. what is the process of virus propagation, and what effect does the network structure have on it? 3. what effect do human dynamics have on virus propagation? the simulations in this paper have two phases. first, an email network is established in which each node has the interactive behaviors described in sect. 2.3. next, the virus propagation in the network is observed and the epidemic dynamics are studied when applying different immunization strategies. more details can be found in sect. 4. 
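assuming the reciprocal-distance convention described above, a minimal computation of the apl-style connectivity measure might look like this (averaging over all node pairs is an assumption about the normalization, which the text does not give):

```python
from collections import deque
from itertools import combinations

def bfs_dist(adj, s):
    """hop distances from s to every reachable node."""
    dist = {s: 0}
    q = deque([s])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                q.append(w)
    return dist

def apl_reciprocal(adj):
    """mean of 1/d_ij over all node pairs, with 1/d_ij = 0 for
    unreachable pairs, per the reciprocal convention in the text."""
    nodes = sorted(adj)
    total, pairs = 0.0, 0
    for s, t in combinations(nodes, 2):
        d = bfs_dist(adj, s).get(t)
        total += 0.0 if d is None else 1.0 / d
        pairs += 1
    return total / pairs if pairs else 0.0
```

under this convention, fragmenting the network (e.g. by immunizing bridge nodes) drives the measure toward zero.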
in this section, the simulation process and the structures of the experimental networks are presented in sects. 4.1 and 4.2. section 4.3 uses a number of experiments to evaluate the performance (e.g. efficiency, cost and robustness) of different immunization strategies. specifically, the experiments seek to address whether or not betweenness-based immunization strategies can restrain worm propagation in email networks, and which measurements can reflect and/or characterize the efficiency of immunization strategies. finally, sects. 4.4 and 4.5 present an in-depth analysis in order to determine the effect of network structures and human dynamics on virus propagation. the experimental process is illustrated in fig. 3. some nodes are first immunized (protected) using different strategies. viruses are then injected into the network in order to evaluate the efficiency of those strategies by comparing the total numbers of infected nodes. two methods are used to select the initially infected nodes: random infection and malicious infection, i.e. infecting the nodes with maximal degrees. the user behavior parameters are based on the definitions in sect. 2.3, where μ_p = 0.5, σ_p = 0.3, μ_t = 40, and σ_t = 20. since the process of email worm propagation is stochastic, all results are averaged over 100 runs. the virus propagation algorithm is specified in alg. 1. many common networks exhibit the scale-free phenomenon [2, 21], in which node degrees follow a power-law distribution [42], i.e. the fraction of nodes having k edges, p(k), decays according to a power law p(k) ∼ k^{−α} (where α is usually between 2 and 3) [29]. recent research has shown that email networks also follow power-law distributions with a long tail [35, 49]. therefore, in this paper, three synthetic power-law networks and a synthetic community-based network are generated using the glp algorithm [6], in which the power-law exponent can be tuned. 
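a degree sequence with p(k) ∝ k^{−α} can be sampled as below; this only illustrates the power-law property of such networks and is not the glp generator [6] itself (cut-offs and seed are illustrative assumptions):

```python
import random

def powerlaw_degrees(n, alpha, kmin=1, kmax=100, seed=0):
    """sample n node degrees with p(k) proportional to k^(-alpha)
    on [kmin, kmax], via weighted discrete sampling."""
    rng = random.Random(seed)
    ks = list(range(kmin, kmax + 1))
    weights = [k ** -alpha for k in ks]
    degs = rng.choices(ks, weights=weights, k=n)
    if sum(degs) % 2:  # an even degree sum is needed for a graphical sequence
        degs[0] += 1
    return degs
```

with α between 2 and 3, most sampled degrees are small while a few hubs appear, reproducing the heterogeneity that the immunization strategies exploit.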
the three synthetic networks all have 1000 nodes, with α = 1.7, 2.7, and 3.7, respectively. the statistical characteristics and visualization of the synthetic community-based network are shown in table 1 and fig. 4c, f, respectively. in order to reflect the characteristics of a real-world network, the enron email network,1 which was built by andrew fiore and jeff heer, and the university email network,2 which was compiled by the members of the university rovira i virgili (tarragona), will also be studied. the structure and degree distributions of these networks are shown in table 2 and fig. 4. in particular, the cumulative distributions are estimated with maximum likelihood using the method provided by [7]. the degree statistics are shown in table 9. in this section, a comparison is made of the effectiveness of different strategies in an interactive email model. experiments are then used to evaluate the cost and robustness of each strategy. input: nodedata[nodenum] stores the topology of an email network; timestep is the system clock; v_0 is the set of initially infected nodes. output: simnum[timestep][k] stores the number of infected nodes in the network in the k-th simulation. 
algorithm 1 (virus propagation):
for k = 1 to runtime // run 100 times to obtain an average value
  nodedata[nodenum] ← initialize the email network as well as the users' checking times and clicking probabilities
  nodedata[nodenum] ← choose immunized nodes based on the immunization strategy and adjust their states
  while timestep < endsimul // 600 steps in each run
    for i = 1 to nodenum
      if nodedata[i].checktime == 0
        prob ← the probability of opening a virus-infected email, based on the user's clickprob and virusnum
        if the virus email is opened (with probability prob)
          nodedata[i].state ← infected
          send a virus to all friends according to the hit-list
        endif
      endif
    endfor
    for i = 1 to nodenum
      update the next checktime based on the user's checkrate
      update nodedata[i] with the viruses newly received at this step
    endfor
  endwhile
endfor

the immunization efficiency of the following immunization strategies is compared: the targeted and random strategies [39], the acquaintance strategy (random and maximal neighbor) [8, 16], the d-steps strategy (d = 2 and d = 3) [12, 17] (introduced in sect. 2.1), and the proposed betweenness-based strategy (node- and edge-betweenness). (table 1: the community-based network has 100 bridges between the different communities; for the whole network, α = 1.77 and k = 8.34.) in the initial set of experiments, the proportion of immunized nodes (5, 10, and 30%) is varied in the synthetic networks and the enron email network. table 3 shows the simulation results in the enron email network, which is initialized with two infected nodes. figure 5 shows the average numbers of infected nodes over time. tables 4, 5, and 6 show the numerical results in the three synthetic networks, respectively. the simulation results show that the node-betweenness immunization strategy yields the best results (i.e. the minimum number of infected nodes, f), except for the case where 5% of the nodes in the enron network are immunized under a malicious attack. the average degree of the enron network is k = 3.4. this means that only a few nodes have high degrees, while the others have low degrees (see table 9). 
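the inner loop of alg. 1 can be sketched in runnable form; the state codes, the field names, and the open-probability formula 1 − (1 − clickprob)^virusnum are assumptions layered on the tuple definition of sect. 2.3, not the authors' exact implementation:

```python
import random

HEALTHY, DANGER, INFECTED, IMMUNIZED = 0, 1, 2, 3

def step(nodes, adj, state, checktime, clickprob, virusnum, rng):
    """one tick of the (assumed) interactive email model: users whose
    checktime has expired open their mailbox; clicking a virus email
    infects the node, which then resends the virus to its hit-list."""
    newly_infected = []
    for i in nodes:
        if state[i] == IMMUNIZED or checktime[i] > 0:
            continue  # immunized users never spread; others wait for checktime
        if virusnum[i] > 0 and state[i] != INFECTED:
            # assumed chance of opening at least one of virusnum[i] virus emails
            p_open = 1.0 - (1.0 - clickprob[i]) ** virusnum[i]
            if rng.random() < p_open:
                state[i] = INFECTED
                newly_infected.append(i)
        virusnum[i] = 0  # unclicked suspect emails are deleted (model assumption)
    for i in newly_infected:
        for j in adj[i]:  # resend the virus to every friend on the hit-list
            if state[j] != IMMUNIZED:
                virusnum[j] += 1
                if state[j] == HEALTHY:
                    state[j] = DANGER
    return newly_infected
```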
in such a network, if the nodes with maximal degrees are infected, viruses will rapidly spread through the network and the final number of infected nodes will be larger than in other cases. the targeted strategy therefore does not perform any better than the node-betweenness strategy. in fact, as the number of immunized nodes increases, the efficiency of the node-betweenness immunization increases proportionally more than that of the targeted strategy. (table 3 caption: there are two infected nodes, with different attack modes. if there is no immunization, the final number of infected nodes is 937 with a random attack and 942 with a malicious attack, and apl = 751.36 × 10^{−4}. the total simulation time is t = 600.) therefore, if global topological information is available, the node-betweenness immunization is the best strategy. the maximal sid is obtained using the targeted immunization. however, the final number of infected nodes (f) is consistent with the average path length (apl) but not with the sid. that is to say, controlling a virus epidemic does not depend on the degrees of the immunized nodes but on the path length of the whole network. this also explains why the efficiency of the node-betweenness immunization strategy is better than that of the targeted immunization strategy: the node-betweenness immunization selects nodes based on the average path length, while the targeted immunization strategy selects them based on the size of their degrees. a more in-depth analysis is undertaken by comparing the change of the apl with respect to the different strategies used in the synthetic networks. the results are shown in fig. 6. figure 7a, b compare the change of the final number of infected nodes over time, corresponding to fig. 6c, d, respectively. these numerical results validate the previous assertion that the average path length can be used as a measurement to design an effective immunization strategy. 
the best strategy is to divide the whole network into different sub-networks and increase the average path length of the network, hence cutting the epidemic paths. in this paper, all comparative results are averaged over 100 runs using the same infection model (i.e. virus propagation is compared for both random and malicious attacks) and the same user behavior model (i.e. all simulations use the same behavior parameters, as shown in sect. 4.1). thus, it is more reasonable and feasible to evaluate only how the propagation of a virus is affected by the immunization strategies, avoiding the effects caused by the stochastic process, the infection model and the user behavior. it can be seen that the edge-betweenness strategy is able to find some nodes with high centrality and then divide a network integrally into a number of sub-networks (e.g. v_4 in fig. 2). however, compared with the nodes selected by the node-betweenness strategy (e.g. v_5 in fig. 2), the nodes with higher edge betweenness cannot cut the epidemic paths as effectively, because they cannot break the whole structure of the network. in fig. 2, the synthetic community-based network and the university email network are used as examples to illustrate why the edge-betweenness strategy cannot obtain the same immunization efficiency as the node-betweenness strategy. to select two immunized nodes from fig. 2, the node-betweenness immunization will select {v_5, v_3} by taking the descending order of node betweenness, whereas the edge-betweenness strategy will select {v_3, v_4} or {v_4, v_5} because the edges l_1 and l_2 have the highest edge betweenness. this result shows that the node-betweenness strategy can not only effectively divide the whole network into two communities, but also break the interior structure of the communities. although the edge-betweenness strategy can integrally divide the whole network into two parts, viruses can still propagate within each community. 
many networks commonly contain the structure shown in fig. 2, for example the enron email network and the university email network. table 7 and fig. 8 present the results for the synthetic community-based network. table 8 compares the different strategies in the university email network, which also has some self-similar community structures [18]. these results further validate the analysis stated above. from the above experiments, the following conclusions can be made: 1. as shown in tables 4-8, apl can be used as a measurement to evaluate the efficiency of an immunization strategy. thus, when designing a distributed immunization strategy, attention should be paid to those nodes that have the largest impact on the apl value. 2. if the final number of infected nodes is used as a measure of efficiency, then the node-betweenness immunization strategy is more efficient than the targeted immunization strategy. 3. the power-law exponent (α) affects the edge-betweenness immunization strategy, but has little impact on the other strategies. in the previous section, the efficiency of the different immunization strategies was evaluated in terms of the final number of infected nodes when the propagation reaches an equilibrium state. from experiments in the synthetic networks, the synthetic community-based network, the enron email network and the university email network, it is easy to find that the node-betweenness immunization strategy has the highest efficiency. in this section, the performance of the different strategies will be evaluated in terms of cost and robustness, as in [20]. it is well known that the structure of a social network or an email network constantly evolves. it is therefore interesting to evaluate how changes in structure affect the efficiency of an immunization strategy. -the cost can be defined as the number of nodes that need to be immunized in order to achieve a given level of epidemic prevalence ρ. generally, ρ → 0. 
there are some parameters of particular interest: f is the fraction of nodes that are immunized; f_c is the critical value of the immunization when ρ → 0; ρ_0 is the infection density when no immunization strategy is implemented; ρ_f is the infection density under a given immunization strategy. figure 9 shows the relationship between the reduced prevalence ρ_f/ρ_0 and f. it can be seen that the node-betweenness immunization achieves the lowest prevalence with the smallest number of protected nodes. the immunization cost increases as the value of α increases, i.e. in order to achieve an epidemic prevalence ρ → 0, the node-betweenness immunization strategy needs 20, 25, and 30% of the nodes to be immunized in the three synthetic networks, respectively. this is because the node-betweenness immunization strategy can effectively break the network structure and increase the path length of a network with the same number of immunized nodes. -the robustness reflects the tolerance of a strategy to the dynamic evolution of a network, i.e. to changes of the power-law exponent (α). figure 10 shows the relationship between the immunization threshold f_c and α. a low level of f_c with a small variation indicates that an immunization strategy is robust. robustness is important when an immunization strategy is deployed in a scalable and dynamic network (e.g. p2p and email networks). figure 10 also shows that the robustness of the d-steps immunization strategy is close to that of the targeted immunization, and that the node-betweenness strategy is the most robust. [49] have compared virus propagation in synthetic networks with α = 1.7 and α = 1.1475, and pointed out that initial worm propagation has two phases. however, they do not give a detailed explanation of these results, nor do they compare the effect of the power-law exponent on different immunization strategies during virus propagation. 
table 9 presents the detailed degree statistics for the different networks, which can be used to examine the effect of the power-law exponent on virus propagation and immunization strategies. first, virus propagation in non-immunized networks is discussed. figure 11a shows the change of the average number of infected nodes over time; fig. 11b gives the average degree of the infected nodes at each time step. from the results, it can be seen that: 1. the number of infected nodes in non-immunized networks is determined by the attack mode, not by the power-law exponent. in figs. 11a, b, the three distribution curves (α = 1.7, 2.7, and 3.7) overlap with each other under both random and malicious attacks. the difference between them is that the final number of infected nodes under a malicious attack is larger than that under a random attack, as shown in fig. 11a, reflecting the fact that a malicious attack is more dangerous than a random attack. 2. a virus spreads more quickly in a network with a large power-law exponent than in one with a small exponent. because a malicious attack initially infects highly connected nodes, the average degree of the infected nodes decreases in a shorter time compared to a random attack (t1 < t2); moreover, the speed and range of the infection are amplified by those highly connected nodes. in phase i, viruses propagate very quickly and infect most nodes in a network. however, in phase ii, the total number of infected nodes grows slowly (fig. 11a), because the viruses must then infect the nodes with low degrees (fig. 11b), and a node with fewer links is less likely to be infected. in order to observe the effect of different immunization strategies on the average degree of infected nodes in different networks, 5% of the nodes are initially protected against random and malicious attacks. figure 12 shows the simulation results. from this experiment, it can be concluded that: 1. 
the random immunization has no effect on restraining virus propagation, because the curves of the average degree of the infected nodes basically coincide with the curves in the non-immunization case. 2. comparing fig. 12a, b, c and d, e, f, respectively, it can be seen that the peak value of the average degree is the largest in the network with α = 1.7 and the smallest in the network with α = 3.7. this is because the network with a lower exponent has more highly connected nodes (i.e. with degrees in the range 50 to 80), which serve as amplifiers in the process of virus propagation. 3. as α increases, so do the number of infected nodes and the duration of virus propagation (t1 < t2 < t3): because a larger α implies a larger apl, the number of infected nodes will increase, and in a network with a larger exponent a virus needs more time to infect the nodes with medium or low degrees. (fig. 14 caption: the average number of infected nodes and the average degree of infected nodes with respect to time when a virus spreads in different networks; the targeted immunization is applied to protect 30% of the nodes.) first, consider the process of virus propagation in the case of a malicious attack where 30% of the nodes are immunized using the edge-betweenness immunization strategy. there are two intersections in fig. 13a: point a is the intersection of the curves net1 and net3, and point b is the intersection of net2 and net1. under the same conditions, fig. 13a shows that the total number of infected nodes is the largest in net1 in phase i. correspondingly, in fig. 13b, the average degree of the infected nodes in net1 is the largest in phase i. as time goes on, the rate at which the average degree falls is fastest in net1, as shown in fig. 13b. this is because there are more highly connected nodes in net1 than in the other networks (see table 9). after these highly connected nodes are infected, the viruses attempt to infect the nodes with low degrees. 
therefore, in phases ii and iii, the average degree of the infected nodes in net3, which has the largest power-law exponent, is larger than those in net1 and net2, and the total number of infected nodes in net3 continuously increases, exceeding those in net1 and net2. the same phenomenon also appears with the targeted immunization strategy, as shown in fig. 14. the email-checking intervals in the above interactive email model (see sect. 2.3) are modeled using a poisson process. the poisson distribution is widely used in many real-world models to statistically describe human activities, e.g. in terms of statistical regularities in the frequency of certain events within a period of time [25, 49]. however, statistics on human activities, from user log files to databases, show that most observations of human behavior deviate from a poisson process. that is to say, when a person engages in certain activities, his or her waiting intervals follow a power-law distribution with a long tail [27, 43]. vazquez et al. [44] have tried to incorporate an email-sending interval distribution, characterized by a power law, into a virus propagation model. however, their model assumes that a user is instantly infected after he/she receives a virus email, and it ignores the impact of anti-virus software and the security awareness of users. therefore, there are some gaps between their model and the real world. in this section, the statistical properties associated with a single user sending emails are analyzed based on the enron dataset [41]. the virus spreading process is then simulated using an improved interactive email model in order to observe the effect of human behavior on virus propagation. research results from the study of statistical regularities or laws of human behavior based on empirical data can offer a valuable perspective to social scientists [45, 47]. 
previous studies have also used models to characterize the behavioral features of sending emails [3, 13, 22], but their correctness needs to be further verified empirically, especially in view of the fact that there exist variations among different types of users. in this paper, the enron email dataset is used to identify the characteristics of human email-handling behavior. due to limited space, table 10 presents only a small amount of the employee data contained in the database. as can be seen from the table, the distribution of the intervals between emails sent by the same user is measured at different granularities: day, hour, and minute. figure 15 shows that the waiting intervals follow a heavy-tailed distribution. the power-law exponent at the day granularity is not accurate because there are only a few data points; if more data points were added, a power-law distribution with a long tail would emerge. note that there is a peak at t = 16 when measured at the hour granularity. eckmann et al. [13] have explained that the corresponding peak in a university dataset is the interval between the time people leave work and the time they return to their offices. after curve fitting (see fig. 15), the waiting-interval exponent is close to 1.3, i.e. α ≈ 1.3 ± 0.5. although it has been shown that the email-sending distribution follows a power law for the users studied in the enron dataset, it is still not possible to assert that all users' waiting intervals follow a power-law distribution; it can only be stated that the distribution of waiting intervals has a long-tail characteristic. it is also not possible to measure the intervals between email checking, since there is no information about login times in the enron dataset. 
however, combining research results on human web browsing behavior [10] and on the effect of non-poisson activities on propagation from the barabasi group [44], it can be seen that there are similarities between the distributions of email-checking intervals and email-sending intervals. the following section uses a power-law distribution to characterize email-checking behavior in order to observe the effect human behavior has on the propagation of an email virus. based on the above discussions, a power-law distribution is used to model the email-checking intervals of a user i, instead of the poisson distribution used in [49], i.e. t_i(τ) ∼ τ^{−α}. an analysis of the distribution of the power-law exponent (α) for different individuals in web browsing [10] and in the enron dataset shows that the power-law exponent is approximately 1.3. in order to observe and quantitatively analyze the effect that the email-checking interval has on virus propagation, the email-clicking probability distribution (p_i) in our model is kept consistent with the one used by [49], i.e. the security awareness of the different users in the network follows a normal distribution, p_i ∼ n(0.5, 0.3^2). figure 16 shows that, following a random attack, viruses quickly propagate in the enron network if the email-checking intervals follow a power-law distribution. the results are consistent with the trends observed in real computer networks [31], i.e. viruses initially spread explosively, then enter a long latency period before becoming active again following user activity. the explanation for this is that users frequently have a short period of focused activity followed by a long period of inactivity. thus, although old viruses may be killed by anti-virus software, they can still intermittently break out in a network, because some viruses are hidden by inactive users and cannot be found by anti-virus software. 
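power-law checking intervals t_i(τ) ∼ τ^{−α} can be sampled by inverse transform and contrasted with the poisson (exponential) model of [49]; the lower cut-off tau_min and the comparison threshold below are illustrative assumptions:

```python
import random

def powerlaw_wait(rng, alpha=1.3, tau_min=1.0):
    """inverse-transform sample from p(tau) ∝ tau^(-alpha), tau >= tau_min,
    valid for alpha > 1."""
    return tau_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))

def exponential_wait(rng, mean=40.0):
    """the poisson-process alternative used by [49]."""
    return rng.expovariate(1.0 / mean)

rng = random.Random(42)
pl = [powerlaw_wait(rng) for _ in range(10000)]
ex = [exponential_wait(rng) for _ in range(10000)]
# the heavy tail produces the long inactive periods that let a virus lie latent,
# which the exponential model essentially never generates
pl_extreme = sum(t > 400 for t in pl)
ex_extreme = sum(t > 400 for t in ex)
```

with α = 1.3 the tail is so heavy that the mean waiting time diverges, matching the burst-then-latency propagation pattern described above.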
when the inactive users become active, the virus will start to spread again. the effect of human dynamics on virus propagation in the three synthetic networks is also analyzed by applying the targeted [9], d-steps [17] and aoc-based [24] strategies. the numerical results are shown in table 11 and fig. 17. from the above experiments, the following conclusions can be made: 1. based on the enron email dataset and recent research on human dynamics, the email-checking intervals in an interactive email model should be assigned based on a power-law distribution. 2. viruses can spread very quickly in a network if users' email-checking intervals follow a power-law distribution. in such a situation, viruses grow explosively at the initial stage and then grow slowly; they remain in a latent state, awaiting activation by users. in this paper, a simulation model for studying the process of virus propagation has been described, and the efficiency of various existing immunization strategies has been compared. in particular, two new betweenness-based immunization strategies have been presented and validated in an interactive propagation model which incorporates two human behaviors based on [49] in order to make the model more practical. this simulation-based work can be regarded as a contribution to the understanding of the interactions between a network structure and local/global dynamics. the main results are as follows: 1. experiments are used to systematically compare different immunization strategies for restraining epidemic spreading in synthetic scale-free networks, including the community-based network, and in two real email networks. the simulation results show that the key factor affecting the efficiency of an immunization strategy is the apl, rather than the sum of the degrees of the immunized nodes (sid). 
that is to say, an immunization strategy should protect nodes with higher connectivity and transmission capability, rather than simply those with higher degrees. 2. several performance metrics are used to further evaluate the efficiency of the different strategies, i.e. their cost and robustness. simulation results have shown that d-steps immunization is a feasible strategy in the case of limited resources, and that node-betweenness immunization is the best if global topological information is available. 3. the effects of power-law exponents and human dynamics on virus propagation are analyzed. more in-depth experiments have shown that viruses spread faster in a network with a large power-law exponent than in one with a small exponent. in particular, the results explain, from the perspective of human dynamics, why some old viruses can still propagate in networks today.

references:
[1] the mathematical theory of infectious diseases and its applications
[2] emergence of scaling in random networks
[3] the origin of bursts and heavy tails in human dynamics
[4] cluster ranking with an application to mining mailbox networks
[5] 'small worlds' and the evolution of virulence: infection occurs locally and at a distance
[6] on distinguishing between internet power law topology generators
[7] power-law distributions in empirical data
[8] efficient immunization strategies for computer networks and populations
[9] halting viruses in scale-free networks
[10] dynamics of information access on the web
[11] a simple model for complex dynamical transitions in epidemics
[12] distance-d covering problem in scale-free networks with degree correlation
[13] entropy of dialogues creates coherent structure in email traffic
[14] epidemic threshold in structured scale-free networks
[15] on power-law relationships of the internet topology
[16] improving immunization strategies
[17] immunization of real complex communication networks
[18] self-similar community structure in a network of human interactions
[19] attack vulnerability of complex networks
[20] targeted local immunization in scale-free peer-to-peer networks
[21] the large scale organization of metabolic networks
[22] probing human response times
[23] periodic subgraph mining in dynamic networks, knowledge and information systems
[24] autonomy-oriented search in dynamic community networks: a case study in decentralized network immunization
[25] characterizing web usage regularities with information foraging agents
[26] how viruses spread among computers and people
[27] on universality in human correspondence activity
[28] enhanced: simple rules with complex dynamics
[29] network motifs: simple building blocks of complex networks
[30] epidemics and percolation in small-world networks
[31] code-red: a case study on the spread and victims of an internet worm
[32] the structure of scientific collaboration networks
[33] the spread of epidemic disease on networks
[34] the structure and function of complex networks
[35] email networks and the spread of computer viruses
[36] partitioning large networks without breaking communities
[37] epidemic spreading in scale-free networks
[38] epidemic dynamics and endemic states in complex networks
[39] immunization of complex networks
[40] computer virus propagation models
[41] the enron email dataset database schema and brief statistical report
[42] exploring complex networks
[43] modeling bursts and heavy tails in human dynamics
[44] impact of non-poissonian activity patterns on spreading processes
[45] predicting the behavior of techno-social systems
[46] a decentralized search engine for dynamic web communities
[47] a twenty-first century science
[48] an environment for controlled worm replication and analysis
[49] modeling and simulation study of the propagation and defense of internet e-mail worms

chao gao is currently a phd student in the international wic institute, college of computer science and technology, beijing university of technology. he has been an exchange student in the department of computer science, hong kong baptist university. his main research interests include web intelligence (wi), autonomy-oriented computing (aoc), complex network analysis, and network security.
jiming liu is a professor and head of the computer science department at hong kong baptist university. he was previously a professor and the director of the school of computer science at the university of windsor, canada. his current research interests include: autonomy-oriented computing (aoc), web intelligence (wi), and self-organizing systems and complex networks, with applications to: (i) characterizing the working mechanisms that lead to emergent behavior in natural and artificial complex systems (e.g., phenomena in web science, and the dynamics of social networks and neural systems), and (ii) developing solutions to large-scale, distributed computational problems (e.g., distributed scalable scientific or social computing, and collective intelligence). prof. liu has contributed to the scientific literature in those areas, including over 250 journal and conference papers and 5 authored research monographs, e.g., autonomy-oriented computing: from problem solving to complex systems modeling (kluwer academic/springer) and spatial reasoning and planning: geometry, mechanism, and motion (springer). prof. liu has served as the editor-in-chief of web intelligence and agent systems, an associate editor of ieee transactions on knowledge and data engineering, ieee transactions on systems, man, and cybernetics (part b), and computational intelligence, and a member of the editorial board of several other international journals. ning zhong heads the knowledge information systems laboratory and is a professor in the department of systems and information engineering at maebashi institute of technology, japan. he is also an adjunct professor in the international wic institute. he has conducted research in the areas of knowledge discovery and data mining, rough sets and granular-soft computing, web intelligence (wi), intelligent agents, brain informatics, and knowledge information systems, with more than 250 journal and conference publications and 10 books.
he is the editor-in-chief of web intelligence and agent systems and of the annual review of intelligent informatics, an associate editor of ieee transactions on knowledge and data engineering, data and knowledge engineering, and knowledge and information systems, and a member of the editorial board of transactions on rough sets.

key: cord-125089-1lfmqzmc authors: chandrasekhar, arun g.; goldsmith-pinkham, paul; jackson, matthew o.; thau, samuel title: interacting regional policies in containing a disease date: 2020-08-24 journal: nan doi: nan sha: doc_id: 125089 cord_uid: 1lfmqzmc

regional quarantine policies, in which a portion of the population surrounding an infection is locked down, are an important tool to contain disease. however, jurisdictional governments, such as cities, counties, states, and countries, act with minimal coordination across borders. we show that a regional quarantine policy's effectiveness depends upon whether (i) the network of interactions satisfies a balanced-growth condition, (ii) infections have a short delay in detection, and (iii) the government has control over and knowledge of the necessary parts of the network (no leakage of behaviors). as these conditions generally fail to be satisfied, especially when interactions cross borders, we show that substantial improvements are possible if governments are proactive: triggering quarantines in reaction to neighbors' infection rates, in some cases even before infections are detected internally. we also show that even a few lax governments, those that wait for nontrivial internal infection rates before quarantining, impose substantial costs on the whole system. our results illustrate the importance of understanding contagion across policy borders and offer a starting point in designing proactive policies for decentralized jurisdictions. global problems, from climate change to disease control, are hard to address without policy coordination across borders.
in particular, pandemics, like covid-19, are challenging to contain because governments fail to coordinate their efforts. without vaccines or herd immunity, governments have responded to infections by limiting constituents' interactions in areas where an outbreak exceeds a threshold of infections. such regional quarantine policies are used by towns, cities, counties, states, and countries, and trace back to the days of the black plague. over the past 150 years, regional quarantines have been used to combat cholera, diphtheria, typhoid, flus, polio, ebola, and covid-19 [1, 2, 3, 4] , but rarely with coordination across borders. decentralized policies across jurisdictions have two major shortcomings. first, governments care primarily about their own citizens and do not account for how their infections impact other jurisdictions: the resulting lack of coordination can lead to worse overall outcomes than a global policy [5, 6, 7] . second, some governments only pay attention to what goes on within their borders, which leads them to under-forecast their own infection rates. we examine three types of quarantine policies to understand the impact of non-coordination: (i) those controlled by one actor with control of the whole society ("single regime policies"); (ii) those controlled by separate jurisdictions that react only to internal infection rates ("myopic jurisdictional policies"); and (iii) those controlled by separate jurisdictions that are proactive, tracking infections outside their jurisdiction as well as within it when deciding when to quarantine ("proactive jurisdictional policies"). we use a general model of contagion through a network to study these policies. we first consider single regime policies. a government can quarantine everyone at once under a "global quarantine," but such quarantines are very costly (e.g., lost days of work).
less costly (in the short run), and hence more common, alternatives are "regional quarantines" in which only people within some distance of observed infections are quarantined. regional quarantines, however, face two challenges. first, many diseases are difficult to detect, because individuals are either asymptomatically contagious (e.g., hiv, covid-19) [8, 9, 10] , or a government lacks resources to quickly identify infections [11, 12] . second, it may be infeasible to fully quarantine a part of the network, because of difficulties in identifying whom to quarantine or non-compliance by some people -by choice or necessity [13, 14, 15, 16, 17, 18, 19] . either way, tiny leakages can spread the disease. we show that regional quarantines curb the spread of a disease if and only if: (i) there is limited delay in observing infections, (ii) there is sufficient knowledge and control of the network to prevent leakage of infection, and (iii) the network has a certain "balanced-growth" structure. the failure of any of these conditions substantially limits quarantine effectiveness. we then examine jurisdictional policies, which are regional quarantine policies conducted by multiple, uncoordinated regimes. the regions that need to be quarantined cross borders, leading to leakage that limits their effectiveness. as we show, myopic policies do much worse than proactive ones, as they do not forecast the impact of neighboring infection rates on their own population. moreover, a few lax jurisdictions, which wait for higher infection rates before quarantining, substantially worsen outcomes for all jurisdictions. consider a large network of nodes (individuals). our theory is asymptotic, applying as the population grows (details in the si). an infectious disease begins with an infection of a node i 0 , the location of which is known, and expands via (directed) paths from i 0 . 
in each discrete time period, the infection spreads from each currently infected node to each of its susceptible contacts independently with probability p. a node is infectious for θ periods, after which it recovers and is no longer susceptible, though our results extend to the case in which a node can become susceptible again. the disease may exhibit a delay of τ ≤ θ periods during which an infected and contagious person does not test positive. this can be a period of asymptomatic infectiousness, a delay in testing, or limited healthcare access [8, 10, 20, 11, 12, 21] . after that delay, each infected node's infection is detected with probability α < 1 (for simplicity, in the first period after the delay). α incorporates testing accuracy, availability, and decisions to test. this framework nests the susceptible-infected-recovered (sir) model and its variations including exposure, multiple infectious stages, and death [22, 23, 24, 18, 25] , agent-based models [26, 27, 28] , and others. we begin by analyzing a single jurisdiction with complete control. a (k, x)-regional policy is triggered once x or more infections are observed within distance k from the seed node i_0; at that point it quarantines all nodes within distance k + 1 of the seed for θ periods. this captures a commonly used policy in which regions exposed to the disease are shut down in response to detection. we give the policymaker the advantage of knowing which node is the seed and study subsequent containment efforts. in practice, estimating the infection's origin is an additional challenge. whether a regional policy halts infection in this setting is fully characterized by what we call growth-balance (formally defined in the si). this requires that the network have large enough expansion properties and that the expansion rate not drop too low in any part of the network.
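the (k, x)-regional policy just described can be sketched in a few lines. the following is a minimal illustration, not the paper's code: it measures directed graph distance from the known seed by breadth-first search and, once at least x non-seed infections are detected within distance k, returns the distance-(k + 1) neighborhood to be quarantined. the adjacency-dict representation and function names are assumptions.

```python
from collections import deque

def bfs_distances(adj, seed):
    """Distance from the seed to every reachable node (directed BFS)."""
    dist = {seed: 0}
    q = deque([seed])
    while q:
        u = q.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def regional_quarantine(adj, seed, detected, k, x):
    """(k, x)-regional policy: if at least x detected infections (other
    than the seed) lie within distance k of the seed, quarantine every
    node within distance k + 1.  Returns the quarantined set (empty if
    the policy is not triggered)."""
    dist = bfs_distances(adj, seed)
    hits = sum(1 for v in detected if v != seed and dist.get(v, k + 1) <= k)
    if hits >= x:
        return {v for v, d in dist.items() if d <= k + 1}
    return set()
```

on a directed path 0 → 1 → 2 → 3 → 4 with k = 2 and x = 1, a detection at node 2 triggers a quarantine of nodes within distance 3, while a detection at node 4 (beyond distance k) triggers nothing.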
to better understand growth-balance, consider an example of a disease that is beginning to spread with a reproduction number of 3.5 and such that one in ten cases are detected in a timely manner (α = 0.1). first, consider a part of the network in which each infected person infects 3.5 others on average. if we monitor nodes within distance k = 3 of an infected node, a "typical" chain of infection would lead to roughly 3.5 + 3.5 2 + 3.5 3 = 58.625 expected cases. the chance that this goes undetected is tiny: 0.9 58.625 = 0.002. next, suppose the infection starts in part of the network where each infected person infects just one other, on average. now a chain of depth 3 leads to 1 + 1 + 1 = 3 infections. the chance that this spread remains undetected is very high: 0.9 3 = 0.72. many different networks can lead to the same average reproduction number, but have very different structures. if the distribution of reproduction numbers around the network has no pockets in which it is too low, then it is highly likely that any early infection will be detected before it goes beyond a distance of three away from the first infected node. if instead, the distribution of reproduction numbers gives a nontrivial chance that the disease starts out on a chain with lower reproduction numbers, like the 1, 1, 1, chain, then there is a high chance that it can travel several steps before being detected. given the short distances in many networks [29, 30, 31] , this allows it to be almost anywhere. supplementary figure c.1 in the si pictures a network that has a high average reproduction number, but is not growth-balanced and allows the infection to travel far from the initially infected node without detection. in the si (theorem 1) we prove that, with no delays in detection and no leakage, a (k, x)regional policy halts infection among all nodes beyond distance k +1 from i 0 with probability approaching 1 (as the population grows) if and only if the network satisfies growth-balance. 
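the back-of-envelope numbers above can be checked directly. this sketch treats the expected exposures within k steps as independent detection trials, which reproduces the 58.625-case and 0.9^3 calculations in the text (a mean-field approximation, not an exact branching-process computation).

```python
def expected_exposures(r, k):
    """Expected number of cases within k steps of the seed when each
    infected person infects r others on average: r + r^2 + ... + r^k."""
    return sum(r ** j for j in range(1, k + 1))

def undetected_probability(r, k, alpha):
    """Chance that an outbreak of that expected size escapes detection,
    each case being detected independently with probability alpha."""
    return (1.0 - alpha) ** expected_exposures(r, k)

print(expected_exposures(3.5, 3))           # 58.625 expected cases
print(undetected_probability(3.5, 3, 0.1))  # ~0.002: almost surely caught
print(undetected_probability(1.0, 3, 0.1))  # ~0.729: likely missed
```

the contrast between the two chains is exactly the growth-balance concern: the same monitoring radius k = 3 is nearly certain to catch an outbreak on the expanding chain and very likely to miss one on the thin chain.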
growth-balance is satisfied by many, but not all, sequences of random graph models, provided that the average degree d satisfies d k → ∞ (corollary 1, si). without growth balance, a regional policy fails non-trivially even under idealized conditions. the effectiveness of a regional policy breaks down, even if a network is growth-balanced, once there is leakage (due to imperfect information, enforcement, or jurisdictional boundaries) or sufficient delay in detection. to understand how delays in detection affect a regional policy, consider two extremes. if the delay is short relative to the infectious period, the policymaker can still anticipate the disease and adjust by enlarging the area of the quarantine to include a buffer. an easy extension of the above theorem is that a regional policy with a buffer works if and only if the network is growth-balanced and the delay in detection is shorter than the diameter of the network (theorem 2, si). given that real-world networks have short average distances between nodes [32], non-trivial delays in detection allow the disease to escape a regional quarantine. next, we consider how leakage -inability to limit interactions [13] or mistakes in identifying portions of a network to quarantine [17, 18] -diminishes the effectiveness of regional policies. although minimizing leakage increases the chance that a regional quarantine will be successful, we show (theorem 3, si) that even a small amount of leakage leads to a nontrivial probability that a regional policy will fail. we can use the results from regional quarantines as a starting point to understand jurisdictional policies. for instance, leakage generally applies when interactions cross jurisdictions. figure 1 pictures two jurisdictions that fail to nicely tessellate the network. given leakage across jurisdictional borders, unless policies are fully coordinated across jurisdictions, our theoretical results indicate that they will fail to contain infections. 
the theory provides insights into the various hurdles that quarantine policies face, but does not provide insight into how well different types of policies will fare in slowing infection and at what cost. to explore this, we simulate a contagion on a network of 140000 nodes that mimics real-world data [33, 34, 35, 21] . these simulations illustrate our theoretical results and also show the improvements that proactive policies provide relative to myopic ones. the results are robust to choices of parameters (si). the network is divided into 40 locations, each with a population of 3500. we generate the network using a geographic stochastic block model (si). the probability of interacting declines with distance. the average degree is 20.49, and nodes have 79.08% of their interactions within their own locations and 20.92% outside of their location (calibrated to data from india and the united states, including data collected during covid-19 [33, 34, 35, 21] , si). we set the basic reproduction rate r_0 = 3.5 to mimic covid-19 [36] , and θ = 5 and α = 0.1 ([20, 37, 38] , si). the simulated network is fairly symmetric in degree and therefore approximately satisfies growth-balance, so our simulations focus on leakage and detection delay. before introducing jurisdictions, we first illustrate the effects of leakage and of delays in detection. in figure 2 , the entire network is governed by a single policymaker using a (k, x) = (3, 1)-regional quarantine. figure 2a shows the outcomes with no delay in detection and no leakage. consistent with theorem 1, the policy is effective: on average 277 nodes per million are infected (0.028% of the population), with 803956 node-days of quarantine per million nodes. figure 2b introduces a delay in detection. with a delay of τ = 3, infections increase, with 2256 nodes per million eventually infected (0.23% of the population) and 2301414 node-days of quarantine per million nodes.
adding a buffer to correspond to the detection delay effectively makes the regional policy global, as the buffered region contains 99.98% of the population on average. figure 2c adds leakage to the setup of figure 2b , with only 95% of the intended nodes quarantined. the number of cumulative infections per million nodes increases to 5138 (0.50% of the population), and the leakage increases the number of quarantined node-days to 6478055 per million nodes. we now introduce jurisdictions to the same network as before, with each location becoming its own jurisdiction. we compare two types of jurisdictional policies. under myopic policies, each jurisdiction quarantines based entirely on internal infections. under proactive policies, jurisdictions track infections in other jurisdictions, predict their own (possibly undetected) infections, and base their quarantines on those predictions (calculation details in si). in both cases, if a jurisdiction enters quarantine, all links within and to the jurisdiction are removed. figure 3 illustrates the improvement a proactive policy offers relative to myopic internal jurisdictional policies (figure 3a ). finally, we also add a few "lax" jurisdictions to the setting. these are jurisdictions that are myopic and have a high threshold of internal infections before quarantining. we examine how these few lax jurisdictions worsen the outcomes for all jurisdictions. in figures 3c and 3d , four lax jurisdictions react only to infections within their own borders and wait until they have detected five infections before quarantining (simulation details, si). as the figures show, global quarantines (closing the entire network at once) and single-jurisdiction regional quarantines (with leakage) do the best on both dimensions. once jurisdictions are introduced, proactive jurisdictions quarantine earlier and have fewer recurrences than myopic jurisdictions. lax jurisdictions cause an overall higher number of quarantines, over a longer time.
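the distinction between myopic and proactive triggers can be made concrete with a toy decision rule. the paper's actual predictor is specified in its si; the sketch below is a hypothetical stand-in that scales detected counts up by the detection rate α and adds neighbors' scaled detections weighted by cross-border link shares (the function names and weighting scheme are assumptions).

```python
def predicted_infections(own_detected, neighbor_detected, link_shares,
                         alpha=0.1):
    """Estimate a jurisdiction's true infection count: scale detected
    cases up by the detection rate alpha, plus imported risk from each
    neighbouring jurisdiction, weighted by the share of links that cross
    that border."""
    own = own_detected / alpha
    imported = sum(w * d / alpha
                   for w, d in zip(link_shares, neighbor_detected))
    return own + imported

def should_quarantine(own_detected, neighbor_detected, link_shares,
                      threshold=1.0, alpha=0.1, proactive=True):
    """Myopic rule: react to internal detections only.
    Proactive rule: react to the predicted (possibly undetected) count."""
    if not proactive:
        return own_detected >= threshold
    return predicted_infections(own_detected, neighbor_detected,
                                link_shares, alpha) >= threshold
```

with no internal detections but two detected cases next door and a 10% cross-border link share, the proactive rule already triggers (predicted count 2.0), while the myopic rule keeps waiting, which is the timing gap the simulations quantify.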
the proactive jurisdictional policy trades off more quarantine days for substantially fewer infection days compared with the myopic internal policy, but proactive policies do significantly better than myopic policies on both dimensions when mixed with jurisdictions using lax policies. figure 4b plots the number of person-day infections (per million) against the number of person-day quarantines (per million) for six key policy scenarios. the global policy does the best on both dimensions, and the second best is the single-jurisdiction myopic strategy (which does worse than the global policy because of leakage). with 40 jurisdictions, both proactive policies outperform the internal, myopic policies. by far the worst, on both dimensions, is the internal, myopic policy with some lax jurisdictions. these results come from the same simulations that produce figures 2 and 3. we have shown that regional quarantine policies are likely to fail unless leakage and delays in detection are limited. multiple jurisdictions using independent policies are even less effective, as leakage occurs across jurisdictional borders. we have also shown that there are substantial improvements from proactive policies, and that a few lax jurisdictions greatly worsen the outcomes for all jurisdictions. jurisdictional policies tend to be aimed at the welfare of their internal populations, yet the external effects are large. our results underscore the importance of timely information sharing and coordination in both the design and execution of policies across jurisdictional boundaries [39] . the results also underscore the global importance of aiding poor jurisdictions. indeed, there is mounting evidence that a lack of coordination across boundaries has been damaging in the case of covid-19 [6] .
the use of masks (decreasing p), social distancing (decreasing d), and increased testing (increasing α) all help attenuate contagion, but unless they keep the reproduction number below one, the problems identified here remain. even tiny fractions of interactions across boundaries are enough to lead to spreading in large populations. with modern inter- and intranational trade being a sizable portion of all economies, such interaction is difficult to avoid. nonetheless, our analysis offers insights into managing infections at smaller scales, e.g., within schools, sports, and businesses. by creating a network of interactions that is highly modular, keeping cross-module interactions to a minimum and making them highly traceable, together with aggressive testing (especially of cross-module actors), one can come close to satisfying the conditions of our first theorem. our results also suggest caution in using statistical models to identify regions to quarantine. although contagion models are helpful for informing policy about the magnitude of an epidemic and its broad dynamics, the models can give false comfort in our ability to engage in highly targeted policies, whose results can be influenced by small deviations from idealized assumptions. our growth-balance condition also points out that not all parts of a network are equal in their potential for undetected transmission: in places where the reproduction number is lower, so is the probability of observing outbreaks, enabling undetected leakage of infections.

[32] watts, d. j. small worlds: the dynamics of networks between order and randomness (princeton university press, 2004).

supplementary material: interacting regional policies in containing a disease, by chandrasekhar, goldsmith-pinkham, jackson, and thau

the model. there are n > 1 nodes (individuals) in an unweighted, and possibly directed, network. we study the course of a disease through the network. time is discrete, with periods indexed by t ∈ n.
an initial infected node, indexed by i_0 ∈ V, is the only node infected at time 0. we call this node the seed. we track the network via neighborhoods that expand outward via (directed) paths from i_0. let N_k be the set of all nodes at (directed) distance k from node i_0, and let n_k denote the cardinality of N_k. for any node j ∈ N_k', with k' < k, let n_j be the number of its direct descendants and n_j^k the number of its (possibly indirect) descendants in N_k that are reached without ever passing beyond distance k from i_0. all unweighted network models are admitted here. additionally, all results extend directly to any weighted model in which weights are bounded above and below (e.g., probabilities of interaction). note also that the network can be directed or undirected. the infection process proceeds as follows. in every time period t ∈ {1, 2, . . .}, an infected node i transmits the disease to each of i's neighbors independently with probability p. a newly infected node is infectious for θ ≥ 1 periods, after which it recovers and is never again infectious. the model can easily be extended to accommodate renewed susceptibility. there may be a delay in the ability to detect the disease. the number of periods of delay is given by τ, with 0 ≤ τ ≤ θ. delay is a general term that can capture many things; for example, it can correspond to (a) asymptomatic infectiousness, (b) a delay in accessing health care after the onset of an infectious period, (c) any delay in the administration of testing, and so on. in the first period of an infected node's infectious period after the delay, there is a probability α that the policymaker detects it as being infected. so potential detection happens exactly once, during the first period in which the node can be detected. detection is independently and identically distributed across nodes. our results are easily extended to allow a random period for detection after the delay.
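the process defined above translates directly into code. the following is a minimal sketch of the si's infection process, not the authors' simulation code: per-period transmission with probability p along out-edges, an infectious window of θ periods, and a single detection draw with probability α in the first period after the τ-period delay. the adjacency-dict representation is an assumption.

```python
import random

def simulate(adj, seed, p, theta, tau, alpha, steps, rng=None):
    """Discrete-time spread on a (directed) graph: each infected node
    transmits to each susceptible out-neighbour with probability p per
    period, is infectious for theta periods, and faces one detection
    draw (probability alpha) in the first period after the tau delay."""
    rng = rng or random.Random(0)
    infected_at = {seed: 0}              # node -> period it was infected
    recovered, detected = set(), set()
    for t in range(1, steps + 1):
        newly = {}
        for u, t0 in infected_at.items():
            if t - t0 > theta:
                continue                  # past the infectious window
            for v in adj.get(u, ()):
                if v not in infected_at and v not in recovered and v not in newly:
                    if rng.random() < p:
                        newly[v] = t
        for u, t0 in list(infected_at.items()):
            if t - t0 == tau + 1 and rng.random() < alpha:
                detected.add(u)           # the single detection draw
            if t - t0 >= theta:
                recovered.add(u)
                del infected_at[u]
        infected_at.update(newly)
    return infected_at, recovered, detected
```

running this on the path graph 0 → 1 → 2 → 3 with p = 1 infects every node within a few periods; setting α = 0 shows the same spread going entirely undetected, which is the regime in which quarantine triggers never fire.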
finally, the policymaker may face some error in their knowledge of the network. this can come from an inability to enforce exactly the interactions they wish to allow or limit, from random variation in the data collected to estimate interaction networks, or from misspecification. if there is error, we track a share of nodes that are within a k-neighborhood of the seed but are estimated by the policymaker to be outside the k-neighborhood. a regional policy of distance k and threshold x is such that once at least x infections (other than the seed) are detected within distance k of the initial seed, all nodes within distance k + 1 of i_0 are quarantined for at least θ periods. a quarantine means that all connections between nodes are severed to avoid any further transmission; the infection waits out its duration θ and dies out. implicit in this definition is that a quarantine is not instantaneous: infected people could have infected their neighbors before being shut down, which is why the nodes at distance k + 1 are quarantined. all the results below extend if we assume that it is instantaneous, but with quarantines moved back one step and path lengths in the definitions correspondingly adjusted. we have assumed, for simplicity, that the policymaker knows the seed, which in reality may take some time to identify. this provides an advantage to the policymaker, yet we see substantial containment failures despite this advantage. in order to conduct asymptotic analysis, a useful device for studying the probabilities of the events in question in large networks, we study a sequence of networks g(n) with n → ∞ and an associated sequence of parameters (α, p, τ, θ, k) = (α(n), p(n), τ(n), θ(n), k(n)). in what follows we drop the index n; it is implied unless otherwise stated. consider a network and a distance k from the initially infected node i_0. a path of potential infection to k + 2 is a sequence of nodes i_0, i_1, . . .
, i_ℓ with i_ℓ ∈ N_{k+1}, i_{j+1} being a direct descendant of i_j for each j ∈ {0, . . . , ℓ − 1}, and for which i_ℓ has a descendant in N_{k+2}. consider a sequence of networks and k(n)s. we say that there are bounded paths of potential infection to k(n) + 2 if there exists some finite m' such that for each n there is a path of potential infection to k(n) + 2, i_0, i_1, . . . , i_ℓ, of length less than m', with n_j < m' for every j ∈ {0, . . . , ℓ − 2}. we say that a sequence of networks is growth-balanced relative to some k(n) if there are no bounded paths of potential infection to k(n) + 2. growth-balance is essentially a condition that requires a minimum bound on expansion along all paths from the initial infection. the intuition behind the condition is clear: in order to be sure to detect an infection within distance k of the seed, it must be that many of the nodes within distance k have been exposed to the disease by the time it reaches distance k. what is ruled out is a relatively short path that gets directly to that distance without exposing many nodes along the way. supplementary figure c.1 depicts such a network: if the infection happens to pass through a high-expansion part of the network, it is much more likely to be detected; however, that only happens with some moderate probability in this network, and so growth balance fails. we begin with a benchmark case in which there is no delay in detection (τ = 0) and the policymaker can completely enforce a quarantine at some distance k + 1. we allow the size of the quarantine region k to depend on n in any way, as the theorem still applies. we work with an arbitrary but fixed threshold x, in order to allow infections to be detected. what is important is that x not grow too rapidly, as otherwise there is no chance of observing that many infections within some distance of the seed. theorem 1.
consider any sequence of networks and associated k(n) < k̄(n) − 1, where k̄(n) is the maximum k for which n_k > 0, such that each node in N_{k(n)+1} has at least one descendant at distance k(n) + 2, and let x be any fixed positive integer. let the sequence of associated diseases have α(n) and p(n) bounded away from 0 and 1, no delay in detection, and any θ(n) ≥ 1. a regional quarantining policy of distance k(n) and threshold x halts all infections past distance k(n) + 1 with probability tending to 1 if and only if the sequence is growth-balanced with respect to k(n). note that the growth-balance condition implies that the number of nodes within distance k(n) of i_0 must grow without bound. theorem 1 thus implies that in order for a regional policy to work, the region must grow without bound and must also satisfy a particular balance condition. proof of theorem 1. to prove the first part, note that if the infection never reaches distance k then the result holds directly, since it can then not go beyond k + 1. we show that if the sequence of networks is growth-balanced relative to k, then conditional upon an infection reaching level k with the possibility of reaching k + 2 within two periods, the probability that it infects more than x nodes within distance k before any nodes beyond k tends to 1. suppose that the infection reaches some node at distance k that can reach a node in N_{k+1}. consider the corresponding sequence of paths of infected nodes i_0, i_1, . . . , i_ℓ, with i_ℓ ∈ N_{k+1} and i_{j+1} a direct descendant of i_j for each j ∈ {0, . . . , ℓ − 1}, and note that by assumption i_ℓ has a descendant in N_{k+2}. by the growth-balance condition, for any m', there is a large enough n for which either the length of the path is longer than m' or else there is at least one i_j with j ≤ ℓ − 2 along the path that has more than m' descendants.
in the latter case, the probability that i_j has more than x descendants who become infected and are detected is at least 1 − F_{M,m}(x), where F_{M,m} is the binomial distribution with M draws each with success probability m, where pα > m for some fixed m > 0. given that x and m are fixed, this tends to probability 1 as M grows. in the former case, the sequence exceeds length M, all of which are infected, and so, given that α is bounded below, the probability that at least x of them are detected goes to 1 as M grows. in both cases, as n grows, the minimal M across such paths of potential infection to k + 1 grows without bound, and so the probability that there are at least x infections that are detected by the time that i_{ℓ−1} is reached tends to 1 as n grows. to prove the converse, suppose that the network is not growth-balanced. consider a sequence of bounded paths of potential infection to k + 2, with associated sequences of nodes i_0, i_1, . . . , i_ℓ of length less than M with i_ℓ ∈ N_{k+1}, i_{j+1} being a direct descendant of i_j for each j ∈ {0, . . . , ℓ − 1}, with n_{i_j} < M for every j ∈ {0, . . . , ℓ − 2}, and for which i_ℓ has a descendant in N_{k+2}. the probability that each of the nodes i_1, . . . , i_{ℓ−2} becomes infected, that no other nodes are infected within distance k − 1, and that all infected nodes are undetected is at least (p(1 − α)(1 − p)^M)^M. this is fixed and so bounded away from 0. this implies that the probability that the infection gets to nodes at distance k, and to i_{ℓ−1} in particular, without any detections is bounded below. thus, there is a probability bounded below of reaching i_{ℓ−1} before any detections, and then, by the time the quarantine is enacted, there is at least p times this probability that it escapes past N_{k+1}, which is thus also bounded away from 0. (footnotes: if x were to grow, the growth-balance condition becomes more complicated, as the M in that definition adjusts with the rate of growth of x; if k(n) were not below K(n) − 1, the policy would actually be a global one; the cases of p or α equal to 1 are degenerate.)
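the binomial bound in the first part of the proof can be made concrete. the sketch below (the threshold x and per-draw probability m are illustrative values of our own choosing) computes 1 − F_{M,m}(x), the probability that more than x of M exposed descendants are infected and detected; for fixed x and m this tends to 1 as M grows.

```python
from math import comb

def binom_cdf(x, n, p):
    """F_{n,p}(x): probability of at most x successes in n bernoulli(p) draws."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(x + 1))

x, m = 5, 0.1  # detection threshold and per-draw probability (assumed values)
for M in (50, 200, 1000):
    print(M, 1 - binom_cdf(x, M, m))
```

the printed probabilities increase toward 1 with M, which is exactly what the proof exploits: a node with many descendants cannot be passed through without triggering many detections.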
we note that theorem 1 admits essentially all sequences of (unweighted) networks. thus, for every type of network, one can determine whether a regional policy of some k, x will succeed or fail. the only thing that one needs to check is growth balance. if it is satisfied, a regional policy works; otherwise it will fail with nontrivial probability. the following corollary details the implications of the theorem for some prominent random network models. 1. for a sequence of block models (with erdős–rényi as a special case), a regional policy with a bounded k has a probability going to 1 of halting the disease on the randomly realized network if and only if the seed node's expected out-degree d is such that d^k → ∞. 2. for a regular expander graph with out-degree d, a regional policy works if and only if the expansion rate d^k → ∞. 3. for a regular lattice of degree d, a regional policy works if and only if d^k → ∞. 4. for a rewired lattice with a fraction of links that are randomly rewired, a regional policy with a bounded k has a probability going to 1 of halting the disease on the randomly realized network if and only if d^k → ∞. 5. for a sequence of random networks with a scale-free degree distribution, a regional policy works (with probability 1) if and only if k → ∞. thus, for a regional policy to work in almost any network model, either the degree of almost all nodes must grow without bound, or else the size of the quarantine must grow without bound. for a scale-free distribution, there is always a nontrivial probability on small degrees, and hence in order for a regional policy to work, the size of the neighborhood must grow without bound. in practice, even very sparse networks will have a large d^k (e.g., if people have hundreds of contacts, 100^3 is already a million, and even with a very low α many infections will be detected within a few steps of the initial node).
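the back-of-the-envelope arithmetic behind the corollary's discussion is easy to reproduce: with d contacts per node, roughly d^k nodes sit within k steps of the seed, so even moderate d makes d^k enormous, while d = 2 leaves it tiny.

```python
# neighborhood size d**k for a few contact counts d and radii k
for d in (2, 10, 100):
    print(d, [d**k for k in (1, 2, 3)])
# with hundreds of contacts, 100**3 is already a million;
# with d = 2, 2**3 = 8, so all 8 infections can plausibly go undetected
```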
what the growth-balance condition rules out is that some nontrivial part of the network has neighborhoods with many fewer contacts, so there cannot be people who have just a few contacts, since that allows for a nontrivial probability of undetected escape (e.g., 2^3 = 8, and so with only 8 infections it is possible that none are detected and the disease escapes beyond 3 steps). as many real-world network structures have substantial heterogeneity, with some people having very low numbers of interactions, such an escape becomes possible even under idealized assumptions of no delay in detection and no leakage [41, 42, 43, 44, 45]. the detection delay, τ, is distributed over the support {1, . . . , τ_max}. this includes degenerate distributions, with τ_max being the maximal value of the support with positive mass. the policymaker may or may not know τ_max, and we study both cases. the latter is important, as in practice delay periods are estimated, so there is bound to be uncertainty. when τ is known, we can simply say τ = τ_max. let a regional policy with trigger k, threshold x, and buffer h be such that once there are at least x infections detected within distance k + h from the initial seed, then all nodes within distance k + h + 1 of i_0 are quarantined/locked down for at least θ periods. there are two differences between this definition of a regional policy and the one considered before. first, it is triggered by infections within distance k + h (not within distance k), and second, it has a buffer in how far the quarantine extends beyond the k-th neighborhood. we extend the definition of growth balance to account for buffers. consider a network, a distance k from the initially infected node i_0, and an h ≥ 1. a path of potential infection to k + h + 2 is a sequence of nodes i_0, i_1, . . . , i_ℓ with i_ℓ ∈ N_{k+h+1}, i_{j+1} being a direct descendant of i_j for each j ∈ {0, . . . , ℓ − 1}. consider a sequence of networks, indexed by n, and associated k(n), h(n).
we say that there are bounded paths of potential infection to k(n) + h(n) + 2 if there exists some finite M such that for each n there is a path of potential infection to k + h + 2, i_0, i_1, . . . , i_ℓ of length less than M, with n_{i_j} < M for every j ∈ {0, . . . , ℓ − h − 2}. we say that a sequence of networks is growth-balanced relative to some k(n) and buffers h(n) if there are no bounded paths of potential infection to k(n) + h(n) + 2. theorem 2. consider any sequence of networks and k(n) < K(n) − h − 1, where K(n) is the maximum k for which N_k > 0, such that each node in N_{k'} for k' > k(n) has at least one descendant at distance k' + 1, and let x be any fixed positive integer. let the sequence of associated diseases have α(n) and p(n) bounded away from 0 and 1, θ(n) ≥ 1, and a detection delay distributed over some set {1, . . . , τ_max} with τ_max > 1 (with the probability on τ_max bounded away from 0). a regional policy with trigger k(n), threshold x, and buffer τ_max halts all infections past distance k(n) + τ_max + 1 with a probability tending to 1 if and only if the sequence is growth-balanced with respect to k(n). the proof of theorem 2 is a straightforward extension of the previous proof and so is omitted. this result shows several things. first, if the detection delay is small relative to the diameter of the graph, one can use a regional quarantine policy, adjusted for the detection delay, along the lines of that from theorem 1 and ensure no further spread. this is true even if the delay is stochastic, as long as the upper bound is known to be small. second, and in contrast, if the detection delay is large compared to the diameter of the graph, then a regional policy is insufficient. by the time infections are observed, it is too late to quarantine a subset of the graph. this condition will tend to bind in the case of real-world networks, as they exhibit small-world properties and have small diameters [30, 31].
as a result, even short detection delays may correspond to rapidly moving wavefronts that spread undetected. next we turn to the case in which there is some leakage in the quarantine, which may arise for a variety of reasons. first, the policymaker may have measurement error in knowing the structure of the network and who should be quarantined. second, and distinctly, the policymaker may be unable to control some nodes or interactions. third, the network may leak across jurisdictions, and some nodes within distance k of i_0 may be outside of the policymaker's jurisdiction. to keep the analysis uncluttered, we assume no detection delay, but the arguments extend directly to the delay case with the appropriate buffer. theorem 3. consider any sequence of networks. let the sequence of associated diseases have α and p bounded away from 0 and 1, and be such that θ ≥ 1, with no detection delay. consider any k(n) < K − 1, where K is the maximum k for which N_k > 0, suppose that each node in N_{k(n)} has at least one descendant at distance k(n) + 1, and let x be any positive integer. suppose that a random share ε_n of nodes within distance k of i_0 are not included in a regional quarantine policy and are connected to nodes at distance greater than k + 1, because of a lack of jurisdiction, misclassification by a policymaker, or lack of complete control over people's behaviors. then: 1. if ε_n = o((Σ_{k' ≤ k} N_{k'})^{−1}) and the network is growth-balanced, then a regional policy of distance k and threshold x halts all infections past distance k + 1 with a probability tending to 1. 2. if ε_n ≥ min[1/x, η] for all n, for some η > 0, or the network is not growth-balanced, then a regional policy of distance k(n) and threshold x fails to halt all infections past distance k(n) + 1 with a probability bounded away from 0. proof of theorem 3.
part 1 follows from the fact that if ε_n = o((Σ_{k' ≤ k} N_{k'})^{−1}), then the probability of having all nodes in N_k correctly identified as being in N_k tends to 1, and then theorem 1 can be applied. for part 2, suppose that some x infections are detected. the probability that at least one of them is misclassified is at least 1 − (1 − ε_n)^x. given that ε_n ≥ min[1/x, η] for some η > 0, it follows that (1 − ε_n)^x is bounded away from 1. there is a probability bounded away from 0 that at least one of the infected nodes is misclassified, not subject to the quarantine, and connected to a node outside of distance k + 1. the theorem implies that the effectiveness of a regional policy is sensitive to any small fixed amount ε of leakage. to illustrate the processes described in the main text, we run several simulations. first, we construct a large network with many jurisdictions. we directly study the content of the theorems with several versions of (k, x) quarantines with an sir infection process on a network. we use the same process and network to show the issues with regional containment, studying regional and adaptive policies. we model real-world network structure as follows. 1. there are l locations distributed uniformly at random on the unit sphere. each location has a population of m nodes, with a total of n = ml nodes in the network. 2. the linking rates across locations are given as in a spatial model [41, 48]: the probability that nodes i ∈ ℓ and j ∈ ℓ' link, for locations ℓ ≠ ℓ', depends only on the locations of the two nodes and declines in dist(ℓ, ℓ'), the distance between the two locations on the sphere, with parameters a, b < 0. every interaction between every pair of nodes is drawn independently from the observed spatial distribution, with distances being along the surface of the unit sphere. 3. the linking patterns within a location are given as in a mixture of random geometric (rgg) [42] and erdős–rényi (er) random graphs [49].
specifically, as spheres are locally euclidean, we model nodes in a location (e.g., in a city) as residing in a square in the tangent space to the location. the probability that two nodes within a location link declines in their distance in this square. we set d_rgg as the desired degree from the rgg. nodes are uniformly distributed on the unit square [0, 1]^2, and links are formed between nodes within radius r [42]. to obtain the desired degree, we set r so that the expected rgg degree matches d_rgg (ignoring boundary effects, the expected degree is (m − 1)πr^2). the remaining links within location ℓ are drawn identically and independently with a probability chosen so that d_ℓ is the desired average degree for all nodes within location ℓ. 4. next, we uniformly add links to create a small-world effect, with identically and independently distributed probability s = 1/(cn), where c is an arbitrary constant and n is the total number of nodes in the network [29]. 5. finally, we designate a single location as a "hub," to emulate the idea that certain metro areas may have more connections to all other regions. to do so, we select a hub uniformly at random and add links independently and identically distributed with probability h from the hub location to every other location. we take l = 40 and m = 3500 for all locations. we set a = −4 and b = −15. next, we set d = 15.5 and d_rgg = 13.5 for all locations. next, we set c = 2. finally, we set h = 2.85 × 10^−6. this process results in a graph that emulates real-world networks in the united states and india [33, 34, 35, 21]. this includes data from india during the covid-19 lockdowns about interactions within six feet, meaning that it is conservative [21]. we fix one graph to use in all versions of the simulations. the network we generate is sparse, clustered, and has small average distances, as in real-world data. finally, we recalculate the connection probability matrix to accurately reflect rates of connection across regions, which we call q.
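the construction can be sketched at toy scale as follows. everything here is a hedged approximation: the sizes are far smaller than the paper's l = 40, m = 3500, the cross-location kernel exp(a + b·dist) is an assumption (the text only says the probability declines in distance with parameters a, b < 0), and the hub step is omitted for brevity.

```python
import math, random

random.seed(0)

def sphere_point():
    """uniform point on the unit sphere."""
    z = random.uniform(-1, 1)
    t = random.uniform(0, 2 * math.pi)
    r = math.sqrt(1 - z * z)
    return (r * math.cos(t), r * math.sin(t), z)

def great_circle(u, v):
    return math.acos(max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v)))))

L, m = 5, 40                       # toy sizes (the paper uses l = 40, m = 3500)
locs = [sphere_point() for _ in range(L)]
pos = [(random.random(), random.random()) for _ in range(L * m)]  # tangent square
edges = set()
def link(i, j): edges.add((min(i, j), max(i, j)))

# step 3: within-location rgg (radius chosen so the expected degree is d_rgg,
# ignoring boundary effects) plus an er overlay for the remaining degree
d_rgg, d_total = 4.0, 5.0
r = math.sqrt(d_rgg / ((m - 1) * math.pi))
p_er = (d_total - d_rgg) / (m - 1)
for l in range(L):
    for a in range(m):
        for b in range(a + 1, m):
            i, j = l * m + a, l * m + b
            dx, dy = pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]
            if dx * dx + dy * dy <= r * r or random.random() < p_er:
                link(i, j)

# step 2: cross-location links decline in spherical distance (assumed kernel)
a_par, b_par = -4.0, -15.0
for l1 in range(L):
    for l2 in range(l1 + 1, L):
        p_cross = math.exp(a_par + b_par * great_circle(locs[l1], locs[l2]))
        for a in range(m):
            for b in range(m):
                if random.random() < p_cross:
                    link(l1 * m + a, l2 * m + b)

# step 4: small-world overlay, each pair linked with probability 1/(c*n)
n, c = L * m, 2
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < 1 / (c * n):
            link(i, j)

print(len(edges), 2 * len(edges) / n)  # edge count and mean degree
```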
we set parameters as follows: the duration of infection is θ = 5, the detection delay (when incorporated) is τ = 3, and thresholds x for quarantine are set based on the simulation type. we set the transmission probability p as a function of r_0 and the mean degree d̄, taking r_0 = 3.5 based on estimates of covid-19 [36]. cases are detected i.i.d. with rate α, which we define as α = p(symptomatic) · p(tested) · p(test positive | truly positive). we take the symptomatic rate as 43.2% [56] and the power of the test as 79% [57]. following estimates from the literature (5-15%), we set α = 0.1 [37, 38]; this corresponds to a testing rate of roughly 1 in 3, since 0.432 × (1/3) × 0.79 ≈ 0.11. in the simulations, each node is either detected or not during the first period in which it can be detected, and no information comes after that. when τ = 0, any detection occurs as soon as the node is infected, and when τ > 0 this happens in the (τ + 1)-th period of infection. as outlined in the main text, we begin by using θ = 5 and τ = 3 [20, 37, 38, 61]. each time period in the simulation progresses in four parts, which happen sequentially. the simulations run as follows: 1. the policymaker sees the detected infections from the previous period and calculates whether a quarantine is necessary in the next period. 2. the disease progresses for a period. this includes new infections and recoveries. 3. infected nodes that have just finished their detection delay of τ periods are independently detected with probability α. 4. new quarantines are enacted based on decisions made in part one of the process. quarantines that have lasted for θ periods end. a node that becomes infected in period t, with a detection delay of τ and total disease length θ, is tested in period t + τ, results are processed in t + τ + 1, and it will be quarantined (if necessary) starting at the end of t + τ + 1 (under the fourth item above). this means that it has τ + 1 time periods during which it can infect other nodes.
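the four sequential parts of a period can be sketched as follows. the state representation, field names, and the policy interface below are our own assumptions for illustration, not the paper's code.

```python
import random

def step(t, nodes, adj, p, alpha, tau, theta, policy):
    """one simulation period: plan, progress, detect, enact/expire."""
    # 1. policymaker reads last period's detections and plans quarantines
    planned = policy(nodes, t)
    # 2. disease progresses: infections along edges, then recoveries
    newly = {v for u in nodes if nodes[u]["state"] == "I" and not nodes[u]["q"]
               for v in adj[u]
               if nodes[v]["state"] == "S" and not nodes[v]["q"]
               and random.random() < p}
    for u in nodes:
        if nodes[u]["state"] == "I" and t - nodes[u]["t_inf"] >= theta:
            nodes[u]["state"] = "R"  # recovery after theta periods
    for v in newly:
        nodes[v]["state"], nodes[v]["t_inf"] = "I", t
    # 3. nodes just past their detection delay are detected w.p. alpha,
    #    once only (no further information after the first chance)
    for u in nodes:
        if (nodes[u]["state"] == "I" and t - nodes[u]["t_inf"] == tau
                and random.random() < alpha):
            nodes[u]["detected"] = True
    # 4. enact the planned quarantines; expire those theta periods old
    for u in planned:
        nodes[u]["q"], nodes[u]["t_q"] = True, t
    for u in nodes:
        if nodes[u]["q"] and t - nodes[u]["t_q"] >= theta:
            nodes[u]["q"] = False
```

note that with τ = 0 a node infected in part 2 can already be detected in part 3 of the same period, matching the convention in the text.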
for instance, if τ = 0, this gives a node that becomes infected (and that was not already under quarantine for other reasons) one opportunity to infect others. this process reflects that neither detection nor quarantining of individuals (or jurisdictions) happens instantaneously. in addition, we stipulate that the seed node i_0 is not counted in the quarantine testing and calculations. this is meant to reflect that it may be unclear whether the disease is spreading or not. nodes that are detected are marked as such until recovery. a random node i_0 is selected and the epidemic begins there. we study the epidemic curve, the total number of node-days of infection, and the number of node-days of quarantine for a variety of containment strategies. we examine a number of scenarios using the (k, x) policy model outlined in theorems 1-3. in the case that the quarantine fails but there are infections outside of the quarantine radius, the policymaker deals with them individually: each detected case outside of the initial quarantine is treated as a new seed, and all nodes within the same radius as the initial quarantine are immediately quarantined. we begin by using a simple objective function to find the optimal threshold for triggering the initial quarantine. we minimize a linear combination of the number of infected person-periods and quarantined person-periods. for all linear combinations where some weight is given to both terms, the optimal threshold is x = 1. the logic is as follows: if the initial quarantine is successful, the number of quarantined person-periods is fixed, and is also the minimum possible number of quarantined person-periods. therefore, the problem reduces to minimizing the number of infections, which is done by setting x = 1. we study three versions of a (k, x) policy. first, we simulate (k, x) = (3, 1) with no detection delay. then, we incorporate a detection delay of τ = 3, still using a policy of (k, x) = (3, 1) with no buffer.
lastly, we study a (3, 1) policy with no buffer and enforcement failures. in this case, a fraction ε = 0.05 of nodes do not ever quarantine. a global quarantine policy imagines the state as an actor which quarantines every node for θ periods when more than x = 1 infections are detected globally. we study this in the case with a detection delay, to compare to the (k, x), regional, jurisdiction-based, and proactive policies. for both the myopic-internal and proactive policies, we take each location as a single jurisdiction. myopic internal quarantine policies. jurisdictions respond only to detections within their own borders, setting x independently of one another. in addition, jurisdictions act independently: they do not take detected infections outside of their borders into account. we set x = 1 for all jurisdictions, the most conservative possible threshold, unless otherwise specified. proactive quarantine policies. we examine a more sophisticated approach to deciding when to quarantine. with this policy, each jurisdiction decides to quarantine based not only on detections within its own borders, but within neighboring jurisdictions as well. in each period, each jurisdiction ℓ calculates its expected detected infections w as follows: w_{ℓ,t} = max{w_{ℓ,t−1} + y_{ℓ,t} − r_{ℓ,t}, z_{ℓ,t}}. we use y_{ℓ,t} to denote the number of expected new infections in region ℓ at time t, and r_{ℓ,t} to denote the number of expected recoveries in ℓ at t. each jurisdiction calculates y_{ℓ,t} by summing, over the regions ℓ' not quarantined at t − 1, terms of the form m q_{ℓ',ℓ} w_{ℓ',t−1}; the summation includes the term for spread from ℓ to ℓ, i.e., still within ℓ. if ℓ is quarantined at time t, then y_{ℓ,t} = 0. the expected recovery r_{ℓ,t} at each period is calculated analogously. finally, we set any w_{ℓ,t} < 0.01 to zero, to avoid implementation issues with floating-point calculations. setting a lower truncation value would improve the performance of the proactive jurisdiction policies, as they would be more sensitive to detected cases in other jurisdictions.
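the bookkeeping above can be sketched as follows. here q is the cross-location connection-probability matrix and z holds actual detections; the recovery term is left at zero because its formula is not reproduced in the text, and the orientation of the q indexing is our assumption.

```python
def update_w(w_prev, z, q, m, quarantined):
    """w_{l,t} = max(w_{l,t-1} + y_{l,t} - r_{l,t}, z_{l,t}), truncated at 0.01."""
    L = len(w_prev)
    w = [0.0] * L
    for l in range(L):
        if quarantined[l]:
            y = 0.0  # a quarantined jurisdiction expects no new infections
        else:
            # expected new infections seeded from non-quarantined locations;
            # the lp == l term covers spread from l to l itself
            y = sum(m * q[lp][l] * w_prev[lp]
                    for lp in range(L) if not quarantined[lp])
        r = 0.0  # expected recoveries (formula not reproduced here)
        w[l] = max(w_prev[l] + y - r, z[l])
        if w[l] < 0.01:
            w[l] = 0.0  # truncate to avoid floating-point noise
    return w
```

a jurisdiction with no detections of its own (z = 0) can still accumulate expected infections from a neighbor's w, which is what lets proactive jurisdictions lock down ahead of their first detection.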
uniform and lax policies. we run two simulation variants for both the proactive and internally-based policies: one in which all jurisdictions are as conservative as possible, setting x = 1, and a second in which four regions set a higher threshold of x = 5; in the proactive case, these lax regions also act myopically, following the internal jurisdiction-based policy. we choose x = 5 to simulate lax thresholds. in the united states, new york state issued a stay-at-home order when 0.07% of the state population was infected, which, scaled to our populations of 3500, is equivalent to a threshold of 2.73 [62, 63]. when scaled to match our population of 3500, florida began re-opening with a threshold of 6.15, and some countries never locked down [64, 63, 65]. in our stylized model, quarantines are more aggressive, as they cut contact completely. we include the results of the simulations detailed in the main text in the tables below. in addition, we run simulations with two sets of varied parameters: first, we take α = 0.05; second, we take θ = 8 and τ = 5. within the united states, estimates for the detection rate range from 5% to 15%, and in countries with less developed testing infrastructure the detection rate is undoubtedly lower [37]. because disease parameters are estimated, we use a different estimate of the disease lifespan of covid-19 [61]. for all simulations, we fix r_0 = 3.5.
references:
[42] penrose, m. random geometric graphs, vol. 5 (oxford university press, 2003).
cholera, quarantine and the english preventive system
the concept of quarantine in history: from plague to sars
lessons from the history of quarantine, from plague to influenza a.
emerging infectious diseases
diffusion and contagion in networks with heterogeneous agents and homophily
interdependence and the cost of uncoordinated responses to covid-19
variations in governmental responses to and the diffusion of covid-19: the role of political decentralization
transmission of influenza: implications for control in health care settings
contributions from the silent majority dominate dengue virus transmission
presumed asymptomatic carrier transmission of covid-19
characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention
failing the test: waiting times for covid diagnostic tests across the u.s.
the state of the nation: a 50-state covid-19 survey
covid-19: the law and limits of quarantine
can we contain the covid-19 outbreak with the same measures as for sars?
the role of sexual partnership networks in the epidemiology of gonorrhea
gonorrhoea and chlamydia core groups and sexual networks in manitoba
containing bioterrorist smallpox
statistical properties of community structure in large social and information networks
diffusion of microfinance
a network formation model based on subgraphs
classes of small-world networks
the average distances in random graphs with given expected degrees
using aggregated relational data to feasibly identify network structure without network data
on random graphs
collective dynamics of small-world networks
how many people do you know?: efficiently estimating personal network size
changes in social network structure in response to exposure to formal credit markets
can network theory-based targeting increase technology adoption?
messages on covid-19 prevention in india increased symptoms reporting and adherence to preventive behaviors among 25 million recipients with similar effects on non-recipient members of their communities
reconstruction of the full transmission dynamics of covid-19 in wuhan
suppression of covid-19 outbreak in the municipality of vo
variation in false-negative rate of reverse transcriptase polymerase chain reaction-based sars-cov-2 tests by time since exposure
the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
estimating the fraction of unreported infections in epidemics with a known epicenter: an application to covid-19
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2)
evidence supporting transmission of severe acute respiratory syndrome coronavirus 2 while presymptomatic or asymptomatic
continuing temporary suspension and modification of laws relating to the disaster emergency
the covid tracking project. totals by state
phase 1: safe. smart. step-by-step. plan for florida's recovery
emergency information from swedish authorities. restrictions and prohibitions
key: cord-276178-0hrs1w7r authors: bangotra, deep kumar; singh, yashwant; selwal, arvind; kumar, nagesh; singh, pradeep kumar; hong, wei-chiang title: an intelligent opportunistic routing algorithm for wireless sensor networks and its application towards e-healthcare date: 2020-07-13 journal: sensors (basel) doi: 10.3390/s20143887 sha: doc_id: 276178 cord_uid: 0hrs1w7r
the lifetime of a node in wireless sensor networks (wsn) is directly responsible for the longevity of the wireless network. the routing of packets is the most energy-consuming activity for a sensor node. thus, finding an energy-efficient routing strategy for the transmission of packets becomes of utmost importance.
the opportunistic routing (or) protocol is one of the newer routing protocols that promise reliability and energy efficiency during the transmission of packets in wireless sensor networks (wsn). in this paper, we propose an intelligent opportunistic routing protocol (iop) using a machine learning technique to select a relay node from the list of potential forwarder nodes, to achieve energy efficiency and reliability in the network. the proposed approach might have applications including e-healthcare services: it might achieve reliability in the network because it can connect several healthcare network devices in a better way, so that good healthcare services might be offered. in addition to this, the proposed method saves energy and therefore helps the remote patient connect with healthcare services for a longer duration with the integration of iot services. figure: sensor node architecture with application in e-healthcare. with the ever-increasing use of the term green computing, the energy efficiency of wsn has seen a considerable rise. recently, an approach for green computing towards iot for energy efficiency has been proposed, which enhances the energy efficiency of wsn [4]. different types of methods and techniques have been proposed and developed in the past to address the issue of energy optimization in wsn. another approach, which regulates the challenge of energy optimization in sensor-enabled iot with the use of quantum-based green computing, makes routing efficient and reliable [5]. the problem of energy efficiency during the routing of data packets from source to target in iot-oriented wsn is significantly addressed by another network-based routing protocol known as greedi [6]. it is imperative to mention here that iot is composed of energy-hungry sensor devices. the constraint of energy in sensor nodes has affected the transmission of data from one node to another and therefore requires boundless methods, policies, and strategies to overcome this challenge [7].
the constraint of energy in sensor nodes has affected the transmission of data from one node to another and therefore, requires boundless methods, policies, and strategies to overcome this challenge [7] . with the ever-increasing use of term green computing, the energy efficiency of wsn has seen a considerable rise. recently, an approach for green computing towards iot for energy efficiency has been proposed, which enhances the energy efficiency of wsn [4] . different types of methods and techniques were proposed and developed in the past to address the issue of energy optimization in wsn. another approach that regulates the challenge of energy optimization in sensor-enabled iot with the use of quantum-based green computing, makes routing efficient and reliable [5] . the problem of energy efficiency during the routing of data packets from source to target in case of iot-oriented wsn is significantly addressed by another network-based routing protocol known as greedi [6] . it is imperative to mention here that iot is composed of energy-hungry sensor devices. the constraint of energy in sensor nodes has affected the transmission of data from one node to another and therefore, requires boundless methods, policies, and strategies to overcome this challenge [7] . the focus of this paper was to put forward an intelligent opportunistic routing protocol so that the consumption of resources particularly during communication could be optimized, because the sensors 2020, 20, 3887 3 of 21 alleyway taken to transmit a data packet from a source node to the target node is determined by the routing protocol. routing is a complex task in wsn because it is different from designing a routing protocol in traditional networks. in wsn, the important concern is to create an energy-efficient routing strategy to route packet from source to destination, because the nodes in the wsn are always energy-constrained. 
the problem of energy consumption while routing is managed with the use of a special type of routing protocol known as opportunistic routing. opportunistic routing (or), also known as any-path routing, has gained huge importance in recent years of research in wsn [8]. this protocol exploits a basic feature of wireless networks, i.e., the broadcast transmission of data. earlier routing strategies consider this broadcasting property a disadvantage, as it induces interference. the focal notion behind or is to take advantage of the broadcast behavior of wireless networks, such that a broadcast from one node can be heard by numerous nodes. rather than selecting the next forwarder node in advance, or chooses the next forwarder node dynamically at the time of data transmission. it has been shown that or gives better performance results than traditional routing. in or, the best opportunities are sought to transmit the data packets from source to destination [9]. the hop-by-hop communication pattern is used in or even when there is no linked source-to-destination route. the or protocols proposed in recent times by different researchers still struggle with concerns pertaining to energy efficiency and the reliable delivery of data packets. the or protocol proposed in this paper is specifically meant for wsn, taking into account the problems that surface during the selection of relay candidates and the execution of the coordination protocol. the proposed protocol intelligently selects the relay candidates from the forwarder list by using a machine learning technique, to achieve energy efficiency. potential relay node selection is a multi-class, multi-feature probabilistic problem, where the selection of a relay node depends upon each node's characteristics. the selection of a node with various characteristics is a supervised, multi-class, non-linearly separable problem.
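the broadcast-then-coordinate idea can be sketched in a few lines. this is a generic illustration of one opportunistic hop, not the paper's iop protocol: the sender broadcasts, each candidate in a prioritized forwarder list hears the packet independently, and the highest-priority receiver takes over as relay.

```python
import random

def opportunistic_hop(sender, forwarder_list, delivery_prob):
    """one or transmission: broadcast, then let the highest-priority
    candidate that actually received the packet become the next relay."""
    receivers = [n for n in forwarder_list
                 if random.random() < delivery_prob[(sender, n)]]
    for candidate in forwarder_list:   # list is ordered by priority
        if candidate in receivers:
            return candidate           # coordination: best receiver wins
    return None                        # nobody heard; sender retransmits
```

contrast with traditional routing, where the next hop is fixed before transmission and a single lossy link forces a retransmission even if other neighbors heard the packet.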
in this paper, the relay node selection algorithm is given using a naïve bayes machine learning model. the organization of this paper is as follows. section 2 presents the related work in the literature regarding or and its protocols. the various types of routing protocols are given in section 3. section 4 describes or with examples, followed by the proposed intelligent or algorithm for forwarder node selection in section 5. section 6 depicts the simulation results of the proposed protocol, showing latency, network lifetime, throughput, and energy efficiency. section 7 presents a proposed framework for integrating iot with wsn for e-healthcare; this architecture can be useful in many e-healthcare applications. section 8 presents the conclusion and future work. achieving reliable delivery of data and energy efficiency are two crucial tasks in wsns. as the sensor nodes are mostly deployed in an unattended environment and the likelihood of any node going out of order is high, the maintenance and management of topology is a rigorous task. therefore, the routing protocol should accommodate the dynamic nature of wsns. opportunistic routing protocols developed in recent years provide trustworthy data delivery, but they are still deficient in providing energy-efficient data transmission between sensor nodes. some of the latest research on or experimented with previously suggested routing metrics and concentrated on mutual cooperation among nodes. geraf [10] (geographic random forwarding) described a novel forwarding technique based on the geographical location of the nodes involved and a random selection of the relaying node via contention among receivers. exor (extremely opportunistic routing) [11] is an integrated routing and mac protocol for multi-hop wireless networks, in which the best of multiple receivers forwards each packet. this protocol is based on the expected transmission count (etx) metric.
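the paper's exact feature set is not reproduced here; the sketch below assumes illustrative features (residual energy, link quality, distance to sink) and trains a minimal gaussian naïve bayes from scratch to score candidate relays, which is the flavor of model the relay selection relies on.

```python
import math
from collections import defaultdict

class GaussianNB:
    """minimal gaussian naïve bayes: per-class feature means and variances."""
    def fit(self, X, y):
        rows_by_cls = defaultdict(list)
        for xi, yi in zip(X, y):
            rows_by_cls[yi].append(xi)
        self.stats, self.prior = {}, {}
        for c, rows in rows_by_cls.items():
            self.stats[c] = []
            for col in zip(*rows):
                mu = sum(col) / len(col)
                var = max(1e-6, sum((v - mu) ** 2 for v in col) / len(col))
                self.stats[c].append((mu, var))
            self.prior[c] = len(rows) / len(y)
        return self

    def log_posterior(self, x, c):
        lp = math.log(self.prior[c])
        for v, (mu, var) in zip(x, self.stats[c]):
            lp += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        return lp

    def prob(self, x, c):
        """posterior P(c | x), normalized over all classes."""
        ws = {k: math.exp(self.log_posterior(x, k)) for k in self.prior}
        return ws[c] / sum(ws.values())

# toy training data: (residual energy, link quality, distance to sink);
# the feature choice is an assumption for illustration
X = [(0.9, 0.8, 0.2), (0.8, 0.9, 0.3), (0.2, 0.3, 0.9), (0.3, 0.2, 0.8)]
y = ["good", "good", "bad", "bad"]
nb = GaussianNB().fit(X, y)
candidates = {"n1": (0.85, 0.8, 0.25), "n2": (0.25, 0.3, 0.85)}
relay = max(candidates, key=lambda c: nb.prob(candidates[c], "good"))
print(relay)
```

the relay is simply the candidate whose features score the highest posterior probability of being a good forwarder, which is what makes the selection both probabilistic and multi-feature.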
the etx was measured by hop count from the source to the destination, and the data packet traveled through the minimum number of hops. exor achieves higher throughput than traditional routing algorithms (sensors 2020, 20, 3887), but it still has a few limitations. exor contemplates only the information accessible at the time of transmission, and any unfitting information caused by recent updates can worsen its performance and lead to packet duplication. besides this, exor has another limitation: it always seeks coordination among nodes, which causes overhead in large networks. minimum transmission scheme-optimal forwarder list selection in opportunistic routing [12] (mts) is another routing protocol, which uses the mts metric instead of the etx used in exor. the mts-based algorithm gives fewer transmissions compared to the etx-based exor. a simple, practical, and effective opportunistic routing scheme for short-haul multi-hop wireless networks was presented in [13] ; it decreased the packet duplication rate, and being a simple algorithm, it can be combined with other opportunistic routing algorithms. spectrum aware opportunistic routing [14] (saor) is another routing protocol, for cognitive radio networks. it uses optimal link transmission (olt) as a cost metric for positioning the nodes in the forwarder list. saor gives better qos, reduced end-to-end delay, and improved throughput. energy-efficient opportunistic routing [15] (eeor) calculates the cost for each node to transfer the data packets. eeor takes less time than exor for sending and receiving the data packets. the trusted opportunistic routing algorithm for vanet [16] (tmcor) gives a trust mechanism for opportunistic routing. it also defines the trade-off between the cost metric and the safety factor. 
a novel socially aware opportunistic routing algorithm in mobile social networks [17] considered three parameters, namely social profile matching, social connectivity matching, and social interaction. it gives a high probability of packet delivery and routing efficiency. ensor, an opportunistic routing algorithm for relay node selection in wsns, implements the concept of an energy-efficient node [18] . the packet delivery rate of ensor is better than that of geraf. economy, a duplicate-free protocol [19] , is the only or protocol that uses token-based coordination; this algorithm ensures the absence of duplicate packet transmissions. with the advent of the latest network technologies, the virtualization of networks along with their related resources has made networks more reliable and efficient. virtual network functions are used to solve the problems related to service function chains in cloud-fog computing [20] . further, iot works with multiple network domains, and the possibility of compromising the security and confidentiality of data is always present. therefore, the use of virtual networks for service function chains in cloud-fog computing under multiple network domains leads to saving network resources [21] . in recent times, the cloud of things (cot) has gained immense popularity, due to its ability to offer an enormous amount of resources to wireless networks and heterogeneous mobile edge computing systems. the cot makes opportunistic decisions during the online processing of tasks for load sharing, and makes the overall network reliable and efficient [22] . the cloud of things framework can significantly narrow communication gaps between cloud resources and other mobile devices. in that work, the authors proposed a methodology for offloading computation in mobile devices, which reduces failure rates by improving the control policy. 
in recent times, wsn has used virtualization techniques to offer energy-efficient and fault-tolerant data communication to the immensely growing service domain of iot [23] . with the application of wsn in e-healthcare, the wireless body area network (wban) gained a huge response in the healthcare domain. the wban is used to monitor patient data through body sensors, and transmits the acquired data, based on the severity of the patients' symptoms, by allocating a channel with or without contention [24] . eeor [15] is an energy-efficient protocol that uses transmission power as its major parameter. this protocol discussed two cases involving constant and dynamic power consumption models, known as the non-adjustable and adjustable power models. in the first model, the algorithm calculated the expected cost at each node and built the forwarder list on the source node based on this cost. the forwarder list was sorted in increasing order of expected cost, and the first node on the list became the next-hop forwarder. as eeor is an opportunistic routing protocol, broadcasting is utilized and the transmitted packets might be received by every node on the forwarder list. the authors propose algorithms for fixed-power calculation, adjustable power calculation, and opportunistic power calculation. the algorithm was compared with exor [11] by simulation in the tossim simulator. the results showed that eeor always calculated the end-to-end cost based on the links from source to destination. eeor followed distance-vector routing for storing the routing information inside each sensor node. the expected energy consumption cost was updated inside each node after each round of packet transmission. data delivery was guaranteed in this protocol and, according to the simulation results, packet duplication was significantly decreased. 
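the eeor forwarder-list construction described above (sort neighbors by increasing expected cost; the first entry becomes the next-hop forwarder) can be sketched as follows. the function name and the cost values are illustrative assumptions, not the authors' implementation.

```python
# Sketch of EEOR-style forwarder-list construction (names and costs are
# illustrative assumptions, not the authors' code).

def build_forwarder_list(expected_cost):
    """Sort neighbor ids by expected end-to-end cost, ascending.

    expected_cost: dict mapping neighbor id -> expected cost to destination.
    Returns the forwarder list; its first entry is the next-hop forwarder.
    """
    return sorted(expected_cost, key=expected_cost.get)

# Example: 'n3' has the lowest expected cost, so it is tried first.
costs = {"n1": 4.2, "n2": 3.1, "n3": 2.5}
fl = build_forwarder_list(costs)   # ['n3', 'n2', 'n1']
next_hop = fl[0]                   # 'n3'
```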
the mdor [25, 26] protocol worked on the distance between the source and relay nodes. the authors proposed an algorithm that calculated the distance from the source node to each neighbor and found the average-distance node, which the source then used as the next-hop forwarder. the authors also stated that, to increase the speed and reliability of transmission, signal strength is very important; the signal power depends on the distance between sender and receiver. if a node sends a packet to the nearest node, the route might take more hops, and this would decrease the lifetime of the network. another problem addressed in this protocol was reducing energy consumption at each node through a dynamic energy consumption model. this model consumed energy according to the packet size and transmitted the packet by amplifying it according to the distance between the source and the relay nodes. mdor always chose the middle-position node to optimize the energy consumed in amplifying the packets. the mdor simulation results showed that energy consumption was optimized and that the protocol is suitable for certain wsn applications like environment monitoring, forest fire detection, etc. opportunistic routing introduced the concept of reducing the number of retransmissions to save energy and of taking advantage of the broadcasting nature of wireless networks. with broadcasting, the routing protocol can discover as many paths in the network as possible, and data transmission can take place on any of these paths. if a particular path fails, the transmission can be completed over some other path, using the forwarder list whose nodes hold the same data packet. the protocols responsible for data transmission in wsn are broadly ordered into two sets [2] , namely, (i) traditional (old-fashioned) routing, and (ii) opportunistic routing. 
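the mdor relay choice described above (pick the neighbor whose distance is nearest the average neighbor distance, i.e., the middle-position node) can be sketched as follows; the function name and distances are illustrative assumptions.

```python
# Sketch of MDOR's middle-position relay choice, as described in the text
# (names and numeric distances are illustrative assumptions).

def mdor_relay(distances):
    """distances: dict neighbor id -> distance from the source (metres)."""
    avg = sum(distances.values()) / len(distances)
    # Neighbor whose distance deviates least from the average distance.
    return min(distances, key=lambda n: abs(distances[n] - avg))

d = {"n1": 10.0, "n2": 55.0, "n3": 90.0}   # average distance ~51.7 m
relay = mdor_relay(d)                       # 'n2' is nearest the average
```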
in traditional routing, also known as old-fashioned routing, the focus was on finding the route with a minimum number of intermediate nodes from the source to the destination, without taking into consideration important factors like throughput, quality of links, reliability, etc. a small comparison [27] of the routing categories is shown in table 1 . as is clear from the literature, the energy consumption of a sensor node has a considerable impact on the lifetime and quality of the wireless sensor network; therefore, it becomes vital to design energy-efficient opportunistic routing protocols to maximize the overall lifetime of the network and to enhance the quality of the sensor network. a few methods in the literature that might be useful to save the life of the sensor network are listed below:
• scheduling of the duty cycle
• energy-efficient medium access control (ee-mac)
• energy-efficient routing
• node replacement (not possible in unattended environments)
of the above-mentioned methods for energy saving, energy-efficient routing is the most central for the vitality of the wsn, since the transmission of signals, i.e., receiving and sending, takes about 66.66 percent of the total energy of the network [28] . therefore, an opportunistic routing protocol that enhances the vitality of the sensor network should be designed to enhance its overall life span. or broadcasts a data packet to a set of relay candidates, and the broadcast is overheard by the neighboring nodes, whereas in traditional routing a node is (pre-)selected for each transmission. then, the relay candidates that are part of the forwarder list and have successfully acknowledged the data packet run a protocol called the coordination protocol between themselves, to choose the best relay candidate to forward the data packet. 
in other words, or abstractly comprises these three steps:
step 1: broadcast a data packet to the relay candidates (this prepares the forwarder list).
step 2: select the best relay by using a coordination protocol among the nodes in the forwarder list.
step 3: forward the data packet to the selected relay node.
consider the example shown in figure 2 , where the source node s sends a packet to the destination node d, through nodes r1, r2, r3, r4, and r5. first, s broadcasts a packet. the relay nodes r1, r2, and r3 might become the forwarder nodes. further, if r2 is chosen as a potential forwarder, then r4 and r5 might become relay nodes. similarly, if r5 is the forwarder node, then it forwards the data packets to the destination node d. 
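the three steps above can be sketched as a hop-by-hop loop. the link table mirrors the figure-2 example, and the coordination rule is an illustrative stand-in (real coordination protocols are far more involved).

```python
# Minimal sketch of the three OR steps on the figure-2 topology; the links
# and the coordination rule are illustrative assumptions.

links = {               # who can overhear whose broadcast
    "s":  ["r1", "r2", "r3"],
    "r2": ["r4", "r5"],
    "r5": ["d"],
}

def coordinate(candidates, dest):
    # Step 2 stand-in: prefer the destination itself, otherwise the first
    # candidate that can still forward onward (illustrative rule only).
    if dest in candidates:
        return dest
    return next(c for c in candidates if c in links)

def route(src, dest):
    path, node = [src], src
    while node != dest:
        candidates = links[node]              # step 1: broadcast -> forwarder list
        relay = coordinate(candidates, dest)  # step 2: coordination protocol
        path.append(relay)                    # step 3: forward to chosen relay
        node = relay
    return path

route("s", "d")   # ['s', 'r2', 'r5', 'd']
```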
opportunistic routing derives the following rewards:
• the escalation in reliability. by using this routing strategy, the reliability of wsn increases significantly, as the protocol transmits the data packet through any possible link rather than a pre-decided link. this routing protocol therefore provides additional links that can act as backup links, reducing the chances of transmission failure.
• the escalation in transmission range. with this routing protocol, the broadcast nature of the wireless medium provides an upsurge in the transmission range, as data packets are received over all links irrespective of their location and quality. hence, the data transmission can reach the farthest relay node successfully.
in wsn, the sensor nodes can be deployed in two ways, randomly or manually. 
most applications require random deployment of nodes in the area under consideration. initially, each node is loaded with the same amount of battery power; as soon as the network starts functioning, the nodes start consuming energy. to make the network energy-efficient, the protocol used for transmitting data packets must consume less battery power, and a network model and energy consumption model should be formulated. in the upcoming subsection, these two models are discussed and stated as the assumptions underlying the smooth working of the protocol. the n sensors are distributed in a square area of size 500 * 500 square meters. this network forms a graph g = (n, m), with the following properties:
• n = {n_1, n_2, . . . , n_n} is the set of vertices representing sensor nodes.
• m is the set of edges representing the node-to-node links.
the neighbor list nbl(n_i) consists of the nodes that have a direct link to n_i. the data traffic is assumed to travel from the sensor nodes toward the base station. if a packet delivery is successful, the acknowledgment (ack) for it is assumed to travel the same path back to the source. the lifespan of a wsn depends on the endurance of each node while performing network operations, and the sensor nodes rely on battery life to perform them. the energy cost model considered here is the first-order energy model for wsn [25] . the terms used in equations (1)-(3) are defined in table 2 ; the equations give the combined vitality cost of the radio board of a sensor for communicating data packets, namely the energy consumed in the transmission of an n-bit packet up to a distance l and the energy consumed per n-bit packet. (in the assumed operating mode, the sensor board and radio board are in full operation, while the cpu board sleeps and wakes up only for creating messages.) the proposed protocol uses these assumptions as preliminaries. 
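the first-order radio model referred to above is commonly written as e_tx(n, l) = n·e_elec + n·ε_amp·l² and e_rx(n) = n·e_elec. a sketch follows; the numeric constants are the textbook defaults (50 nj/bit electronics, 100 pj/bit/m² amplifier) and are assumptions here, since the section does not list values.

```python
# First-order radio energy model sketch; constants are assumed textbook
# defaults, not taken from table 2 of the paper.

E_ELEC = 50e-9      # J per bit, transmitter/receiver electronics
EPS_AMP = 100e-12   # J per bit per m^2, transmit amplifier

def e_tx(n_bits, l_m):
    """Energy to transmit an n-bit packet over distance l (metres)."""
    return n_bits * E_ELEC + n_bits * EPS_AMP * l_m ** 2

def e_rx(n_bits):
    """Energy to receive an n-bit packet."""
    return n_bits * E_ELEC

# A 4000-bit packet over 100 m costs the sender about 4.2 mJ in total;
# the receiver spends only the 0.2 mJ electronics cost.
tx_cost = e_tx(4000, 100)
rx_cost = e_rx(4000)
```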
a new algorithm is proposed in the next section to address the energy efficiency and reliability of opportunistic routing in wsn. let there be n nodes in the wsn, where each node has k neighbors, i.e., n_1, n_2, . . . , n_k, and each neighbor node is represented by attributes x_1, x_2, . . . , x_n. the number of neighbors (k) might vary between nodes at a particular instant. additionally, it is assumed that the wireless sensor network is spread over an area of 500 × 500 square meters. let us assume that a node a ∈ n has neighbors na_1, na_2, . . . , na_k, with features such as node id, location, prr (packet reception ratio), residual energy (re), and distance (d), represented by x_1, x_2, . . . , x_n, respectively. the goal is to intelligently find a potential relay node for a, say ar, such that ar ∈ {na_1, na_2, . . . , na_k}. in the proposed machine-learning-based protocol, the packet reception ratio, distance, and residual energy of a node are taken into consideration for the selection of the potential forwarder. the packet reception ratio (prr) [29] is also sometimes referred to as the psr (packet success ratio); it is computed as the ratio of successfully received packets to sent packets. a similar metric is the per (packet error ratio), computed as (1 - prr). a node loses a certain amount of energy during the transmission and reception of packets; accordingly, its residual energy decreases [30] . the distance (d) is the distance between the source node and each sensor node in the forwarder list. the potential relay node selection is a multi-class, multi-feature probabilistic problem, where the selection of the relay node depends on each node's features. 
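the prr, per, and distance features can be computed directly from the definitions above; a minimal sketch (the function names are illustrative):

```python
import math

# Feature computation sketch: prr = received/sent and per = 1 - prr follow
# the text; the distance helper is an illustrative Euclidean assumption.

def prr(received, sent):
    """Packet reception ratio (a.k.a. packet success ratio)."""
    return received / sent

def per(received, sent):
    """Packet error ratio, the complement of prr."""
    return 1.0 - prr(received, sent)

def distance(p, q):
    """Euclidean distance between two node coordinates (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

prr(45, 50)               # 0.9
per(45, 50)               # ~0.1 (floating point)
distance((0, 0), (3, 4))  # 5.0
```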
the underlying probabilistic relay node selection problem can be addressed intelligently by building a machine learning model. the selection of a node with n characteristics for a given node a can be considered a supervised multiclass non-linearly separable problem. in this algorithm, the naïve bayes classifier is used to find the probability of node a reaching one of its neighbors, i.e., {n_1, n_2, . . . , n_k}. we compute the probability p(n_1, n_2, . . . , n_k | a), and the node with maximum probability is selected. the probability p of selecting an individual relay node of the given node a can be computed for each node separately, as shown in equation (4), where p(na_k | a) denotes the probability of selecting node na_k given node a. furthermore, the probability of going from node a to na_1 is computed from the characteristics x_1, x_2, . . . , x_n that represent na_1; that is, we find the probability of selecting the relay node na_1 given feature x_1, na_1 given feature x_2, na_1 given feature x_3, and so on. the individual probability of relay node selection given the node characteristics can be computed using the naïve bayes conditional probability shown in equation (5), where i = 1, 2, 3, . . . , n, p(x_i | a) is called the likelihood, p(a) is the prior probability of the event, and p(x_i) is the prior probability of the consequence. the underlying problem is to find the relay node of a that has the maximum probability, as shown in equation (6). tables 3a-x represent the neighbor sets {na_1, na_2, . . . , na_k} along with their feature attributes {x_1, x_2, x_3, . . . , x_n} of node a. the working of iop comprises two phases, i.e., phase i (forwarder_set_selection) and phase ii (forwarder_node_selection). in phase i, the authors use algorithm 1 for the forwarder set selection. 
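one common reading of equations (4)-(6) is a naive bayes score per candidate, the prior times the product of per-feature likelihoods, followed by an argmax over the candidates. the sketch below uses that reading with illustrative numbers, not values from table 3.

```python
# Hedged sketch of the naive Bayes scoring in equations (4)-(6): each relay
# candidate r is scored by prior(r) * product of p(x_i | r), and the
# candidate with the maximum score is selected. Numbers are illustrative.

def nb_score(prior, likelihoods):
    """prior: p(r); likelihoods: [p(x_1|r), p(x_2|r), ...]."""
    score = prior
    for p in likelihoods:
        score *= p
    return score

def select_relay(candidates):
    """candidates: dict r -> (prior, [likelihoods]); returns argmax score."""
    return max(candidates, key=lambda r: nb_score(*candidates[r]))

cands = {
    "r1": (1 / 3, [0.5, 0.4, 0.3]),
    "r2": (1 / 3, [0.8, 0.7, 0.6]),
    "r3": (1 / 3, [0.4, 0.5, 0.2]),
}
select_relay(cands)   # 'r2' has the largest posterior score
```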
in this step, the information collection task is initiated after the nodes are randomly deployed in the area of interest with specific dimensions. the phase starts with the broadcast of a "hello packet", which contains the address and the location of the sending node. any node that receives this packet sends an acknowledgment to the source and is added to the neighbor list. this process is repeated, but no more than a threshold number of times, to calculate the prr of each node, and the neighbor list is formed through this repeated procedure. from the neighbor list and the prr values, the forwarder set is extracted. the prerequisite for the second phase is the output of the first phase: the forwarder set generated by algorithm 1 is the set of all nodes that have the potential to forward the data packets. however, not all nodes in the set can be picked for transmission, as this would lead to duplication of packets in the network. to tackle this, only one node from the forwarder list should be selected to transmit the packet to the next hop toward the destination. this is accomplished using algorithm 2, which takes a forwarder node list as input and selects a single node as the forwarder. algorithm 2 uses a machine-learning technique, the naïve bayes classifier, to select the forwarder node intelligently. the proposed method of relay node selection using iop can be understood by considering the example wsn shown in figure 2 and applying the naïve bayes algorithm to the generic data available in table 4 , to find the optimal path, in terms of energy efficiency and reliability, from source node s to destination node d. using the proposed naïve bayes classifier method, the probability of selecting relay node r1, r2, or r3 from source node s is denoted by p(r1, r2, r3 | s), which can be calculated using equation (7). 
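phase i, as described above, can be sketched as follows. the hello-round threshold and the prr cutoff are assumptions, since the text does not spell out the extraction criterion for the forwarder set.

```python
# Sketch of phase I (algorithm 1): repeat a hello broadcast up to a
# threshold, estimate each neighbor's prr from the acks received, and keep
# nodes whose prr clears a cutoff. Both constants below are assumptions.

HELLO_ROUNDS = 10   # threshold on repeated hello broadcasts (assumed)
PRR_CUTOFF = 0.5    # minimum prr to enter the forwarder set (assumed)

def forwarder_set(ack_counts, rounds=HELLO_ROUNDS, cutoff=PRR_CUTOFF):
    """ack_counts: dict neighbor id -> number of hello acks received."""
    prr = {n: acks / rounds for n, acks in ack_counts.items()}
    return {n for n, p in prr.items() if p >= cutoff}

acks = {"r1": 8, "r2": 9, "r3": 6, "r4": 2}
forwarder_set(acks)    # {'r1', 'r2', 'r3'}; r4's link is too lossy
```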
the intermediate probabilities are obtained by substituting the entries of table 4 into the above equations (the numeric substitutions, equations (8)-(25), are omitted here). the forwarder node selection of algorithm 2 then proceeds as follows:
1. declare three float variables x_1, x_2, and x_3 to represent the properties of r_i, i.e., prr (packet reception ratio), re (residual energy), and d (distance), respectively.
2. for each node r_i ∈ fl(s), compute p(r_i | s), the probability of selecting r_i given s, i.e., p_k = p(r_i | s) for i = 1, 2, . . . , n, with k←i.
3. compute p(r_i | s) by computing the probability of each parameter separately, given s.
4. make an unsorted array of the probability values of the n nodes r1, r2, . . . , rn: for i = 1 to n and k = i, arrprob[r_i]←p_k.
5. select the first element of the array as the provisional maximum, pmax←arrprob[0].
6. go through the rest of the elements of the array, from the 2nd element to the last (n − 1), for i = 1 to n − 1, updating pmax←arrprob[i] whenever a larger value is found; when the end of the array is reached, pmax holds the greatest value in the array.
7. the node r_i with the pmax value is selected from the forwarder list as the relay node, being the node with the highest probability. the node with the next-highest probability acts as the backup relay in case the first selected relay node fails to broadcast.
8. broadcast the data packet as {r_i, coordinates, data}.
9. if the destination node d is reached, stop; else, apply algorithm 1 on r_i, set s←r_i, and go to step 2.
output: a potential forwarder node is selected from the list of forwarder nodes.
finally, substituting the values into the above equations, the proposed naïve bayes relay selection method computes the probability p(r1, r2, r3 | s) using equation (26). 
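the max-scan in algorithm 2 amounts to an argmax over the candidate probabilities, with the runner-up kept as the backup relay. a sketch with illustrative probabilities (the function name is an assumption):

```python
# Sketch of phase II (algorithm 2): pick the candidate with the maximum
# probability, keeping the runner-up as a backup relay. Probabilities are
# illustrative, matching the example magnitudes in the text.

def pick_relay(probs):
    """probs: dict candidate id -> p(ri | s). Returns (relay, backup)."""
    ordered = sorted(probs, key=probs.get, reverse=True)
    relay = ordered[0]
    backup = ordered[1] if len(ordered) > 1 else None
    return relay, backup

probs = {"r1": 0.001, "r2": 0.002, "r3": 0.001}
relay, backup = pick_relay(probs)   # relay is 'r2', as in equation (26)
```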
p(r1, r2, r3 | s) = max(p(r1 | s), p(r2 | s), p(r3 | s)) = max(0.001, 0.002, 0.001) = 0.002 (26)
thus, node r2 would be selected as the relay node from the forwarder list {r1, r2, r3} of source node s. the process is then repeated for the neighbors of the selected node, which consequently checks the neighbors of r1, r2, and r3; tables 5-7 describe the features of the neighboring nodes of r1, r2, and r3, respectively (for example, node r5, node_id r20005 at location (49,79), with prr 0.6, residual energy 0.7 j, and distance 11 m, forwarding to d). after the execution of phase i and phase ii on the above example, the final route for the onward transmission of the data packet from source node s to destination node d is intelligently selected using the naïve bayes algorithm, as shown in figure 3 . figure 3 gives the details of the route selected using iop. the source node s broadcasts the data packet among its neighboring nodes, using algorithm 1 to create a forwarder list. the nodes r1, r2, and r3 in the figure are selected as the nodes in the forwarder list; these are the potential nodes used for the selection of a potential forwarder node. here, r2 is selected as the potential node using algorithm 2. the same procedure is adopted repeatedly until the data reaches its final destination. the final route selected intelligently using iop is s→r2→r5→d. with the end goal of the examination and comparison of the proposed or protocol, the simulation was performed in matlab, which provides a good environment to design a network of sensor nodes and to define a sensor node and its characteristics. 
the simulation results were compared with the results of the eeor [25] and the mdor [26] protocols. table 8 below shows the parameter settings of the network. the motes are haphazardly deployed in a 500 × 500 m field, in such a way that they approximately cover the whole application area. the base station is positioned at 250 × 250 m in the field, and the field area is considered a physical-world environment. the proposed or protocol started working immediately after the deployment process was complete. figure 4 below represents the unplanned deployment of the nodes in the area of consideration. 
energy efficiency was the main objective of the proposed algorithm. it can be calculated as the overall energy consumption in the network for the accomplishment of diverse network operations. in matlab, the simulation works in rounds: a simulation round is a packet transmission from a single source to a single destination. when the simulation starts, a random source is chosen to start transmission; this node makes a forwarder list and starts executing the proposed protocol. one round of simulation represents the successful or unsuccessful transmission of packets from one source in the network, and for each round different source and relay nodes are selected. this process continues until at least one node has run out of energy. the energy efficiency was calculated as the total energy consumption after each round in the network. after the network starts operating, the sensors' energy starts decaying. this energy reduction is due to network operations like setting up the network, transmitting, receiving, and acknowledging the data packets, processing data, and sensing data. as the nodes decayed, their energy consumption kept increasing per round, as can be seen in figure 5 below. it can be seen in the figure that the energy consumption of the proposed or protocol was less than that of the other two algorithms. this was because the proposed or protocol distributed energy consumption equally to all nodes, so that every node could survive up to its maximum lifetime. hence, the proposed or protocol was more energy-efficient than mdor and eeor. latency can be measured as the time elapsed between sending a packet and receiving the same at the base station. 
this is also called the end-to-end delay for the packets to reach the destination. communication in wireless sensor networks is always from the source nodes to the sink station. in the random deployment of nodes, some nodes are able to communicate directly with the base station, while others follow multi-hop communication, i.e., source nodes have to go through relay nodes to forward the data packet toward the base station. hence, in some cases the network delay can be very low and in other cases it can be high. in figure 6 , the values of the end-to-end delay after each communication in each round are plotted. it can be seen that the proposed or protocol has a good latency compared to the other two protocols. the throughput of a network can be measured in different ways; here, throughput is calculated as the average number of packets received successfully at the base station per second in each round. figure 7 represents the throughput for each round. the proposed or protocol has good throughput compared to the other two. 
as the proposed or protocol is efficient in energy consumption, the sensor nodes are able to survive and communicate in the network for a long time, and as long as the communication goes on, the base station continues to receive packets. network lifetime for wireless sensor networks depends on the energy consumption in the network: when the energy of the network is at 100 percent, the network lifetime is also at 100 percent, but as the nodes start operating, the network lifespan starts to decrease. figure 8 represents the percentage of lifetime remaining after each round of simulation; the proposed or protocol has a good network lifetime due to its lower energy consumption.
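the throughput and lifetime metrics described above reduce to two per-round calculations; this is a hedged sketch, since the paper does not state its exact formulas:

```python
# Hedged sketches of the two metrics described in the text.
def throughput(packets_received, round_duration_s):
    """Packets successfully received at the base station per second."""
    return packets_received / round_duration_s

def lifetime_percent(current_energy, initial_energy):
    """Remaining network lifetime as a percentage of the initial energy.

    Both arguments are per-node energy lists; 100 percent energy maps
    to 100 percent lifetime, as described above.
    """
    return 100.0 * sum(current_energy) / sum(initial_energy)
```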
packet loss refers to the number of packets that are not received at the destination. to calculate the number of packets lost during each round of the simulation, packet sequence numbers are used: whenever a source sends packets to a destination, it inserts a sequence number, and on packet reception these sequence numbers are checked for continuity. if a sequence number is missing, it is counted as packet loss. packet loss was recorded per round of simulation and is presented in figure 9. the figure shows that packet loss for the proposed protocol is lower than for eeor and mdor. this is because the forwarder node selection algorithm runs on each relay and source node and calculates the probability of successful transmission through a neighbour node, which also increases the reliability of the protocol and provides accurate transmissions.
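the sequence-number continuity check described above can be sketched as follows (a minimal version; the paper does not give its implementation):

```python
# Hedged sketch of the packet-loss count: the source stamps consecutive
# sequence numbers and the receiver counts every missing number as one
# lost packet.
def count_lost_packets(received_seqs):
    """Count gaps in the sequence numbers seen at the destination."""
    lost = 0
    seqs = sorted(received_seqs)
    for prev, cur in zip(seqs, seqs[1:]):
        lost += cur - prev - 1   # sequence numbers skipped between neighbours
    return lost
```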
a significant improvement could be seen in the graphs after the simulation was complete. figure 5 shows the total energy consumption after each round of packet transmission; here, a round was defined as the packet transmissions between a single source and destination. mdor showed the highest energy consumption, followed by eeor and the proposed protocol, because mdor wasted more energy in the initial setup. however, its dynamic energy-consumption considerations allowed the network to survive for a long time, as shown in figure 8. in the case of eeor in figure 5, it consumed less energy in transmission, and its initial setup for opportunistic selection of relay nodes was based on the power level. when it comes to lifetime, however, eeor failed to perform better, as it considered the network to be dead when any one node ran out of energy: eeor chose one node as a source and continued transmissions opportunistically, which resulted in a significant reduction in the power level of a single node. the proposed protocol gave the best results, as in each round the source node used the intelligent model to change the next-hop relay node. figure 6 presents the average end-to-end delay per round generated by the simulation; the proposed protocol performed significantly better, as next-hop selection was based on an intelligent algorithm, which helped to significantly reduce average end-to-end delays. figures 7 and 9 show the reliability and availability performance of all protocols, with the proposed protocol performing significantly better. this suggests that the proposed protocol is a new-generation protocol with potential in many applications of wsn. in recent years, wsn has seen its applications grow exponentially with the integration of iot, giving a new purpose to the overall utility of data acquisition and transmission. with the integration of wsn and iot, the iot is making a big impact in diverse areas of life, e.g., e-healthcare, smart farming, traffic monitoring and regulation, weather forecasting, automobiles, smart cities, etc. all these applications depend heavily on the availability of real-time, accurate data. healthcare with iot is one such area that involves critical decision making [31] [32] [33].
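the "intelligent model" used for next-hop selection above is, per the paper's conclusion, a naïve bayes classifier over residual energy and distance. a hedged sketch of that idea follows; the training data and all parameters are invented for illustration, since the paper does not publish its model:

```python
import math

# Hedged sketch of naive-Bayes forwarder selection. The training set is
# INVENTED: (residual energy in J, distance to sink in m) -> 1 good / 0 poor.
def gaussian(x, mean, var):
    """Gaussian likelihood of x under N(mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit(samples):
    """Per-class feature means/variances and priors from (features, label) pairs."""
    model = {}
    for label in {lab for _, lab in samples}:
        rows = [f for f, lab in samples if lab == label]
        means = [sum(col) / len(col) for col in zip(*rows)]
        varis = [sum((x - m) ** 2 for x in col) / len(col) + 1e-6
                 for col, m in zip(zip(*rows), means)]
        model[label] = (means, varis, len(rows) / len(samples))
    return model

def forward_probability(model, features):
    """Posterior probability that a neighbour is a good forwarder (class 1)."""
    scores = {}
    for label, (means, varis, prior) in model.items():
        like = prior
        for x, m, v in zip(features, means, varis):
            like *= gaussian(x, m, v)
        scores[label] = like
    return scores.get(1, 0.0) / sum(scores.values())

train = [((0.9, 20.0), 1), ((0.8, 30.0), 1), ((0.7, 25.0), 1),
         ((0.2, 80.0), 0), ((0.3, 90.0), 0), ((0.1, 70.0), 0)]
model = fit(train)
```

a source would score each neighbour with `forward_probability` and pick the highest-scoring one as relay, which matches the behaviour attributed to the intelligent model above.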
the proposed approach makes use of intelligent routing and would therefore help deliver data reliably and accurately to the integrated healthcare infrastructure, for proper care of the patients. the proposed framework for e-healthcare is shown in figure 10. as the proposed algorithm saves energy, sensor-enabled healthcare devices can work for longer durations, and easy deployment and data analysis are possible thanks to iot integration [34] [35] [36] [37] [38]. according to the proposed architecture, there can be many different kinds of sensor nodes, such as smart wearables and sensors collecting health data like temperature, heartbeat, number of steps taken every day, sleep patterns, etc.; these factors correlate with different existing diseases. a key benefit of integrating iot and wsn is that data collected with the help of the sensors are stored in the cloud. this health-record cloud may belong to a specific hospital or be a public-domain cloud, and healthcare professionals can access the cloud data in different ways to analyse them and provide feedback to a specific patient or group of patients.
in the recent covid-19 epidemic, telemedicine became one of the most popular uses of this platform. doctors also started e-consultations with patients and gained access to their health records through the patients' smart wearables. still, there are many challenges, and a lot of improvements are required. the proposed work contributes towards better energy efficiency of sensors, so that they can work for longer durations; these sensor data can then be integrated using iot and the cloud, as per the proposed approach shown in figure 10.
in this paper, we proposed a new routing protocol (iop) that intelligently selects the potential relay node using a naïve bayes classifier to achieve energy efficiency and reliability among sensor nodes. residual energy and distance were used to find the probability of a node becoming the next-hop forwarder. simulation results showed that the proposed iop improved the network lifetime, stability, and throughput of the sensor network. the proposed protocol ensured that nodes far away from the base station become relay nodes only when they have sufficient energy to perform this duty; additionally, a node midway between the source and destination has the highest probability of becoming a forwarder in a round. the simulation results showed that the proposed or scheme outperformed mdor and eeor in energy efficiency and network lifetime. future work will examine the possibility of ensuring secure data transmission intelligently over the network. the authors declare no conflict of interest.
an overview of evaluation metrics for routing protocols in wireless sensor networks
comparative study of opportunistic routing in wireless sensor networks
opportunistic routing protocols in wireless sensor networks
towards green computing for internet of things: energy oriented path and message scheduling approach. sustain
toward energy-oriented optimization for green communication in sensor enabled iot environments
greedi: an energy efficient routing algorithm for big data on cloud.
ad hoc netw
an investigation on energy saving practices for 2020 and beyond
opportunistic routing - a review and the challenges ahead
a revised review on opportunistic routing protocol
geographic random forwarding (geraf) for ad hoc and sensor networks: multihop performance
opportunistic multi-hop routing for wireless networks
optimal forwarder list selection in opportunistic routing
simple, practical, and effective opportunistic routing for short-haul multi-hop wireless networks
spectrum aware opportunistic routing in cognitive radio networks
energy-efficient opportunistic routing in wireless sensor networks
a trusted opportunistic routing algorithm for vanet
a novel socially-aware opportunistic routing algorithm in mobile social networks
opportunistic routing algorithm for relay node selection in wireless sensor networks
economy: a duplicate free opportunistic routing
mobile-aware service function chain migration in cloud-fog computing
service function chain orchestration across multiple domains: a full mesh aggregation approach
online learning offloading framework for heterogeneous mobile edge computing system
virtualization in wireless sensor networks: fault tolerant embedding for internet of things
traffic priority aware medium access control protocol for wireless body area network
an energy efficient opportunistic routing metric for wireless sensor networks
middle position dynamic energy opportunistic routing for wireless sensor networks
an intelligent opportunistic routing protocol for big data in wsns
recent advances in energy-efficient routing protocols for wireless sensor networks: a review
radio link quality estimation in wireless sensor networks: a survey
futuristic trends in network and communication technologies
futuristic trends in networks and computing technologies. communications in computer and information
handbook of wireless sensor networks: issues and challenges in current scenario's. lecture notes in networks and systems 121. proceedings of icric 2019
introduction on wireless sensor networks issues and challenges in current era. in handbook of wireless sensor networks: issues and challenges in current scenario's
congestion control for named data networking-based wireless ad hoc network
deployment and coverage in wireless sensor networks: a perspective
key: cord-284186-zf1w8ksm authors: suran, j. n.; latney, l. v.; wyre, n. r. title: radiographic and ultrasonographic findings of the spleen and abdominal lymph nodes in healthy domestic ferrets date: 2017-04-17 journal: j small anim pract doi: 10.1111/jsap.12680
objective: to describe the radiographic and ultrasonographic characteristics of the spleen and abdominal lymph nodes in clinically healthy ferrets. materials and methods: fifty‐five clinically healthy ferrets were prospectively recruited for this cross‐sectional study. three‐view whole body radiographs and abdominal ultrasonography were performed on awake (23 out of 55) or sedated (32 out of 55) ferrets. on radiographs, splenic and abdominal lymph node visibility was assessed. splenic thickness and echogenicity and lymph node length, thickness, echogenicity, number and presence of cyst‐like changes were recorded. results: the spleen was radiographically detectable in all ferrets. on ultrasound the spleen was hyperechoic to the liver (55 out of 55) and mildly hyperechoic (28 out of 55), isoechoic (15 out of 55) or mildly hypoechoic (12 out of 55) to the renal cortices. mean splenic thickness was 11.80 ± 0.34 mm. lymph nodes were radiographically discernible in 28 out of 55 ferrets and included caudal mesenteric and sublumbar nodes. an average of 9 ± 2 lymph nodes (mean ± standard deviation; mode 10) were identified in each ferret using ultrasound. a single large jejunal lymph node was identified in all ferrets and had a mean thickness of 5.28 ± 1.66 mm.
for other lymph nodes the mean thickness measurements plus one standard deviation were less than 4.4 mm (95% confidence interval: ≤ 3.72 mm). clinical significance: the information provided in this study may act as a baseline for evaluation of the spleen and lymph nodes in ferrets. radiography and ultrasonography are part of the standard of care in domestic ferrets; however, there are few reports describing imaging findings or anatomic variations, despite lymph node lesions such as reactive lymphadenopathy and lymphoma occurring commonly in ferrets (o'brien et al. 1996, paul-murphy et al. 1999, schwarz et al. 2003, kuijten et al. 2007, zaffarano 2010, garcia et al. 2011, eshar et al. 2013, mayer et al. 2014). radiographic and ultrasonographic references provide a baseline for clinical evaluations. in ferrets, there is a large jejunal lymph node in the mid-abdomen at the root of the mesentery which is commonly palpable in healthy individuals (paul-murphy et al. 1999). the jejunal lymph node is also known as the mesenteric lymph node or cranial mesenteric lymph node. according to the nomina anatomica veterinaria (wava-icvgan 2012), while the jejunal lymph nodes are part of the cranial mesenteric lymphocentre, carnivores lack a cranial mesenteric lymph node. therefore, although historically this lymph node has been identified as the mesenteric lymph node, it may be more appropriately termed the jejunal lymph node in ferrets, and will be referred to as such throughout this article. in both previous studies evaluating ferret lymph nodes with ultrasound, a single large jejunal lymph node was found in all animals (paul-murphy et al. 1999, garcia et al. 2011). with ultrasound, the jejunal lymph node was described as a round to ovoid structure with uniform echogenicity near the centre of the small intestinal mesentery and near the cranial and caudal mesenteric veins, surrounded by fat (paul-murphy et al. 1999).
the mean and standard deviation for mesenteric lymph node dimensions varied somewhat between the studies and were reported as 7·6 ± 2·0 mm thick by 12·6 ± 2·6 mm long and 5·3 ± 1·39 mm thick by 10·18 ± 2·36 mm long (paul-murphy et al. 1999, garcia et al. 2011). in the latter of the two ultrasound studies, the pancreaticoduodenal, splenic, gastric and hepatic lymph nodes were also examined (garcia et al. 2011). anatomic landmarks for those lymph nodes were similar to those previously described in cats (schreurs et al. 2008). pancreaticoduodenal lymph nodes were identified in 55%, splenic lymph nodes in 55%, gastric lymph nodes in 20% and hepatic lymph nodes in 5% of the 20 ferrets (garcia et al. 2011). lymph nodes were described as circular to elongate, hypoechoic structures surrounded by fat; some lymph nodes also had a faint echogenic halo. length measurements (mean ± standard deviation) for the pancreaticoduodenal lymph nodes were reported as 5·29 ± 1·32 mm, for the gastric lymph nodes as 7·7 ± 2·6 mm, and for the splenic lymph nodes as 5·93 ± 1·59 mm; thickness measurements were only provided for the mesenteric lymph node (garcia et al. 2011). in dogs and cats these and several other lymph nodes, such as the colic and medial iliac nodes, can be detected with ultrasound (d'anjou 2008, schreurs et al. 2008). whether all of the lymph nodes detected in other carnivore species could be ultrasonographically identified in ferrets has not been determined. while splenomegaly is common in ferrets, the imaging appearance of the spleen in clinically healthy ferrets has also not been previously described. potential causes for splenomegaly include extramedullary haematopoiesis, neoplasia (especially lymphoma), lymphoid or myeloid hyperplasia, hypersplenism and infectious diseases such as aleutian disease, systemic coronavirus infection, mycobacteriosis and cryptococcus (ferguson 1985, eshar et al. 2010, dominguez et al.
2011, morrisey & kraus 2012, pollock 2012, mayer et al. 2014, nakata et al. 2014, lindemann et al. 2016). the spleen also increases in size with age and after administration of anaesthetics (fox 2014, mayer et al. 2014). the goal of this prospective, cross-sectional study was to describe the characteristics of the spleen and abdominal lymph nodes on radiographs and with ultrasound in a sample of client-owned, clinically healthy domestic ferrets (mustela putorius furo). healthy, client-owned ferrets between four months and four years of age were prospectively recruited at the matthew j ryan veterinary hospital of the university of pennsylvania between february 2013 and october 2013. for the calculation of sample sizes, the standard deviations of previously reported measurements of abdominal viscera in clinically healthy ferrets were compared, including gross renal measurements (1·25, 1·5 mm), ultrasonographic adrenal thickness (0·5, 0·6 mm) and ultrasonographic jejunal lymph node thickness (1·39, 2·0 mm) (o'brien et al. 1996, paul-murphy et al. 1999, kuijten et al. 2007, garcia et al. 2011, krautwald-junghanns et al. 2011, fox 2014). using an averaged standard deviation for adrenal gland thickness of 0·55 mm and 90 to 95% confidence intervals (ci) of ± 0·25 mm, the number of individuals needed to detect significant differences in organ measurements would be 14 (90% ci) to 19 (95% ci) (hulley 2007). using an averaged standard deviation of the renal measurements (1·375 mm) and 90 to 95% ci of ± 0·6 mm gives the same results (hulley 2007). these standard deviation values were chosen because of the relatively small differences in the respective reported values. as adrenal and renal size has been shown to vary based on sex, we estimated that a total of 19 ferrets for each sex would be needed, requiring a total recruitment of at least 38 clinically healthy ferrets (neuwirth et al. 1997, eshar et al. 2013).
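the sample-size arithmetic above can be reproduced directly; a sketch assuming a two-sided normal confidence interval of half-width e around a mean:

```python
from math import ceil
from statistics import NormalDist

# Sketch of the sample-size calculation: n = ceil((z * sd / E)^2) for a
# two-sided normal CI of half-width E, given an (averaged) standard
# deviation sd. This assumes the standard normal-approximation formula;
# the cited reference (hulley 2007) may use a slightly different method.
def sample_size(sd, half_width, confidence):
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil((z * sd / half_width) ** 2)
```

with the adrenal-thickness inputs from the text, `sample_size(0.55, 0.25, 0.90)` and `sample_size(0.55, 0.25, 0.95)` reproduce the reported 14 and 19 animals.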
additional ferrets were able to be enrolled because of the success of recruitment and remaining funding. this study was approved by and conducted in accordance with the institutional animal care and use committee - privately owned animal protocol committee (iacuc-poap #804586); informed owner consent was received for all procedures. a total of 112 presumably healthy ferrets were actively recruited; 57 ferrets were subsequently excluded for failure to meet the inclusion criteria. ferrets determined to be clinically healthy based on history, physical examination performed by an exotic animal veterinarian, complete blood count, chemistry panel, urinalysis and follow-up owner contact were included in the study. all procedures were performed on the same day as diagnostic imaging. owners were contacted regarding the health of their ferret following the study visit (mean 15 days, range 6 to 28 days) in an attempt to exclude those with occult illness at the time of imaging. exclusion criteria included a history of either transient illness in the past six months or a long-term illness, administration of medications or the presence of any hormonal implant, any gross physical examination or clinicopathologic abnormality, and manifestation of illness reported at any point in an individual's follow-up. individuals were also excluded if they had gross radiographic or ultrasonographic abnormalities based on previously published guidelines for the adrenal glands in ferrets and based on our experience in ferrets and in other species (o'brien et al. 1996, neuwirth et al. 1997, paul-murphy et al. 1999, kuijten et al. 2007, garcia et al. 2011, krautwald-junghanns et al. 2011, eshar et al. 2013).
ferrets were not excluded if a cutaneous mast cell tumour was the sole abnormality (n=3); cutaneous mast cell tumours are typically focal benign lesions in ferrets, with visceral involvement and malignancy being rare (orcutt & tater 2012). reasons for exclusion were summarised. diagnostic imaging was performed and evaluated by a single board-certified radiologist. three-view whole body radiographs (including the thorax, abdomen and pelvis) were obtained and included right lateral, left lateral and ventrodorsal projections (canon lanmix cxdi-50g detector; sound-eklin). abdominal ultrasonography was performed using an 8 to 18 mhz linear transducer (ge logiq s8 vet, sound-eklin), with the exception of five ferrets at the initiation of the study which were imaged with a 6 to 15 mhz linear transducer (ge medical logiq 9 ultrasound imaging system; general electric medical systems) due to equipment changes at our facility. ferrets were scanned in dorsal recumbency. the ventral abdomen was shaved and warmed coupling gel was used. ferrets were not fasted prior to imaging. the spleen was radiographically identified using similar guidelines as in dogs and cats (armbrust 2009). whether the ventral extremity of the spleen was visible on the lateral radiographic projections along the ventral abdomen was recorded. on ultrasound, the splenic echotexture (homogeneous, mottled, nodular) and relative echogenicity (compared to the hepatic parenchyma and compared to the renal cortices) were recorded. with the spleen imaged in a longitudinal plane, parallel to the long axis, such that the splenic vein branches were evident along the mesenteric margin of the spleen, maximal thickness of the spleen was measured from the mesenteric margin to the anti-mesenteric margin (fig 1a).
radiographs were reviewed for distinguishable lymph nodes, which were identified as small, round to oblong, soft-tissue opaque structures in the expected anatomic location of a lymph node and not associated with other visceral structures; these included sublumbar and caudal mesenteric nodes. the term "sublumbar lymph node" generally refers to lymph nodes from the iliosacral lymphocentre, including the medial iliac and internal iliac nodes, which may not be radiographically distinguishable from each other. the presence of a lymph node, number of nodes, maximal length and maximal thickness (perpendicular to the length measurement) were recorded. abdominal lymph nodes were ultrasonographically searched for and identified based on canine and feline anatomic references (bezuidenhout 1993, d'anjou 2008, schreurs et al. 2008, constantinescu & schaller 2012, wava-icvgan 2012). lymph nodes evaluated included the hepatic, pancreaticoduodenal, splenic, gastric, jejunal (mesenteric or cranial mesenteric), caudal mesenteric (left colic), colic, ileocolic (right colic), lumbar aortic (peri-aortic) and medial iliac lymph nodes (fig 2). localisation of the renal, internal iliac (hypogastric) and sacral nodes was attempted. the presence of a lymph node, number of nodes, maximal length, maximal thickness (perpendicular to the length measurement), echogenicity, identification of a hyperechoic hilus and cyst-like regions within lymph nodes were recorded.
fig 1. (a) ultrasound image of the spleen in a ferret with a homogeneous splenic echotexture. maximal splenic thickness measurements (callipers) were performed on images of the spleen parallel to its longitudinal axis and extended from the mesenteric margin to the anti-mesenteric margin. (b) ultrasound image of the spleen in a ferret with a nodular echotexture; small, round, ill-defined, hypoechoic regions were identifiable throughout the splenic parenchyma.
echogenicity was classified as homogeneous, hyperechoic hilus with a hypoechoic rim, and heterogeneous (with or without a discernible hilus). the short-to-long axis ratio (s:l) was calculated for each lymph node. lymph node shape was classified as rounded (s:l > 0.5) or elongate (s:l ≤ 0.5), similar to prior studies (de swarte et al. 2011, beukers et al. 2013). due to the u-shape or further serpentine shape of the large jejunal lymph nodes, length measurements for those lymph nodes were achieved by performing segmental linear measurements along the entire length of the node, then summing these values to achieve a total length (fig 3). all measurements were performed in duplicate; measurements for each individual were averaged for analyses. for the procedures, ferrets were manually restrained or sedated. manually restrained ferrets were given a liquid oil supplement (ferre-tone skin and coat supplement, 8 in 1 pet products) as a distraction and as a treat. ferrets that resisted manual restraint or demonstrated escape behaviours during manual restraint were sedated. for sedation, 0.25 mg/kg midazolam (sagent pharmaceuticals) and 0.25 mg/kg butorphanol tartrate (torbugesic; fort dodge animal health) were administered im (intramuscularly), and reversed respectively with im 0.01 mg/kg flumazenil (hikma farmaceutica) and 0.02 mg/kg naloxone (hospira inc) upon completion of imaging. all procedures were first discussed with owners and informed consent was obtained. this study was approved by the university of pennsylvania institutional animal care and use committee. statistical analyses employed were predominantly descriptive; the mean, standard deviation (sd) and 95% ci were calculated.
fig 2. schematic illustration of intra-abdominal lymph nodes and major vessels. lymph nodes: 1 = hepatic, 2 = pancreaticoduodenal, 3 = gastric, 4 = splenic, 5 = cranial mesenteric group, 6 = jejunal, 7 = caudal mesenteric, 8 = lumbar aortic, 9 and 9´ = medial iliac. vessels: ao = aorta, cmv = cranial mesenteric vein, cvc = caudal vena cava, dci = deep circumflex iliac vessels, ei = external iliac vessels, pv = portal vein, sv = splenic vein. other landmarks: c = colon, d = duodenum, j = jejunoileum, s = stomach, sp = spleen, lk = left kidney, rk = right kidney. (from: atlas of small animal ultrasonography by penninck & d'anjou (2008). reproduced with permission of blackwell pub in the format journal/magazine via copyright clearance center. minor changes were made to the original image.)
fig 3. ultrasound image of a jejunal lymph node. a single large jejunal lymph node was identified in all ferrets. the lymph node has a hyperechoic hilus and a hypoechoic rim. because of the non-linear shape, length measurements were obtained by adding segmental linear measurements (callipers) along the long axes of the lymph node.
fifty-five ferrets were included in this study. forty-two ferrets (76%) originated from commercial breeders and 13 (24%) from private breeders. thirty-four ferrets were male (eight intact and 26 neutered males) and 21 were female (one intact and 20 neutered females). all sexually intact ferrets (9/55) were from private breeders. of the neutered ferrets, 4 out of 20 neutered females were from private breeders and were neutered at a relatively later age (up to 1·8 years old) than those from commercial breeders (typically neutered before six weeks of age). with regards to body weight, intact males generally weighed more than neutered males, which generally weighed more than neutered females (table 1). the single intact female in this study weighed more than the neutered females. age at presentation ranged from four months to 4·2 years (mean ± sd: 1·9 ± 1·0 years). sedatives were administered to 32 out of 55 ferrets (58%) to facilitate the procedures. there were no complications associated with sedation or any of the procedures. the spleen was radiographically identifiable in all ferrets.
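the shape and length conventions described in the methods (rounded when s:l > 0.5, elongate otherwise; jejunal node length as the sum of segmental measurements) can be sketched as:

```python
# Hedged sketch of the measurement conventions from the methods section.
def shape_class(thickness_mm, length_mm):
    """Classify node shape from the short-to-long axis ratio (s:l)."""
    ratio = thickness_mm / length_mm
    return "rounded" if ratio > 0.5 else "elongate"

def jejunal_length(segments_mm):
    """Total length of a curved (u-shaped or serpentine) jejunal node,
    obtained by summing segmental linear measurements along the node."""
    return sum(segments_mm)
```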
on ventrodorsal radiographs, the craniodorsal extremity was seen as a triangular soft-tissue opaque structure in the left side of the abdomen along the body wall, caudal to the stomach and craniolateral to the left kidney, and partly summating with the stomach and left kidney. the spleen then variably extended caudally or caudomedially as a broad curvilinear to crescentic soft-tissue structure. on lateral radiographic projections, the craniodorsal extremity of the spleen could be seen as a triangular soft-tissue opaque structure in the craniodorsal abdomen, caudal to the gastric fundus. the spleen could then sometimes be seen extending caudoventrally in the mid-abdomen as a broad, curvilinear soft-tissue opaque structure summating with the intestines. on lateral radiographic projections, the spleen could be seen along the ventral abdomen in 21 out of 55 (38%) ferrets; in 4 out of 21 (19%) this was seen only on the right lateral projection and in 5 out of 21 (24%) only on the left lateral projection. the spleen was more frequently visible along the ventral abdomen in male ferrets (17 out of 21; 81%) than female ferrets (4 out of 21; 19%) and in sedated (16 out of 21; 76%) than non-sedated (5 out of 21; 24%) ferrets. on ultrasound, the spleen was identified in the left lateral abdomen. the spleen was hyperechoic relative to the liver (55 out of 55). the spleen was mildly hyperechoic to the renal cortices in 28 out of 55 (51%) ferrets, isoechoic in 15 out of 55 (27%) and mildly hypoechoic in 12 out of 55 (22%). it had a homogeneous echotexture in 38 out of 55 (69%), a mildly mottled echotexture in 12 out of 55 (22%), and ill-defined, round, hypoechoic nodules in 5 out of 55 (9%) (fig 1). the three ferrets with presumptively incidental cutaneous mast cell tumours all had a homogeneous splenic echotexture. with ultrasound the mean splenic thickness measurement was 11·80 ±0·34 mm (95% ci: 11·12 to 12·49 mm; range 8·16 to 17·70 mm).
the mean thickness was slightly greater in ferrets in which the spleen was radiographically detected along the ventral abdomen (12·75 ±0·63 mm) compared to those in which it was not (11·24 ±0·37 mm). lymph nodes were radiographically discernible in 28 out of 55 (51%) ferrets. the radiographic frequency of lymph node detection and measurements are summarised in table 2. caudal mesenteric lymph nodes were seen on lateral abdominal radiographs in 24 out of 55 (44%) ferrets. caudal mesenteric lymph nodes were identified as well-defined, oblong, soft-tissue opaque structures in the caudal abdomen dorsal and immediately adjacent to the descending colon at the level of l5 to l6 (fig 4). one caudal mesenteric lymph node was distinguishable in 21 ferrets and two nodes were distinguishable in three ferrets. sublumbar lymph nodes were seen on lateral abdominal radiographs in 6 out of 55 (11%) ferrets. sublumbar lymph nodes were identified as well-defined, oblong, soft-tissue opaque structures in the caudal retroperitoneal space ventral to l5 and l6 (fig 5). one sublumbar lymph node was distinguishable in five ferrets and two nodes were distinguishable in one ferret. as the medial iliac lymph node was the only iliosacral lymphocentre node that was ultrasonographically identified, the sublumbar lymph nodes that were radiographically detected presumably represent medial iliac lymph nodes. lymph nodes were found with ultrasound in the expected locations with corresponding anatomic landmarks (fig 2) as previously reported for dogs and cats (bezuidenhout 1993, d'anjou 2008, schreurs et al. 2008, constantinescu & schaller 2012). detected lymph nodes included jejunal, pancreaticoduodenal, hepatic, caudal mesenteric, splenic, gastric, medial iliac and lumbar aortic lymph nodes. small lymph nodes were also seen in the mid-abdomen and were difficult to differentiate as ileocolic lymph nodes, colic lymph nodes or additional smaller jejunal lymph nodes.
these were often seen in close proximity and slightly cranial to the single large jejunal lymph node. because these small nodes were not distinguishable as specific lymph nodes, they were grouped and termed cranial mesenteric lymph nodes. ultrasonographic lymph node thickness and length measurements are summarised in tables 3 and 4, respectively. lymph node thickness measurements are also graphically depicted in fig 6. a mean ±sd of 9 ±2 lymph nodes (mode 10 lymph nodes; range 5 to 14 lymph nodes) were identified in each ferret. a single large jejunal lymph node was identified in all ferrets. all lymph nodes were oblong, with the exception of the single large jejunal lymph node, which was u-shaped or serpentine (fig 3). most lymph nodes were hypoechoic relative to the surrounding mesenteric fat with a more echogenic central hilus (fig 3). homogeneous lymph nodes were either hypoechoic or mildly hyperechoic. heterogeneous lymph nodes were mildly heterogeneous and mildly hyperechoic. several of the heterogeneous lymph nodes were predominantly hyperechoic with discontinuous hypoechoic marginal regions, not forming a complete hypoechoic rim. anechoic cyst-like regions (fig 7) were identified in 31 out of a total of 492 lymph nodes (6·3%) evaluated, and were most frequently detected in the pancreaticoduodenal lymph nodes (10 out of 31; 32·3% of cystic lymph nodes) followed by the hepatic lymph nodes (5 out of 31; 16·1% of cystic lymph nodes). lymph nodes with cyst-like changes were present in 14 out of 55 (25·5%) ferrets. cyst-like changes were present in one lymph node in seven ferrets, two lymph nodes in two ferrets, three lymph nodes in four ferrets and eight lymph nodes in one ferret. the mean age of ferrets with cyst-like regions was 2·79 ±0·85 years (range: 1·5 to 4·2 years), compared to a mean age of 1·53 ±0·85 years (range: 4 months to 3·7 years) for ferrets without detected cyst-like regions.
the results presented in this study provide the most comprehensive evaluation of the spleen and abdominal lymph nodes with radiographs and ultrasound in clinically healthy ferrets to date.
table 3. ultrasonographic features of abdominal lymph nodes in ferrets. lymph node shape was recorded as round or elongate based on the short-to-long axis ratio (s:l); the mode shape is provided in the table. lymph nodes with a s:l greater than 0.5 were characterised as round, while those with a s:l less than or equal to 0.5 were characterised as elongate. the number of ferrets in which cyst-like changes were detected or in which a hyperechoic hilus was appreciable is also reported. lymph node echogenicity was recorded as either homogeneous (either hypoechoic or mildly hyperechoic), hilar (having a hypoechoic rim and hyperechoic hilus) or heterogeneous (mildly heterogeneous and hyperechoic, with or without a discernible hilus).
the spleen was radiographically detectable in all ferrets, and had a similar appearance to that previously described in dogs and cats (armbrust 2009). in cats, the spleen is considered enlarged if the body or ventral extremity is visible along the ventral abdomen on lateral radiographs. aside from this guideline, radiographic assessment of splenic size is subjective in cats and dogs (armbrust 2009). the spleen was radiographically visible along the ventral abdomen in 38% of the clinically healthy ferrets in this study, so this criterion cannot be used as a general guideline to determine whether the spleen is enlarged. additionally, as the body and ventral extremity of the spleen are mobile, splenic position within the abdomen may affect the radiographic appearance. specific radiographic guidelines to determine enlargement were not determined in this study; subjective radiographic assessment of the spleen is therefore warranted in ferrets.
gross splenic measurements have been previously reported as 5·10 cm length × 1·80 cm width × 0·80 cm thickness (evans & an 2014). the thickness measurements obtained in this study using ultrasound were all greater than previously reported, with the smallest thickness measurement in this study being 8·16 mm. these discrepancies in measurements may result from differences between in vivo and post-mortem sampling. signalment and body weight differences may also contribute, but this information was not available for the ferrets from which the gross measurements were derived. on ultrasound, the spleen was hyperechoic to the liver. the relative echogenicity of the spleen compared to the renal cortices was variable, but most frequently the spleen was hyperechoic to the renal cortex. in cats, fat may be deposited in the renal cortex and results in increased cortical echogenicity; this is associated with sex hormones in cats and is not associated with body weight (yeager & anderson 1989, maxie 1993). it is unknown whether renal cortical fat deposition also occurs in ferrets. the spleen had a homogeneous echotexture in 69% of ferrets. a mildly mottled echotexture or ill-defined small hypoechoic regions were seen in 31% of ferrets. causes for a non-homogeneous echogenicity may include potentially incidental etiologies such as nodular hyperplasia or extramedullary hematopoiesis, which commonly occur in adult ferrets, although other subclinical pathologies, such as lymphoma or splenitis, cannot be excluded. ferrets with a non-homogeneous splenic echogenicity were not excluded, as the ferrets remained clinically healthy throughout the study and follow-up period (levaditi et al. 1959, mayer et al. 2014). this is the first study to describe the radiographic appearance of presumed normal lymph nodes in ferrets; those detected included the caudal mesenteric and sublumbar lymph nodes. at least one lymph node was radiographically discernible in 51% of ferrets in this study.
radiographic lymph node measurements were generally greater than ultrasound measurements. the differences in measurements between imaging modalities are likely due to magnification on radiographs. silhouetting or superimposition of adjacent lymph nodes may also contribute to larger measurements on radiographs. additionally, measurements may be affected by patient positioning. the provided measurements are intended as a descriptor for lymph node dimensions. the potential clinical utility of radiographic lymph node measurements is questionable, as the use of measurements for radiographic image interpretation has not been found to be more accurate than subjective image interpretation (lamb & nelson 2015). multiple lymph node groups were detectable with ultrasound. similar to previous studies, a single large jejunal lymph node (also known as mesenteric or cranial mesenteric lymph node) was detected in all ferrets (paul-murphy et al. 1999, garcia et al. 2011). the detection frequencies of the hepatic, pancreaticoduodenal, splenic and gastric lymph nodes were much higher than previously reported (garcia et al. 2011). because of the overall small patient size, detection and differentiation of lymph nodes can be difficult. with the small size of ferrets, structures in the abdomen are relatively close together; additionally, compression of the abdomen and abdominal viscera which occurs during ultrasonography may further compound this. small lymph nodes in the mid-abdomen just cranial to the single large jejunal lymph node were difficult to differentiate as ileocolic lymph nodes, additional colic lymph nodes (possibly middle colic lymph nodes), additional smaller jejunal lymph nodes or a combination thereof.
fig 7. anechoic cyst-like changes in a gastric lymph node. the lymph node has a hyperechoic hilus and a hypoechoic rim and contains a lobular, anechoic cyst-like region.
some of these nodes were suspected to be ileocolic lymph nodes, although the ileocolic junction is not readily identifiable in ferrets, which complicates identification of lymph nodes as ileocolic lymph nodes (evans & an 2014). as the specific lymph node location could not be determined for these small lymph nodes and they likely belonged to the cranial mesenteric lymphocentre, they were termed cranial mesenteric lymph nodes. in carnivores, the cranial mesenteric lymphocentre is comprised of the ileocolic, colic and jejunal lymph nodes (bezuidenhout 1993, constantinescu & schaller 2012). the clinical relevance and importance of separating these small mid-abdominal lymph nodes into ileocolic, colic and jejunal lymph nodes is not known. based on the results of this study, the jejunal, pancreaticoduodenal, hepatic and caudal mesenteric lymph nodes can be routinely detected with ultrasound in most ferrets. the pancreaticoduodenal and hepatic lymph nodes were detected in 96·4% (53 out of 55) and 94·5% (52 out of 55) of ferrets, respectively. it is possible that these lymph nodes were not detected in those two and three ferrets, respectively, due to patient disposition and human error. for the caudal mesenteric lymph nodes, 8 out of 9 individuals in which the caudal mesenteric lymph nodes were not ultrasonographically detected were imaged at the initiation of this study. it is suspected that the caudal mesenteric lymph nodes were not detected, as opposed to not present, in those individuals, and after a steep learning curve these lymph nodes were routinely identified. the medial iliac lymph nodes were detected at a relatively high frequency as well; however, their small size (specifically thickness) made them more difficult to detect than others. with the exception of the single large jejunal lymph node, the mean values plus one sd for thickness measurements were all less than 4·4 mm and the upper limits of the 95% ci were less than or equal to 3·72 mm.
although previous studies using ultrasound reported detection of a single large jejunal lymph node, an anatomical reference for ferrets mentions both a left and a right lymph node (paul-murphy et al. 1999, garcia et al. 2011, evans & an 2014). in this study, a single large jejunal lymph node was identified with ultrasound in all ferrets; additional smaller lymph nodes in the vicinity of the jejunal lymph node may have represented smaller jejunal lymph nodes, ileocolic lymph nodes and/or colic lymph nodes. the mean jejunal lymph node thickness in this study (5·28 ±1·66 mm) was similar to the previously reported mean of 5·3 mm by garcia et al. (2011), but was less than the previously reported mean of 7·6 mm in the study by paul-murphy et al. (1999). in the study by paul-murphy et al. (1999), cytology was performed on a portion of the jejunal lymph nodes and demonstrated relatively high numbers of eosinophils in 55% of sampled nodes, which may represent a normal variant in ferrets or may have represented occult disease (paul-murphy et al. 1999). cytology was not performed in this study or in that by garcia et al. (2011). the mean jejunal lymph node length measurement (26·34 ±7·50 mm) was greater than the previously reported mean values (12·6 ±2·6 and 5·3 ±1·39 mm) (paul-murphy et al. 1999, garcia et al. 2011); this is suspected to be due to differences in methodology. because the jejunal lymph node was serpentine, segmental linear measurements were summed to determine the total length. although the specific technique was not described in the previous studies, presumably a single linear measurement was previously performed along the long axis of the largest part of the lymph node. this presumed difference in technique would account for the length measurements in this study being greater than previously reported.
in general, lymph node thickness measurements and short-to-long axis ratios may have more clinical utility than evaluation of length measurements alone. when lymph nodes enlarge, they tend to become more rounded, having a greater short-to-long axis ratio; this change is attributed to a greater increase in thickness measurements compared to length measurements (de swarte et al. 2011). in humans and dogs, evaluation of the short-to-long axis ratio may assist in differentiating benign from malignant neoplastic lymphadenopathies; a greater short-to-long axis ratio is associated with neoplastic lymphadenopathy, while a lesser short-to-long axis ratio is associated with normal and reactive or inflammatory lymphadenopathies (nyman et al. 2004, de swarte et al. 2011). cyst-like changes were identified in 6·3% (31 out of 492) of all lymph nodes evaluated in this study and were more frequently identified in older ferrets. this finding is of unknown clinical significance, particularly since cyst-like changes were recognised in 25·5% (14 out of 55) of the clinically healthy ferrets of this study. cyst-like changes are suspected to represent lymphatic sinus ectasia (lymphangiectasia, lymphatic cysts, cystic lymphatic ectasia or sinus dilation). hyperplastic lymph nodes are common in older ferrets secondary to underlying gastrointestinal inflammation (antinoff & williams 2012). overt gastrointestinal abnormalities were not ultrasonographically detected in ferrets included for data analysis, and ferrets with clinical signs referable to the gastrointestinal tract were excluded; however, lymph node hyperplasia secondary to subclinical gastrointestinal or non-gastrointestinal pathology cannot be excluded. in rats and mice, lymphatic sinus ectasia is associated with lymphoid atrophy and can be seen in ageing animals (sainte-marie et al. 1997, elmore 2006). in humans, cyst-like areas in lymph nodes can be seen with metastatic neoplasia, particularly secondary to necrosis.
cystic necrosis results in an anechoic area within the lymph node and is commonly found in metastatic lymph nodes in humans. coagulative nodal necrosis is uncommonly seen, results in an echogenic area and can be seen in malignant and inflammatory lymph nodes (ahuja & ying 2003). nodules comprised of neoplastic cells may also have a pseudocystic appearance on ultrasound (ahuja & ying 2003). less frequently, nodal metastasis may produce a true cyst with an epithelial lining (verma et al. 1994). further studies with histopathologic evaluation of cystic lymph nodes and evaluation of the clinical significance of this change are warranted. because of the small patient size, adrenal glands can be mistaken for lymph nodes and vice versa. the hepatic lymph nodes may be mistaken for the right adrenal gland, for example. the portal vein and caudal vena cava can be seen relatively close together in the cranial abdomen of ferrets. additionally, the caudal vena cava is easily compressed by pressure from the ultrasound transducer during ultrasonography. although both the portal vein and caudal vena cava may be seen adjacent to the hepatic lymph nodes, the hepatic lymph nodes are more closely associated with the portal vein, while the right adrenal gland is in close apposition to the caudal vena cava. careful evaluation of lymph nodes relative to their anatomic landmarks is therefore warranted, as these landmarks are crucial for differentiating lymph nodes from each other and from the adrenal glands. the major limitations of the study included external validity (generalisability), selection bias, misclassification bias and human error (reliability and internal consistency). the ferrets in this study may not be representative of the general population; most ferrets in this study, in common with most in the usa, originated from a single large commercial breeder. additionally, there were very few sexually intact ferrets.
the generalisability of the findings in this study to ferrets from geographic locations outside of the usa is difficult to judge. another limitation is that there was no gold standard to confirm a disease-free status. organ sampling was not within the scope of this study. as an imperfect surrogate, we used the individual's history, physical exam, blood work, urinalysis and follow-up owner contact. diagnostic imaging findings did result in some patients being excluded, which may bias the data and result in exclusion of normal variants; however, based on our experiences with ultrasound in ferrets and in other species, those individuals were considered very likely to be abnormal. the included ferrets were clinically healthy throughout the study and follow-up interim. inter-observer and intra-observer differences were not evaluated. in summary, the information provided in this study may act as a baseline for evaluation of the spleen and lymph nodes in ferrets. on radiographs the spleen was visible in all ferrets, and the sublumbar or caudal mesenteric lymph nodes were discernible in 51% of ferrets. with ultrasound the spleen was hyperechoic to the liver and most often had a homogeneous or mildly mottled echotexture; additionally, multiple lymph nodes were identified. the jejunal, pancreaticoduodenal, hepatic and caudal mesenteric lymph nodes can be routinely detected with ultrasound. the jejunal lymph node was seen in all ferrets and had a mean thickness of 5·28 ±1·66 mm. the mean thickness measurements plus one sd for all other lymph nodes were less than 4·4 mm. additional studies evaluating the clinical utility and predictive validity of the provided measurements are warranted.
references:
abdominal cavity, lymph nodes, and great vessels
sonography of neck lymph nodes. part ii: abnormal lymph nodes
neoplasia. in: ferrets, rabbits, and rodents
the spleen. in: bsava manual of canine and feline abdominal imaging
computed tomographic characteristics of presumed normal canine abdominal lymph nodes
the lymphatic system. in: miller's anatomy of the dog
illustrated veterinary anatomical nomenclature
abdominal radiographic and ultrasonographic findings in ferrets (mustela putorius furo) with systemic coronavirus infection
histopathology of the lymph nodes
disseminated, histologically confirmed cryptococcus spp infection in a domestic ferret
radiographic kidney measurements in north american pet ferrets (mustela furo)
anatomy of the ferret
idiopathic hypersplenism in a ferret
normal clinical and biological parameters
anatomia ultrassonográfica dos linfonodos abdominais de furões europeus hígidos
appendix 6d: sample size for a descriptive study
funding was provided by an institutional grant from the university of pennsylvania and a donation from abaxis. the authors would like to acknowledge and thank bruce williams for consultation on histopathology; thomas tyson for medical illustration; mary baldwin, alisa rassin and max emanuel for technical assistance and scales and tails rescue for assistance with recruitment. no conflicts of interest have been declared.
key: cord-199630-2lmwnfda authors: ray, sumanta; lall, snehalika; mukhopadhyay, anirban; bandyopadhyay, sanghamitra; schonhuth, alexander title: predicting potential drug targets and repurposable drugs for covid-19 via a deep generative model for graphs date: 2020-07-05 doc_id: 199630 cord_uid: 2lmwnfda
coronavirus disease 2019 (covid-19) has been creating a worldwide pandemic situation. repurposing drugs, already shown to be free of harmful side effects, for the treatment of covid-19 patients is an important option in launching novel therapeutic strategies. therefore, reliable molecule interaction data are a crucial basis, where drug-/protein-protein interaction networks establish invaluable, year-long carefully curated data resources.
however, these resources have not yet been systematically exploited using high-performance artificial intelligence approaches. here, we combine three networks, two of which are year-long curated, and one of which, on sars-cov-2-human host-virus protein interactions, was published only most recently (30th of april 2020), raising a novel network that puts drugs, human and virus proteins into mutual context. we apply variational graph autoencoders (vgaes), representing the most advanced deep learning based methodology for the analysis of data that are subject to network constraints. reliable simulations confirm that we operate at utmost accuracy in terms of predicting missing links. we then predict hitherto unknown links between drugs and human proteins against which virus proteins preferably bind. the corresponding therapeutic agents present splendid starting points for exploring novel host-directed therapy (hdt) options. the pandemic of covid-19 (coronavirus disease-2019) has affected more than 6 million people. so far, it has caused about 0.4 million deaths in over 200 countries worldwide (https://coronavirus.jhu.edu/map.html), with numbers still increasing rapidly. covid-19 is an acute respiratory disease caused by a highly virulent and contagious novel coronavirus strain, sars-cov-2, which is an enveloped, single-stranded rna virus 1. sensing the urgency, researchers have been relentlessly searching for possible therapeutic strategies in the last few weeks, so as to control the rapid spread. in their quest, drug repurposing establishes one of the most relevant options, where drugs that have been approved (at least preclinically) for fighting other diseases are screened for their possible alternative use against the disease of interest, which is covid-19 here. because they were shown to lack severe side effects before, risks in the immediate application of repurposed drugs are limited.
in comparison with de novo drug design, repurposing drugs offers various advantages. most importantly, the reduced time frame in development suits the urgency of the situation in general. furthermore, most recent and most advanced artificial intelligence (ai) approaches have boosted drug repurposing enormously in terms of throughput and accuracy. finally, it is important to understand that the 3d structures of the majority of viral proteins have remained largely unknown, which raises the obstacles for direct, structure-based approaches even higher. the foundation of ai based drug repurposing are molecule interaction data, optimally reflecting how drugs, viral and host proteins get into contact with each other. during the life cycle of a virus, the viral proteins interact with various human proteins in the infected cells. through these interactions, the virus hijacks the host cell machinery for replication, thereby affecting the normal function of the proteins it interacts with. to develop suitable therapeutic strategies and design antiviral drugs, a comprehensive understanding of the interactions between viral and human proteins is essential 2. when watching out for drugs that can be repurposed to fight the virus, one has to realize that targeting single virus proteins easily leads to the viruses escaping the (rather simpleminded) attack by raising resistance-inducing mutations. therefore, host-directed therapy (hdt), which targets the human proteins the virus depends on, establishes the more robust option. in this work, (1) we link existing high-quality, long-term curated and refined, large scale drug/protein -protein interaction data with (2) molecular interaction data on sars-cov-2 itself, raised only a handful of weeks ago, (3) exploit the resulting overarching network using most advanced, ai boosted techniques (4) for repurposing drugs in the fight against sars-cov-2 (5) in the frame of hdt based strategies.
as for (3)-(5), we will highlight interactions between sars-cov-2 proteins and the human proteins important for the virus to persist, using most advanced deep learning techniques that cater to exploiting network data. we are convinced that many of the fairly broad spectrum of drugs we raise will be amenable to developing successful hdt's against covid-19. in the following, we will first describe the workflow of our analysis pipeline and the basic ideas that support it. we proceed by carrying out a simulation study that proves that our pipeline accurately predicts missing links in the encompassing drug - human protein - sars-cov-2-protein network that we raise and analyze. namely, we demonstrate that our (high-performance, ai supported) prediction pipeline accurately re-establishes links that had been explicitly removed before. this provides sound evidence that the interactions that we predict in the full network most likely reflect true interactions between molecular interfaces. subsequently, we continue with the core experiments. we predict links to be missing in the full (without artificially having removed links), encompassing drug - human protein - sars-cov-2-protein network, raised by combining links from year-long curated resources on the one hand and most recently published covid-19 resources on the other hand. as per our simulation study, a large fraction, if not the vast majority, of the predictions establish true, hence actionable interactions between drugs on the one hand and sars-cov-2 associated human proteins (hence of use in hdt) on the other hand.
figure 1. overall workflow of the proposed method: the three networks sars-cov-2-host ppi, human ppi, and drug-target network (panel-a) are mapped by their common interactors to form an integrated representation (panel-b).
the neighborhood sampling strategy node2vec converts the network into fixed-size low dimensional representations that preserve the properties of the nodes belonging to the three major components of the integrated network (panel-c). the resulting feature matrix (f) from the node embeddings and adjacency matrix (a) from the integrated network are used to train a vgae model, which is then used for prediction (panel-d).
for the purposes of high-confidence validation, we carry out a literature study on the overall 92 drugs we put forward. for this, we inspect the postulated mechanism-of-action of the drugs in the frame of several diseases, including sars-cov and mers-cov driven diseases in particular. see figure 1 for the workflow of our analysis pipeline and the basic ideas that support it. we will describe all important steps in the paragraphs of this subsection. this reduces the training time compared to the general graph autoencoder model. we tested the model performance for different numbers of sampled nodes, keeping track of the area under the roc curve (auc), average precision (ap) score, and model training time in the frame of a train-validation-test split at proportions 8:1:1. table 1 shows the performance of the model for sampled subgraph sizes n_s = 7000, 5000, 3000, 2500 and 1000. for 5000 sampled nodes, the model's performance is sufficiently good with respect to its training time and validation auc and ap scores. the average test roc-auc and ap scores of the model for n_s = 5000 are 88.53 ± 0.03 and 84.44 ± 0.04. to assess the efficacy of the model in discovering the existing edges between only cov-host and drug nodes, we train the model (with n_s = 5000) on an incomplete version of the graph from which the links between cov-host and drugs have been removed. we further compute the feature matrix f based on the incomplete graph, and use it. the test set consists of all the previously removed edges.
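the held-out-edge evaluation reported above (roc-auc and ap over a set of removed true edges and sampled 'non-edges') can be sketched without any ml libraries; the labels and scores below are toy values, not the study's data.

```python
def roc_auc(y_true, scores):
    """Mann-Whitney form of ROC-AUC: probability that a randomly chosen
    true edge is scored higher than a randomly chosen non-edge (ties half)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """AP: mean of the precision values observed at each true edge
    when candidates are ranked by decreasing score."""
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    hits, ap = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / k
    return ap / hits

# toy held-out set: 1 = removed (true) edge, 0 = sampled non-edge
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(roc_auc(y_true, scores), 3),
      round(average_precision(y_true, scores), 3))  # 0.889 0.917
```

repeating this over many runs and averaging gives mean ± sd scores of the kind reported in table 1.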
the model performance is indeed better for discovering those edges between cov-host and drug nodes (roc-auc: 93.56 ± 0.01, ap: 90.88 ± 0.02 over 100 runs). the fastgae model is learned with the feature matrix (f) and adjacency matrix (a). the node feature matrix (f) is obtained from a using the node2vec neighborhood sampling strategy. the model performance is evaluated with and without using f as feature matrix. figure 2 shows the average performance of the model on validation sets with and without f as input for the different numbers of sampling nodes. we calculate average auc and ap scores for 50 complete runs of the model. from figure 2, it is evident that including f as feature matrix enhances the model's performance markedly. we use the node2vec framework to learn low dimensional embeddings of each node in the compiled network. it uses the skipgram algorithm of the word2vec model to learn the embeddings, which eventually groups nodes with a similar 'role' or a similar 'connection pattern' within the graph. a similar 'role' ensures that nodes within the sets/groups are structurally more similar to each other than to the nodes outside the groups. two nodes are said to be structurally equivalent if they have identical connection patterns to the rest of the network 20. to explore this, we have analyzed the embedding results in two steps. first, we explore structurally equivalent nodes to identify 'roles' and similar connection patterns to the rest of the networks, and later use louvain clustering to examine the same within the groups/clusters. the most_similar function of node2vec inspects the structurally equivalent nodes within the network. we find out all the cov-host nodes which are most similar to the drug nodes. while it is expected to observe nodes of the same type within the neighborhood of a particular node, in some cases we found some drugs to be neighbors of cov-host proteins with high probability (p_obs > 0.65). sars-cov-2 3cl protease 21 .
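a most_similar lookup of the kind used above reduces to ranking nodes by cosine similarity over the learned embedding vectors. the sketch below uses toy 2-d embeddings; the node names ppp1cb and eef1a are taken from the text, but the vector values (and the drug name) are invented for illustration.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def most_similar(query, embeddings, topn=3):
    """Rank all other nodes by cosine similarity to `query`,
    mimicking the most_similar lookup over node2vec embeddings."""
    q = embeddings[query]
    scored = [(name, cosine(q, vec))
              for name, vec in embeddings.items() if name != query]
    return sorted(scored, key=lambda t: -t[1])[:topn]

# toy 2-d embeddings; vector values are hypothetical
emb = {
    "drug_A": [1.0, 0.1],   # hypothetical drug node
    "PPP1CB": [0.9, 0.2],   # cov-host protein named in the text
    "EEF1A":  [0.1, 1.0],   # cov-host protein named in the text
}
print(most_similar("drug_A", emb, topn=2)[0][0])  # PPP1CB
```

a drug appearing among a protein's nearest neighbors in this embedding space is exactly the "drug is a probable neighbor of a cov-host protein" observation made in the text.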
some other drugs such as 'clenbuterol' and 'fenbendazole', the probable neighbors of ppp1cb and eef1a respectively, are used as bronchodilators in asthma. to explore the closely connected groups, we have constructed a neighborhood graph using the k-nearest-neighbor algorithm on the node embeddings and applied louvain clustering (figure 3, panel-c). although there is a clear separation between the host protein (including cov-host) cluster and the drug cluster, some of the louvain clusters contain both types of nodes. for example, louvain clusters 16 and 17 contain four and two drugs, respectively, along with other cov-host proteins. figure 3, panel-d represents a network consisting of these six drugs and their most similar cov-host nodes. for drug-cov-host interaction prediction, we exploit the variational graph autoencoder (vgae), an unsupervised graph neural network model, first introduced in 18 to leverage the concept of the variational autoencoder on graph-structured data. to make learning faster, we utilized the fastgae model to take advantage of its fast decoding phase. we have used two data matrices in the fastgae model for learning: one is the adjacency matrix, which represents the interaction information over all the nodes, and the other is the feature matrix, representing the low-dimensional embeddings of all the nodes in the network. we create a test set of 'non-edges' by removing all existing links between drugs and cov-host proteins from all possible combinations (332 cov-host × 1302 drugs) of edges. the model is trained on the whole network with the adjacency matrix a and feature matrix f. the trained model is then applied to the test 'non-edges' to identify the most probable links. we identified a total of 692 most probable links involving 92 drugs and 78 cov-host proteins at a probability threshold of 0.8. the predicted cov-host proteins are involved in different crucial pathways of viral infection (table 4). 
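the candidate-scoring step above can be sketched as follows: every drug/cov-host pair that is not a known edge is scored with the sigmoid of the inner product of the two latent vectors, and pairs above the 0.8 threshold are kept. the latent vectors, node names, and the function name below are toy stand-ins for the trained encoder output, not the paper's actual data.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_links(drugs, cov_hosts, known_edges, z, threshold=0.8):
    """Score every drug/cov-host 'non-edge' with sigmoid(z_d . z_h)
    and keep the pairs whose probability exceeds the threshold."""
    predictions = []
    for d in drugs:
        for h in cov_hosts:
            if (d, h) in known_edges:
                continue                      # only 'non-edges' are candidates
            score = sigmoid(sum(a * b for a, b in zip(z[d], z[h])))
            if score >= threshold:
                predictions.append((d, h, round(score, 3)))
    return predictions

# toy latent vectors standing in for the trained encoder output
z = {"drugA": [2.0, 1.0], "drugB": [-1.0, 0.5],
     "P1": [1.0, 1.0], "P2": [1.5, 0.5]}
links = predict_links(["drugA", "drugB"], ["P1", "P2"], {("drugA", "P1")}, z)
print(links)  # only drugA-P2 clears the 0.8 threshold
```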
the p-values for pathway and go enrichment are calculated using the hypergeometric test with 0.05 fdr correction. figure 4, panel-a shows the heatmap of probability scores between the predicted drugs and cov-host proteins. [figure 4 caption: drug-cov-host predicted interactions. panel-a shows the heatmap of probability scores between 92 drugs and 78 cov-host proteins; the four predicted bipartite modules are annotated as b1, b2, b3 and b4 within the heatmap; the drugs are colored by their clinical phase (launched-red, preclinical-blue, phase 2/phase 3-green, phase 1/phase 2-black). panels b, c, d and e represent the networks corresponding to the b1, b2, b3 and b4 modules; the drugs are annotated using the disease areas found in the cmap database 22 .] [figure 5 caption: predicted interactions at probability threshold 0.9. panel-a shows the interaction graph between drugs and cov-host proteins; drugs are annotated with their usage. panels b, c, d and e represent quasi-bicliques for one, two, three and more than three drug molecules, respectively.] to get more details of the predicted bipartite graph, we use a weighted bipartite clustering algorithm proposed by j. beckett 23 . this results in 4 bipartite modules (figure 4, panel-a): b1 (11 drugs, 28 cov-host), b2 (4 drugs, 41 cov-host), b3 (71 drugs, 4 cov-host), and b4 (6 drugs, 5 cov-host). the other panels of the figure show the network diagrams of the four bipartite modules. b1 contains 11 drugs, including some antibiotics (anisomycin, midecamycin) and anti-cancer drugs (doxorubicin, camptothecin). b3 also has some antibiotics such as puromycin, demeclocycline, dirithromycin, geldanamycin, and chlortetracycline; among them, the first three are widely used for bronchitis, pneumonia, and respiratory tract infections 24 . some other drugs included in the b3 module, such as lobeline and ambroxol, have a variety of therapeutic uses, including respiratory disorders and bronchitis. 
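as a simplified, hypothetical stand-in for the weighted bipartite clustering used above (not beckett's algorithm itself), one can threshold the drug × cov-host probability matrix and take connected components of the resulting bipartite graph as modules:

```python
def bipartite_modules(prob, threshold=0.5):
    """Simplified stand-in for weighted bipartite clustering: keep pairs with
    probability >= threshold and return connected components as
    (drugs, hosts) modules. prob maps (drug, host) -> predicted probability."""
    adj = {}
    for (d, h), p in prob.items():
        if p >= threshold:
            adj.setdefault(("D", d), set()).add(("H", h))
            adj.setdefault(("H", h), set()).add(("D", d))
    seen, modules = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:                      # depth-first traversal
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        drugs = sorted(n for t, n in comp if t == "D")
        hosts = sorted(n for t, n in comp if t == "H")
        modules.append((drugs, hosts))
    return modules

prob = {("d1", "p1"): 0.9, ("d2", "p1"): 0.85,
        ("d3", "p2"): 0.95, ("d3", "p3"): 0.4}
print(bipartite_modules(prob))  # two modules; the weak d3-p3 pair is dropped
```

beckett's algorithm instead maximizes weighted bipartite modularity, so the real modules can differ from simple components.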
the high-confidence predicted interactions (with threshold 0.9) are shown in figure 5, panel-a. to highlight some repurposable drug combinations and their predicted cov-host targets, we perform a weighted clustering (clusterone) 25 on this network and find some quasi-bicliques (shown in panels b-e). we matched our predicted drugs with the drug list recently published by zhou et al. 13 and found six common drugs: mesalazine, vinblastine, menadione, medrysone, fulvestrant, and apigenin. among them, apigenin has a known antiviral activity together with quercetin, rutin, and other flavonoids 26 . mesalazine has also proven to be extremely effective in the treatment of other viral diseases such as influenza a/h5n1 virus infection 27 . baclofen, a benzodiazepine receptor (gabaa-receptor) agonist, has a potential role in antiviral-associated treatment 28 . the anti-inflammatory agent fisetin has also been tested for antiviral activity, such as the inhibition of dengue virus (denv) infection 29 ; it down-regulates the production of proinflammatory cytokines induced by a denv infection. both of these drugs are listed in the high-confidence interaction set with the three cov-host proteins tapt1 (interacting with sars-cov-2 protein orf9c), slc30a6 (interacting with sars-cov-2 protein orf9c), and trim59 (interacting with sars-cov-2 protein orf3a) (figure 5, panel-c). topoisomerase inhibitors play an active role as antiviral agents by inhibiting viral dna replication 30, 31 . some topoisomerase inhibitors such as camptothecin, daunorubicin, doxorubicin, irinotecan and mitoxantrone are predicted to interact with several cov-host proteins. it has been demonstrated that the anticancer drug camptothecin (cpt) and its derivative irinotecan have a potential role in antiviral activity 32, 33 . camptothecin inhibits the host cell enzyme topoisomerase-i, which is required for the initiation as well as the completion of viral functions in the host cell 34 . 
daunorubicin (dnr) has also been demonstrated to inhibit hiv-1 virus replication in human host cells 35 . the conventional anticancer antibiotic doxorubicin was identified as a selective inhibitor of in vitro dengue and yellow fever virus replication 36 . it has also been reported that doxorubicin coupled with a monoclonal antibody can create an immunoconjugate that can eliminate hiv-1 infection in mice 37 . mitoxantrone shows antiviral activity against the human herpes simplex virus (hsv1) by reducing the transcription of viral genes, which are essential for dna synthesis, in many human cells 38 . histone deacetylase inhibitors (hdaci) are generally used as latency-reversing agents for purging hiv-1 from latent reservoirs such as cd4 memory cells 39 . our predicted drug list (table 3) contains two hdaci: scriptaid and vorinostat. vorinostat can be used to achieve latency reversal in the hiv-1 virus safely and repeatedly 40 . asymptomatic patients infected with sars-cov-2 are of significant concern, as they are more likely to infect a large number of people than symptomatic patients. moreover, in most cases (99th percentile), patients develop symptoms after an average of 5-14 days, which is longer than the incubation period of sars, mers, or other viruses 41 . to this end, hdaci may serve as good candidates for recognizing and clearing the cells in which sars-cov-2 latency has been reversed. heat shock protein 90 (hsp90) is described as a crucial host factor in the life cycle of several viruses, involved in cell entry, nuclear import, transcription, and replication 42, 43 . hsp90 has also been shown to be an essential factor for the sars-cov-2 envelope (e) protein 44 . in 45 , hsp90 is described as a promising target for antiviral drugs. the list of predicted drugs contains three hsp inhibitors: tanespimycin, geldanamycin, and its derivative alvespimycin. 
the first two have a substantial effect in inhibiting the replication of herpes simplex virus and human enterovirus 71 (ev71), respectively. recently, in 46 , geldanamycin and its derivatives were proposed as effective drugs for the treatment of covid-19. inhibiting dna synthesis during viral replication is one of the critical steps in disrupting the viral infection. the list of predicted drugs contains seven such small molecules/drugs, viz., niclosamide, azacitidine, anisomycin, novobiocin, primaquine, menadione, and metronidazole. the dna synthesis inhibitor niclosamide has great potential to treat a variety of viral infections, including sars-cov, mers-cov, and hcv 47 , and has recently been described as a potential candidate to fight the sars-cov-2 virus 47 . novobiocin, an aminocoumarin antibiotic, is also used in the treatment of zika virus (zikv) infections due to its protease inhibitory activity. in 2005, chloroquine (cq) was demonstrated to be an effective drug against the spread of the severe acute respiratory syndrome coronavirus (sars-cov). recently, hydroxychloroquine (hcq) sulfate, a derivative of cq, has been shown to efficiently inhibit sars-cov-2 infection in vitro 48 . therefore, another anti-malarial aminoquinoline drug, primaquine, may also contribute to the attenuation of the inflammatory response of covid-19 patients. primaquine has also been established as effective in the treatment of pneumocystis pneumonia (pcp) 49 . cardiac glycosides have been shown to play a crucial role as antiviral drugs. these drugs target host cell proteins, which helps reduce the resistance to antiviral treatments. the antiviral effects of cardiac glycosides have been attributed to inhibition of the pump function of na,k-atpase, making them essential drugs against human viral infections. the predicted list of drugs contains three cardiac glycoside atpase inhibitors: digoxin, digitoxigenin, and ouabain. 
these drugs have been reported to be effective against different viruses such as herpes simplex, influenza, chikungunya, coronavirus, and respiratory syncytial virus 50 . mg132, a proteasomal inhibitor, is established to be a strong inhibitor of sars-cov replication in the early steps of the viral life cycle 51 . mg132 inhibits the cysteine protease m-calpain, which results in a pronounced inhibition of sars-cov-2 replication in the host cell. in 52 , resveratrol has been demonstrated to be a significant inhibitor of mers-cov infection. resveratrol treatment decreases the expression of the nucleocapsid (n) protein of mers-cov, which is essential for viral replication. as mg132 and resveratrol play a vital role in inhibiting the replication of the other coronaviruses sars-cov and mers-cov, they may be potential candidates for the prevention and treatment of sars-cov-2. another drug, captopril, is known as an angiotensin ii receptor blocker (arb), which directly inhibits the production of angiotensin ii. in 53 , angiotensin-converting enzyme 2 (ace2) was demonstrated to be the binding site for sars-cov-2. therefore, angiotensin ii receptor blockers (arbs) may be good candidates for tentative treatment of sars-cov-2 infections 54 . in summary, our proposed method predicts several drug targets and multiple repurposable drugs that have prominent literature evidence of use as antiviral drugs, especially for the two other coronavirus species sars-cov and mers-cov. some drugs are also directly associated with the treatment of sars-cov-2, as identified by recent literature. however, further clinical trials and several preclinical experiments are required to validate the clinical benefits of these potential drugs and drug targets. in this work, we have successfully generated a list of high-confidence candidate drugs that can be repurposed to counteract sars-cov-2 infections. 
the two major novelties are the integration of the most recently published sars-cov-2 protein interaction data on the one hand, and the use of the most recent, most advanced ai (deep learning) based high-performance prediction machinery on the other hand. in experiments, we have validated that our prediction pipeline operates at utmost accuracy, confirming the quality of the predictions we have raised. the recent publication (april 30, 2020) of two novel sars-cov-2-human protein interaction resources 15, 16 has unlocked enormous possibilities in studying the virulence and pathogenicity of sars-cov-2 and the driving mechanisms behind it. only now have various experimental and computational approaches in the design of drugs against covid-19 become conceivable, and only now can such approaches be exploited truly systematically, at both sufficiently high throughput and accuracy. here, to the best of our knowledge, we have done this for the first time. we have integrated the new sars-cov-2 protein interaction data with well-established, long-term curated human protein and drug interaction data. these data capture hundreds of thousands of approved interfaces between encompassing sets of molecules, reflecting either drugs or human proteins. as a result, we have obtained a comprehensive drug-human-virus interaction network that reflects the latest state of the art in terms of our knowledge about how sars-cov-2 interacts with human proteins and repurposable drugs. for exploiting the new network-already a new resource in its own right-we have opted for the most recent and advanced deep learning based technology. a generic reason for this choice is the surge in advances, and the resulting boost in operative prediction performance, of related methods over the last 3-4 years. 
a particular reason is to make use of the most advanced graph neural network based techniques, namely variational graph autoencoders, as a deep generative model of utmost accuracy, the practical implementation of which 19 was presented only a few months ago (just like the relevant network data). note that only this recent implementation makes it possible to process networks of sizes in the range of common molecular interaction data. in essence, graph neural networks "learn" the structure of links in networks and infer rules that underlie the interplay of links. based on the knowledge gained, they can predict links, outputting the corresponding links together with probabilities for them to indeed be missing. simulation experiments, reflecting scenarios where links known to exist in our network were re-established by prediction upon their removal, showed that our pipeline does indeed predict missing links at utmost accuracy. encouraged by these simulations, we proceeded to perform the core experiments and predicted links to be missing without prior removal of links in our encompassing network. these core experiments revealed 692 high-confidence interactions relating to 92 drugs. in our experiments, we focused on predicting links between drugs and human proteins that in turn are known to interact with sars-cov-2 proteins (sars-cov-2 associated host proteins). we have decidedly not put the focus on drug-sars-cov-2-protein interactions, which would have reflected more direct therapy strategies against the virus. instead, we have focused on predicting drugs that serve the purposes of host-directed therapy (hdt), because hdt strategies have proven to be more sustainable with respect to mutations by which the virus escapes a response to the therapy applied. 
note that hdt strategies particularly cater to drug repurposing attempts, because repurposed drugs have already been proven to lack severe side effects, as they are either already in use or have successfully passed the preclinical trial stages. we further systematically categorized the 92 repurposable drugs into 70 categories based on their domains of application and molecular mechanisms. accordingly, we identified and highlighted several drugs that target host proteins that the virus needs to enter (and subsequently hijack) human cells. one such example is captopril, which directly inhibits the production of angiotensin-converting enzyme-2 (ace-2), in turn already known to be a crucial host factor for sars-cov-2. further, we identified primaquine, an antimalaria drug used to prevent malaria and also pneumocystis pneumonia (pcp) relapses, because it interacts with the tim complex timm29 and alg11. moreover, we have highlighted drugs that act as dna replication inhibitors (niclosamide, anisomycin), glucocorticoid receptor agonists (medrysone), atpase inhibitors (digitoxigenin, digoxin), topoisomerase inhibitors (camptothecin, irinotecan), and proteasomal inhibitors (mg-132). note that some drugs are known to have rather severe side effects in their original use (doxorubicin, vinblastine), but the disrupting effects of their short-term usage in severe covid-19 infections may provide sufficient compensation. in summary, we have compiled a list of drugs which, when repurposed, are of great potential in the fight against the covid-19 pandemic, where therapy options are urgently needed. our list of predicted drugs suggests both options that had been identified and thoroughly discussed before and new opportunities that had not been pointed out earlier. the latter class of drugs may offer valuable chances for pursuing new therapy strategies against covid-19. 
we have utilized three categories of interaction datasets: human protein-protein interactome data, sars-cov-2-host protein interaction data, and drug-host interaction data. we have taken the sars-cov-2-host interaction information from two recent studies by gordon et al. and dick et al. 15, 16 . in 15 , 332 high-confidence interactions between sars-cov-2 and human proteins are identified using affinity-purification mass spectrometry (ap-ms). in 16 , 261 high-confidence interactions are identified using sequence-based ppi predictors (pipe4 & sprint). the drug-target interaction information has been collected from five databases, viz., the drugbank database (v4.3) 57 , the chembl 58 database, the therapeutic target database (ttd) 59 , the pharmgkb database, and the iuphar/bps guide to pharmacology 60 . the total numbers of drugs and drug-host interactions used in this study are 1309 and 1788407, respectively. we have built a comprehensive list of human ppis from two datasets: (1) the ccsb human interactome database, consisting of 7,000 genes and 13,944 high-quality binary interactions 61-63 , and (2) the human protein reference database 56 , which consists of 8,920 proteins and 53,184 ppis. a summary of all the datasets is provided in table 2 . the cmap database 22 is used to annotate the drugs with their usage in different disease areas. we have utilized node2vec 17 , an algorithmic framework for learning continuous feature representations for nodes in networks. it maps the nodes to a low-dimensional feature space that maximizes the likelihood of preserving network neighborhoods. the principle of the feature learning framework in a graph can be described as follows: let g = (v, e) be a given graph, where v represents the set of nodes and e represents the set of edges. the feature representation of the nodes is given by a mapping function f : v → r^d, where d specifies the feature dimension. f may also be represented as a node feature matrix of dimension |v| × d. 
for each node v ∈ v, nn_s(v) ⊂ v defines a network neighborhood of node v, generated using a neighbourhood sampling strategy s. the sampling strategy can be described as an interpolation between the breadth-first search and depth-first search techniques 17 . the objective function can be described as: max_f Σ_{v∈v} log pr(nn_s(v) | f(v)). this maximizes the likelihood of observing a network neighborhood nn_s(v) for a node v given its feature representation f. assuming conditional independence, the probability of observing a neighborhood node n_i ∈ nn_s(v) given the feature representation of the source node v factorizes as pr(nn_s(v) | f(v)) = Π_{n_i∈nn_s(v)} pr(n_i | f(v)), where n_i is the i-th neighbor of node v in the neighborhood set nn_s(v). the conditional likelihood of each source (v) and neighborhood node (n_i ∈ nn_s(v)) pair is represented as the softmax of the dot product of their features f(v) and f(n_i): pr(n_i | f(v)) = exp(f(n_i) · f(v)) / Σ_{u∈v} exp(f(u) · f(v)). the variational graph autoencoder (vgae) is a framework for unsupervised learning on graph-structured data 64 . this model uses latent variables and is effective in learning interpretable latent representations for undirected graphs. the graph autoencoder consists of two stacked models: 1) an encoder and 2) a decoder. first, an encoder based on graph convolution networks (gcn) 18 maps the nodes into a low-dimensional embedding space. subsequently, a decoder attempts to reconstruct the original graph structure from the encoder representations. both models are jointly trained to optimize the quality of the reconstruction from the embedding space, in an unsupervised way. the functions of these two models can be described as follows. encoder: it applies a graph convolution network (gcn) to the adjacency matrix a and the feature representation matrix f. the encoder generates a d-dimensional latent variable z_i for each node i ∈ v, with |v| = n, corresponding to each embedded node, with d ≤ n. 
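stepping back to the node2vec objective above, the softmax conditional likelihood pr(n_i | f(v)) can be sketched numerically. the embedding values below are toy data of our own choosing; the function name is hypothetical.

```python
import math

def neighbor_probability(f, v, n_i):
    """Pr(n_i | f(v)) = exp(f(n_i).f(v)) / sum over u in V of exp(f(u).f(v))."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    denom = sum(math.exp(dot(f[u], f[v])) for u in f)
    return math.exp(dot(f[n_i], f[v])) / denom

# toy 2-dimensional embeddings: 'a' aligns with the source 'v',
# 'b' is orthogonal, 'c' points the opposite way
f = {"v": [1.0, 0.0], "a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
probs = {u: neighbor_probability(f, "v", u) for u in f}
print(probs)  # probabilities decrease from 'a' to 'b' to 'c'
```

nodes whose embeddings align with the source receive the highest conditional likelihood, which is exactly what the skipgram-style training exploits.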
the inference model of the encoder is given below: q(z_i | a, f) = n(z_i | µ_i, σ_i²), i.e., each latent variable follows a normal distribution whose gaussian mean µ_i and variance σ_i² parameters are produced by the gcn. the actual embedding vectors z_i are samples drawn from these distributions. decoder: a generative model that decodes the latent variables z_i to reconstruct the matrix a using inner products with a sigmoid activation on the embedding vectors: â_ij = sigmoid(z_i^t · z_j), where â is the decoded adjacency matrix. the objective function of the variational graph autoencoder (vgae) can be written as: l_vgae = e_q(z|a,f)[log p(a | z)] − d_kl(q(z | a, f) || p(z)). the objective function l_vgae maximizes the likelihood of decoding the adjacency matrix w.r.t. the graph autoencoder weights using stochastic gradient descent. here, d_kl(·||·) represents the kullback-leibler divergence 65 and p(z) is the prior distribution of the latent variables. drug-sars-cov-2 link prediction. 1. adjacency matrix preparation: in this work, we consider an undirected graph g = (v, e) with |v| = n nodes and |e| = m edges. we denote by a the binary adjacency matrix of g. here v consists of sars-cov-2 proteins, cov-host proteins, drug-target proteins and drugs. the matrix a covers a total of n = 16444 nodes, given as n = n_nc + n_nt + n_dt + n_d, where n_nc is the number of sars-cov-2 proteins, n_dt is the number of drug targets, and n_nt and n_d represent the numbers of cov-host and drug nodes, respectively. the total number of edges is given by m = e_1 + e_2 + e_3, where e_1 represents the interactions between sars-cov-2 and human host proteins, e_2 is the number of interactions among human proteins, and e_3 represents the number of interactions between drugs and human host proteins. 2. feature matrix preparation: the neighborhood sampling strategy is used here to prepare a feature representation of all nodes. a flexible biased random walk procedure is employed to explore the neighborhood of each node. 
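the encoder-decoder mechanics described above (gaussian latent variables, inner-product decoder, kl regularizer against a standard-normal prior) can be sketched without any deep learning library. all values below are toy stand-ins for the gcn outputs µ_i and σ_i.

```python
import math, random

def sample_latent(mu, sigma, rng):
    """Reparameterized draw z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

def decode(z_i, z_j):
    """Inner-product decoder: A_hat[i][j] = sigmoid(z_i . z_j)."""
    dot = sum(a * b for a, b in zip(z_i, z_j))
    return 1.0 / (1.0 + math.exp(-dot))

def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dimensions."""
    return sum(0.5 * (s * s + m * m - 1.0 - math.log(s * s))
               for m, s in zip(mu, sigma))

rng = random.Random(0)
z1 = sample_latent([0.0, 0.0], [1.0, 1.0], rng)
z2 = sample_latent([0.5, -0.5], [0.5, 0.5], rng)
print(decode(z1, z2))                                 # edge probability in (0, 1)
print(kl_to_standard_normal([0.0, 0.0], [1.0, 1.0]))  # 0.0 for the prior itself
```

during training, the reconstruction term rewards decode values near 1 for observed edges while the kl term keeps the posteriors close to the prior.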
a random walk in a graph g can be described by the probability p(a_i = x | a_{i-1} = v) = π(v, x)/z if (v, x) ∈ e, and 0 otherwise, where π(v, x) is the unnormalized transition probability between nodes v and x, z is a normalizing constant, and a_i is the i-th node in a walk of length l. the transition probability is given by π(v, x) = c_pq(t, x) · w_vx, where t is the previous node of v in the walk, w_vx is the static edge weight, and p and q are the two parameters that guide the walk. the coefficient c_pq(t, x) is given by c_pq(t, x) = 1/p if distance(t, x) = 0, 1 if distance(t, x) = 1, and 1/q if distance(t, x) = 2, where distance(t, x) represents the shortest path distance between node t and node x. the generation of the feature matrix f_{n×d} is governed by the node2vec algorithm. it starts from every node and simulates r random walks of fixed length l. at every step of a walk, the transition probability π(v, x) governs the sampling. the walk generated in each iteration is added to a walk list. finally, stochastic gradient descent is applied to optimize over the list of walks, and the result is returned. 3. link prediction: the scalable and fast variational graph autoencoder (fastgae) 19 is utilized in our proposed work to reduce the computational time of vgae on large networks. the adjacency matrix a and the feature matrix f are fed into the encoder of fastgae. the encoder uses a graph convolution neural network (gcn) on the entire graph to create the latent representation (z); it operates on the full adjacency matrix a. after encoding, sampling is performed and the decoder works on the sampled subgraph. the mechanism of the fastgae decoder is slightly different from that of the traditional vgae: it regenerates the adjacency matrix a based on a subsample of graph nodes, v_s, using a graph node sampling technique to randomly sample the reconstructed nodes at each iteration. each node i is assigned a probability p_i, and nodes with high p_i scores are selected. the probability p_i is given by p_i = f(i)^α / Σ_{j∈v} f(j)^α, where f(i) is the degree of node i and α is the sharpening parameter. we take α = 2 in our study. 
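the two biased-sampling ingredients above, the node2vec search bias c_pq and the fastgae degree-based node-selection probability p_i, can be sketched as two small functions (the function names are ours, for illustration):

```python
def c_pq(distance, p, q):
    """node2vec search bias: distance(t, x) in {0, 1, 2} between the previous
    node t and candidate x selects 1/p (return), 1 (BFS-like move),
    or 1/q (DFS-like outward move)."""
    if distance == 0:
        return 1.0 / p
    if distance == 1:
        return 1.0
    return 1.0 / q

def fastgae_sampling_probs(degrees, alpha=2.0):
    """FastGAE node-selection probabilities p_i = deg(i)^alpha / sum_j deg(j)^alpha."""
    weights = [d ** alpha for d in degrees]
    total = sum(weights)
    return [w / total for w in weights]

print(c_pq(0, 4.0, 0.5))   # 0.25: a large p discourages returning to t
print(c_pq(2, 4.0, 0.5))   # 2.0: a small q encourages outward, DFS-like moves
print(fastgae_sampling_probs([1, 2, 3]))  # high-degree nodes dominate for alpha=2
```

with alpha = 2, as used in the study, high-degree hubs are sampled far more often than peripheral nodes, which is what keeps the decoded subgraph informative.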
the node selection process is repeated until |v_s| = n_s, where n_s is the number of sampled nodes. the decoder reconstructs the smaller matrix a_s of dimension n_s × n_s instead of decoding the main adjacency matrix a. the decoder function follows the equation a_s(i, j) = sigmoid(z_i^t · z_j), ∀(i, j) ∈ v_s × v_s. at each training iteration, a different subgraph (g_s) is drawn using the sampling method. after the model is trained, the drug-cov-host links are predicted using the same decoder equation, where â_ij represents the possible link between each combination of cov-host and drug nodes; for each combination of nodes, the model gives a probability based on the logistic sigmoid function. 

references (titles as recovered from the extracted text):
- a new coronavirus associated with human respiratory disease in china
- host-pathogen systems biology
- host-directed therapies for bacterial and viral infections
- network-based drug repositioning: approaches, resources, and research directions
- new horizons for antiviral drug discovery from virus-host protein interaction networks
- drug target prediction and repositioning using an integrated network-based approach
- mapping protein interactions between dengue virus and its human and insect hosts
- a review of in silico approaches for analysis and prediction of hiv-1-human protein-protein interactions
- network-based study reveals potential infection pathways of hepatitis-c leading to various diseases
- prediction of the ebola virus infection related human genes using protein-protein interaction network
- a genome-wide positioning systems network algorithm for in silico drug repurposing
- deepdr: a network-based deep learning approach to in silico drug repositioning
- network-based drug repurposing for novel coronavirus 2019-ncov/sars-cov-2
- network bioinformatics analysis provides insight into drug repurposing for covid-2019
- a sars-cov-2 protein interaction map reveals targets for drug repurposing
- comprehensive prediction of the sars-cov-2 vs. human interactome using pipe4, sprint, and pipe-sites
- scalable feature learning for networks
- fastgae: fast, scalable and effective graph autoencoders with stochastic subgraph decoding
- from community to role-based graph embeddings
- specific plant terpenoids and lignoids possess potent antiviral activities against severe acute respiratory syndrome coronavirus
- a next generation connectivity map: l1000 platform and the first 1,000,000 profiles
- improved community detection in weighted bipartite networks
- drugbank: a comprehensive resource for in silico drug discovery and exploration
- detecting overlapping protein complexes in protein-protein interaction networks
- the therapeutic potential of apigenin
- delayed antiviral plus immunomodulator treatment still reduces mortality in mice infected by high inoculum of influenza a/h5n1 virus
- baclofen promotes alcohol abstinence in alcohol dependent cirrhotic patients with hepatitis c virus (hcv) infection
- antiviral and immunomodulatory effects of polyphenols on macrophages infected with dengue virus serotypes 2 and 3 enhanced or not with antibodies
- evaluation of topoisomerase inhibitors as potential antiviral agents
- potent antiviral activity of topoisomerase i and ii inhibitors against kaposi's sarcoma-associated herpesvirus
- antiviral action of camptothecin
- an analog of camptothecin inactive against topoisomerase i is broadly neutralizing of hiv-1 through inhibition of vif-dependent apobec3g degradation
- water-insoluble camptothecin analogues as potential antiviral drugs
- inhibition of hiv-1 replication by daunorubicin
- a derivate of the antibiotic doxorubicin is a selective inhibitor of dengue and yellow fever virus replication in vitro
- elimination of hiv-1 infection by treatment with a doxorubicin-conjugated anti-envelope antibody
- antiviral activity of mitoxantrone dihydrochloride against human herpes simplex virus mediated by suppression of the viral immediate early genes
- histone deacetylase inhibitors for purging hiv-1 from the latent reservoir
- interval dosing with the hdac inhibitor vorinostat effectively reverses hiv latency
- the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
- synthesis and in vitro anti-hsv-1 activity of a novel hsp90 inhibitor bj-b11
- heat shock protein 90 facilitates formation of the hbv capsid via interacting with the hbv core protein dimers
- severe acute respiratory syndrome coronavirus envelope protein regulates cell stress response and apoptosis
- hsp90: a promising broad-spectrum antiviral drug target
- drug repositioning suggests a role for the heat shock protein 90 inhibitor geldanamycin in treating covid-19 infection
- broad spectrum antiviral agent niclosamide and its therapeutic potential
- hydroxychloroquine, a less toxic derivative of chloroquine, is effective in inhibiting sars-cov-2 infection in vitro
- pharmacokinetic optimisation in the treatment of pneumocystis carinii pneumonia
- the antiviral effects of na, k-atpase inhibition: a minireview
- severe acute respiratory syndrome coronavirus replication is severely impaired by mg132 due to proteasome-independent inhibition of m-calpain
- effective inhibition of mers-cov infection by resveratrol
- structural basis of receptor recognition by sars-cov-2
- angiotensin receptor blockers as tentative sars-cov-2 therapeutics
- next-generation sequencing to generate interactome datasets
- development of human protein reference database as an initial platform for approaching systems biology in humans
- drugbank 4.0: shedding new light on drug metabolism
- chembl: a large-scale bioactivity database for drug discovery
- therapeutic target database update 2016: enriched resource for bench to clinical drug target and targeted pathway information
- the iuphar/bps guide to pharmacology: an expert-driven knowledgebase of drug targets and their ligands
- towards a proteome-scale map of the human protein-protein interaction network
- a proteome-scale map of the human interactome network
- a reference map of the human binary protein interactome
- stochastic backpropagation and approximate inference in deep generative models
- on information and sufficiency. the annals of mathematical statistics

key: cord-034545-onj7zpi1 authors: abuelkhail, abdulrahman; baroudi, uthman; raad, muhammad; sheltami, tarek title: internet of things for healthcare monitoring applications based on rfid clustering scheme date: 2020-11-03 journal: wireless netw doi: 10.1007/s11276-020-02482-1 sha: doc_id: 34545 cord_uid: onj7zpi1

covid-19 surprised the whole world by its quick and sudden spread. coronavirus has pushed all community sectors (government, industry, academia, and nonprofit organizations) to take steps to stop and control this pandemic. it is evident that it-based solutions are urgent. this study is a small step in this direction, where health information is monitored and collected continuously. in this work, we build a network of smart nodes where each node comprises a radio-frequency identification (rfid) tag, a reduced-function rfid reader (rfrr), and sensors. the smart nodes are grouped into clusters, which are constructed periodically. the rfrr reader of each clusterhead collects data from its members, and once it is close to the primary reader, it conveys its data, and so on. this approach reduces the primary rfid reader's burden by receiving data from the clusterheads only, instead of reading every tag that passes by its vicinity. besides, this mechanism reduces channel access congestion, and thus reduces interference significantly. furthermore, to protect the exchanged data from potential attacks, two levels of security algorithms, including aes 128-bit with hashing, have been implemented. the proposed scheme has been validated via mathematical modeling using integer programming, simulation, and prototype experimentation. 
the proposed technique shows low data delivery losses and a significant drop in transmission delay compared to contemporary approaches. coronavirus will have a long-term impact on the whole world. the most significant impact will manifest itself in the penetration of it-based surveillance and tracking. wireless sensor networks (wsns) have become very efficient and viable for a wide variety of applications in many aspects of human life, such as tracking systems, medical treatment, environmental monitoring, intelligent transportation systems (its), public health, the smart grid, and many other areas [1] . radio frequency identification (rfid) is a wireless technology with a unique identifier that utilizes radio frequency for data transmission; the data is transferred from the device to the reader via radio frequency waves. the data is stored in tags; these tags can be passive, active, or battery-assisted passive (bap). the active and bap tags contain batteries that allow them to communicate over a broader range that can go up to 1 km for enterprise users and over 2 km in military applications. unlike battery-powered tags, passive tags use the reader's rf signal to generate power and transmit/receive data [2] . using wsns and rfid together is a promising solution that has become prevalent in recent years. thanks to its low cost and low power consumption, rfid is easy to install, deploy, and combine with sensors [3] . these features make rfid combined with sensors a viable and enabling technology for the iot. with a wide variety of increasingly cheap sensors and rfid technologies, it becomes possible to build a real-time healthcare monitoring system at low price and very high quality. the rfid system is considered the strategic enabling component for the healthcare system due to the energy autonomy of battery-less tags and their low cost. in addition, rfid tags can be attached to the monitored items to be recognized, hence enhancing the efficiency of monitoring and managing the objects [4] [5] [6] [7] .
having real-time data collection and management is very important, especially in health-related systems. for instance, the united nations international children's emergency fund (unicef) and the world health organization (who) reported in 2016 that more than 295 thousand women die every year from causes related to pregnancy and childbirth [8] ; this is due to the unavailability of timely medical treatment. moreover, the report stated that the main reason for cancer-related deaths is the detection of abnormal cellular growth only at the last stage. many lives can be saved by utilizing real-time iot smart nodes that continuously monitor the patient's health condition. hence, it empowers physicians to detect serious illnesses such as cancer at the primary stage. the motivations for the proposed framework are threefold: low cost, high performance, and real-time collection of data. an rfid reader cannot rapidly get data from tags because of its static nature and short transmission range. therefore, a high-power and costly rfid reader is required to extend the range for quick information gathering. however, this would increase the price of the framework, considering the high cost of an rfid reader with a high transmission range (not less than $500) and the increased expenditure of initiating the connection between the back-end servers and the rfid reader. the question is: can we limit the number of rfid readers while still accomplishing sufficient information accumulation? moreover, in customary rfid monitoring applications, such as tracking luggage in airlines, an rfid reader is required to rapidly handle many tags at various distances. an rfid reader can only read tags within its range. many limitations could negatively affect the performance of data collection, such as multi-path fading and limited bandwidth; these issues can be mitigated by transmitting information over short distances through the multi-hop information transmission mode of wsns.
besides, in every data collection system, the most critical challenge is to meet the real-time requirements. combining rfid tags with rfid readers and wsns helps significantly in solving this challenge [9] [10] [11] . in this paper, we develop a framework that integrates rfid with wireless sensor systems based on a clustering scheme to gather information efficiently. essentially, our framework utilizes a smart node proposed by shen et al. [3] . the smart node contains an rfid tag, a reduced-function rfid reader (rfrr), and a wireless sensor. the cluster's construction depends on multiple criteria for choosing the clusterhead among smart nodes in the same range. for instance, each node can read the tag id and battery level of all smart nodes in its range; the node with the highest battery level will be chosen as the clusterhead. the cluster consists of a clusterhead and cluster members; each member in the cluster transmits its tag information to the clusterhead. then, the rfid readers send the collected data to the back-end server for data management and processing. also, to protect the exchanged data from potential attacks, we have applied two levels of security algorithms. the proposed technique lends itself to a wide range of applications, for example, collecting data in smart cities, aiming to monitor people's health in large events and venues such as festivals, malls, airports, train stations, etc. the specific contributions of this paper are listed below: • we exploit the smart nodes to develop an efficient healthcare monitoring scheme based on a collaborative adaptive clustering approach. • the proposed clustering scheme relieves the reader of the burden of reading every node and allows it to read only the clusterheads within its range. this approach minimizes channel access congestion and helps in reducing other interference. it also reduces the transmission delay, thus collecting the information between nodes efficiently for a large-scale system.
• we formulate the clustering problem as a mathematical programming model to minimize the energy consumption and the interference in a large-scale mobile network. • to protect the data collected by the proposed approach from security threats that might occur during data communication among smart nodes and primary readers, we secure the exchanged data with two security levels. • we develop a small-scale prototype with which we explore the performance of the proposed approach. the prototype is composed of a set of wearable smart nodes, each of which consists of an rfid tag, a reduced-function rfid reader, and a body sensor. also, all exchanged data among the smart nodes have been encrypted. the rest of the paper is organized as follows. section 2 presents the related work on health care monitoring applications. in sect. 3, the proposed system is discussed, starting with the problem statement followed by the proposed clustering approach. in sect. 4, the cluster formation is modeled as an integer program. in sect. 5, we present and discuss the three methods used to evaluate our proposed approach. first, the optimal solution using integer programming is discussed. given the long running time required for integer programming, the proposed system is then simulated using matlab, where local information is employed to construct the clusters. thirdly, a small-scale prototype is built to test the approach. finally, we conclude this paper with our findings and suggestions for future directions. this section summarizes some of the previous work related to health care monitoring applications. many researchers have focused on solving this problem by using either rfid or wsn as the short-range radio interface. however, very few of these solutions are suitable for the problem addressed here (health care monitoring applications for a large-scale system), namely a crowded area with high mobility.
sun microsystems, in collaboration with the university of fribourg [12] , proposed a web-based application called rfid-locator to improve the quality of hospital services. rfid-locator tracks the patients and goods in the hospital to build a smart hospital. all patients in the hospital are given an rfid wristband resembling a watch with a passive rfid tag in it. all patient history and treatment records are stored in a centralized secure database. doctors have rfid-enabled personal digital assistant (pda) devices to read the patient data referenced by the patients' rfid wristbands. the results are promising, but much work is still needed on the security and encryption of the collected data. dsouza et al. [13] proposed a wireless localization network to track the location of patients in indoor environments as well as to monitor their status (e.g., walking, running). the authors deploy static nodes at different locations of the hospital that interact with a patient mobile unit to determine the patient's position in the building. each patient carries a small mobile node that is composed of a small-size fleck nano wireless sensor and a three-axis accelerometer sensor to monitor his/her physical status. however, using everybody's smartphone gps and wi-fi is not an energy-efficient solution because it requires enormous power. chandra-saharan et al. [14] proposed a location-aware wsn to track people in a disaster site using a ranging algorithm based on received signal strength indicator (rssi) environment and mobility adaptive (rema). like [15, 16] , the authors in [17] focused on the healthcare area and provided a survey of current studies on rfid sensing from the viewpoint of iot for individual healthcare; the survey also shows that rfid technology is now established as part of the iot. on the other hand, the paper reveals many challenging issues, such as the reliability of the sensors and the actual dependence on the reader's node.
there are even more advanced solutions: in [18] , the authors proposed the ihome approach, which consists of three key blocks: imedbox, imedpack, and the bio-patch. rfid tags are used to enable communication capabilities for the imedpack block, and flexible, wearable biomedical sensor devices are used to collect data (the bio-patch). the results are promising, but the study didn't consider monitoring purposes. another smart healthcare system is proposed in [19] to monitor and track patients, personnel, and biomedical devices automatically using different technologies: rfid, wsn, and smart mobile. to allow these different technologies to interoperate, a complex network communication stack relying on the coap, 6lowpan, and rest paradigms is used, and two use cases have been implemented. the results showed good performance, not only for operation within hospitals but also for providing power-effective remote patient monitoring. the results are promising, but their approach needs more wired and wireless sensor network infrastructure. gope and hwang [20] proposed a secure iot healthcare application using a body sensor network (bsn) to monitor the patient's health using a collection of tiny-powered and lightweight wireless sensor nodes. also, the system can efficiently protect a patient's privacy by utilizing a lightweight anonymous authentication protocol and the authenticated encryption scheme offset codebook (ocb). the lightweight anonymous authentication protocol can achieve mutual authentication, preserve anonymity, and reduce computation overhead between nodes. the ocb block-cipher encryption scheme is well-suited for secure and expeditious data communication as well as efficient energy consumption. the results are promising, but their approach needs infrastructure. furthermore, an intelligent framework for healthcare data security (ifhds) has been proposed to secure and process large-scale data using a column-based approach with less impact on data processing [21] .
the following table compares the proposed approach with the existing literature; it shows that there is no similar work to the proposed approach. the comparison covers nine features (f-1 to f-9) and the following techniques: [3] ; a hybrid rfid energy-efficient routing scheme for dynamic clustering networks; a smart real-time healthcare monitoring and tracking system based on a mobile clustering scheme; a data collection method based on mobile edge computing for wsn; energy-efficient large-scale tracking systems based on a mobile clustering scheme; and energy-efficient large-scale tracking systems based on two-level hierarchical clustering. feature 1 (f-1): the smart node is a wearable smart node that includes a reduced-function rfid reader (rfrr), a body sensor (bs), an rfid tag, and a microcontroller, wherein the rfid reader has a greater transmission range than the rfrr, and the rfrr reads other smart nodes' tags and stores this data into its own rfid tag. feature 2 (f-2): a plurality of smart nodes, which integrate radio-frequency identification (rfid) and a wireless sensor network (wsn). feature 3 (f-3): the clustering scheme in which each node reads the tag id of all nodes in its range and the clusterhead is the node with the highest cost function (e.g., battery level); the cluster consists of a clusterhead and cluster members. feature 4 (f-4): the data collection scheme in which an rfid reader receives all node-data packets from the ch, and the rfid reader sends the collected information to a back-end server for data processing and management. feature 5 (f-5): formulating a novel mathematical programming model which optimizes the clustering structures to guarantee the best performance of the network.
the mathematical model optimizes the following objective functions: (1) minimizing the total distance between chs and cms to improve positioning accuracy; and (2) minimizing the number of clusters, which reduces the signal transmission traffic. feature 6 (f-6): two-level security is obtained as follows: when a node writes data to its rfid tag, the data is signed with a signature, which is a hash value, and the obtained hash is encrypted with an aes 128-bit shared key. in this section, the proposed system is discussed, starting with the problem statement followed by the proposed solution. during healthcare monitoring of people, the main challenge is to ensure safety, efficient data collection, and privacy. people stay in a bounded area and move randomly within it. different technologies have been suggested to collect data from crowds; they can be categorized as passive and active sensing. passive sensing, such as computer vision, does not need any connection with the user. it can aid in movement detection, counting people, and density approximation [36, 37] . however, these approaches fail to deliver accurate identification of individuals, in addition to the need for a ready infrastructure, which is very costly. there are also some active systems, such as rfid tags, that can be attached to the individual to obtain the user's data. nevertheless, these systems require an expensive infrastructure of rfid readers at the points of data collection [38] . therefore, to deliver accurate identification of individuals, reduce the cost of the infrastructure, and attain efficient large-scale data collection for healthcare monitoring applications, we suggest employing a system of mobile smart nodes composed of rfid and wsn. the mobile smart nodes are clustered to minimize data traffic and ensure redundancy and delivery to the command center.
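the two-level tag-write scheme described in feature f-6 (hash the payload, then protect the hash with a 128-bit shared key) can be sketched as follows. all names and values here are illustrative assumptions, not the authors' code; in particular, the paper encrypts the hash with aes-128, which the python standard library does not provide, so this sketch uses an hmac keyed with the same 128-bit shared secret as a stand-in for that keyed step. the 4-byte record layout (id, sensed data, timestamp) follows the frame format given later in the paper.

```python
import hashlib
import hmac

# 128-bit shared key (all-zero placeholder; a real deployment would provision
# a secret key on each smart node and reader)
SHARED_KEY = bytes(16)

def write_tag(node_id: int, sensed: int, timestamp: int) -> bytes:
    """Build the tag record: id (1 byte) | data (1 byte) | timestamp (2 bytes),
    followed by a keyed digest of the payload (the 'signature')."""
    payload = bytes([node_id, sensed]) + timestamp.to_bytes(2, "big")
    # level 1: hash the payload; level 2: key the hash with the shared secret
    digest = hmac.new(SHARED_KEY, hashlib.sha256(payload).digest(),
                      hashlib.sha256).digest()
    return payload + digest

def verify_tag(record: bytes) -> bool:
    """Re-derive the keyed digest at the reader and compare in constant time."""
    payload, digest = record[:4], record[4:]
    expected = hmac.new(SHARED_KEY, hashlib.sha256(payload).digest(),
                        hashlib.sha256).digest()
    return hmac.compare_digest(digest, expected)

record = write_tag(node_id=7, sensed=72, timestamp=1234)
assert verify_tag(record)
# tampering with the sensed-data byte invalidates the signature
tampered = bytes([record[0], 99]) + record[2:]
assert not verify_tag(tampered)
```

a reader that lacks the shared key can still read the 4-byte payload but cannot forge or validate the signature, which matches the intent of the two-level protection.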
however, clustering rfid nodes into groups comes with many technical challenges, such as achieving accurate positioning, collecting information in each cluster, and reporting this information from the clusterheads to the server for processing. in addition, there are also many challenges related to clustering itself, where it is crucial to manage the transmissions to avoid interference. furthermore, the rfid tag is susceptible to malicious attacks; therefore, we implemented two levels of security algorithms to protect the stored data from potential attacks. this section discusses the proposed data collection technique that can efficiently collect health information (e.g., temperature, heartbeat) and make it available to the back-end server in real time. the main components in our system architecture include smart nodes, rfid readers, and a back-end server, as shown in fig. 1 . the smart node integrates the functionalities of rfid and a wireless sensor node. it consists of a body sensor (bs), an rfid tag, and a reduced-function rfid reader (rfrr). unlike standard sensors, the bs does not have a transmission function. the bs is responsible for collecting body-sensed data, such as heartbeat, muscle, and temperature readings. the rfrr is an rfid reader with a small range compared to the traditional rfid reader. the protocol is composed of two phases: cluster construction and data exchange. in the beginning, each node reads the tag particulars (e.g., id, battery level) of all nodes in its range. then, the node with, for example, the highest battery level is autonomously nominated as the clusterhead for this group of nodes. all smart nodes maintain a table of the nominees to be the clusterhead of the newly constructed cluster. the clusterhead sends a message to all nodes within its range announcing itself as clusterhead and inviting them to join its group.
secondly, a node accepting the offer from this clusterhead sends an acknowledgment message; this is important to avoid duplicate association with multiple clusterheads. this step ends the cluster construction. once the cluster is formed, the clusterhead reads the other smart nodes and stores their data into its local tag. the clusterhead tag works as data storage. finally, when an rfrr comes across a primary rfid reader, the stored data are transferred to the rfid reader and on to the back-end server for further processing. this feature helps reach remote nodes, hence enhancing the system reliability and reducing the infrastructure cost. this process is repeated periodically; new clusters are formed, and new clusterheads are selected along with their children. this technique guarantees fair load distribution among multiple devices to attain the network's maximum lifetime and avoid draining the battery of any individual smart node. the pseudo-code for our algorithm is shown below.

    choose the ch with the highest bl
    if it is a ch and meets its cm then
        read data from the cluster member
    end if
    if it is a ch and meets an rfid reader then
        send its data to the rfid reader
    end if

the ultimate goal of this research is to design an optimum healthcare monitoring application based on the rfid clustering scheme. to meet the practical requirements for applying the system in large-scale environments, the proposed system's energy consumption should be minimal, and the communication quality must be high. therefore, the integer programming model presented below aims to optimize the following objectives: • minimizing the total distance between clusterheads (chs) and cluster members (cms). • minimizing the number of clusters. the first objective, minimizing the total distance between all chs and their respective cms, is meant to enhance tag detectability. also, shorter distances improve the signal quality and reduce the time delay of transmissions within each cluster.
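the cluster-construction phase described above (each node scans its neighbourhood, the highest-battery node in range is elected clusterhead, and every node joins exactly one cluster) can be sketched in python. the node layout, the 2-ft rfrr range, and all names here are illustrative assumptions for the example, not the authors' implementation.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

RFRR_RANGE_FT = 2.0  # rfrr read range used in the paper's prototype

@dataclass
class SmartNode:
    tag_id: int
    x: float
    y: float
    battery: float               # battery level in percent
    head_id: Optional[int] = None
    tag_data: list = field(default_factory=list)

def in_range(a: SmartNode, b: SmartNode) -> bool:
    return math.hypot(a.x - b.x, a.y - b.y) <= RFRR_RANGE_FT

def form_clusters(nodes):
    """Each node scans its neighbourhood (including itself) and elects the
    highest-battery node in range as its clusterhead; joining exactly one
    cluster mirrors the acknowledgment step of the protocol."""
    for node in nodes:
        neighbourhood = [n for n in nodes if in_range(node, n)]
        head = max(neighbourhood, key=lambda n: n.battery)
        node.head_id = head.tag_id
    return {n.tag_id: n.head_id for n in nodes}

nodes = [SmartNode(1, 0.0, 0.0, 80), SmartNode(2, 1.0, 0.0, 95),
         SmartNode(3, 1.5, 0.5, 60), SmartNode(4, 9.0, 9.0, 70)]
membership = form_clusters(nodes)
# node 2 has the highest battery in its neighbourhood, so nodes 1 and 3
# join it; node 4 is out of range of everyone and heads its own cluster
```

re-running `form_clusters` with fresh battery readings each round reproduces the periodic re-election that spreads the clusterhead load across nodes.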
for example, in traditional rfid monitoring applications, such as supply chain management and baggage checking at delta airlines, an rfid reader is required to process several tags at different distances in a short time frame. an rfid reader can only read tags in its range. limited communication bandwidth, background noise, multi-path fading, and channel-access contention between tags would severely deteriorate the performance of the data collection [3] . the second objective is pursued because minimizing the number of clusters reduces signal transmission traffic, lowering the interference between signals. this reduces the use of energy and maximizes the lifetime of the network. for instance, rfid tag data is usually collected using the direct transmission mode, in which an rfid reader communicates with a tag only when the tag moves into its transmission range. if many tags move towards a reader at the same time, they contend to access the channels for information transmission. when a node enters the reading range of an rfid reader, the rfid reader reads the node's tag information. if several nodes enter the range of an rfid reader at the same time, the rfid reader gives the first-met tag the highest priority to access the channel, reducing channel contention and long-distance transmission interference [38] . in the clusterhead-based algorithm, cluster members replicate their tag data to the clusterhead. when the clusterhead of a particular cluster reaches an rfid reader, the rfid reader receives the information of all nodes in this cluster. this enhanced method significantly reduces channel access congestion and reduces the information exchanges between nodes. the method is suitable for a wide range of applications where monitored objects (e.g., zebras, birds, and people) tend to move in clusters.
let i = 1 to n denote the cm index, j = 1 to n denote the ch index, d_ij denote the distance between cm_i and ch_j, and f denote the fixed cost per ch. the user's battery level (bl) is defined as in (1), where φ is a predefined node-energy threshold. expressions (2) and (3) define the decision variables x_ij and y_j, which are binary integer variables. (fig. 1: the architecture of the healthcare monitoring system. fig. 2: (a) timeline of the transactions carried out between smart nodes; (b) timeline of the transactions carried out between smart nodes and the main rfid reader.) the complete integer-programming model of the clustering problem is given by (4). the first expression in (4) is the objective function z, which consists of two terms. the first term is the total distance between chs and cms, and the second term is the total number of clusters in the network. the objective function z is minimized subject to four sets of constraints. constraint (i) ensures that every cm has a ch, so we avoid any isolated smart nodes. constraint (ii) controls the maximum cluster size (cs). constraint (iii) ensures that all cluster members are within the ch's rfid range, i.e., not more than d_max away (e.g., two feet). finally, constraint (iv) ensures that a ch node's battery level must be at least φ (e.g., 50%). the fixed cost of each ch is denoted by f, which is analyzed later. in this section, the performance of the proposed approach is evaluated using three methods: integer programming, simulation, and a small-scale prototype. the general algebraic modeling system (gams) is designed for modeling and solving linear programming (lp), nonlinear programming (nlp), and mixed-integer programming (mip) optimization problems [39] . since the model described in eq. (4) is a binary integer program, it is solved by the mip feature of gams. we use gams version 24.3.3. we consider two different scenarios.
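from the verbal description of the objective function and constraints (i)–(iv) above, model (4) can be reconstructed as the following sketch; this is a reconstruction from the text, so the published form may differ in detail:

```latex
\begin{aligned}
\min \quad & z \;=\; \sum_{i=1}^{n}\sum_{j=1}^{n} d_{ij}\, x_{ij} \;+\; f \sum_{j=1}^{n} y_j \\
\text{s.t.}\quad
 & \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1,\dots,n
   && \text{(i: every cm is assigned to exactly one ch)} \\
 & \sum_{i=1}^{n} x_{ij} \le \mathrm{CS}\, y_j, \quad j = 1,\dots,n
   && \text{(ii: maximum cluster size)} \\
 & d_{ij}\, x_{ij} \le d_{\max}, \quad \forall i, j
   && \text{(iii: members within the ch's rfid range)} \\
 & \mathrm{BL}_j\, y_j \ge \phi\, y_j, \quad j = 1,\dots,n
   && \text{(iv: ch battery level at least } \phi) \\
 & x_{ij},\, y_j \in \{0,1\},
\end{aligned}
```

where x_ij = 1 if cm_i joins ch_j, and y_j = 1 if node j serves as a clusterhead. the first term of z is the total ch–cm distance and the second term charges the fixed cost f per opened cluster, matching the two objectives stated in the text.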
the first scenario tackles the problem by considering the two terms in the objective function, which aim at minimizing the number of clusters and the total distance between chs and cms, to find the optimal cluster size (cs) in constraint (ii). the second scenario applies sensitivity analysis by fixing the total number of nodes to n = 400, 500, 600, 700, and 800, while changing the fixed cost of each ch, f, and calculating the optimal values of the number of clusters and the total distance as well. both scenarios are analyzed with the size of the service region set to 10 × 30 ft². to achieve a 95% confidence level, we have repeated each experiment 10 times using different random inputs for the nodes' locations and the battery level of each node. it can be observed from fig. 3 that the total distance between the chs and the cms is minimized on average when cs is equal to 6 (i.e., one clusterhead and five cluster members) for 400 nodes and 500 nodes. the total distance between the chs and the cms is likewise minimized on average when cs is equal to 7, 8, and 9 for 600 nodes, 700 nodes, and 800 nodes, respectively. for example, with 400 nodes, the minimum accumulated distance between all clusters and their members is about 200 ft when the cluster size is 6, whereas with a cluster size of 10 it is about 350 ft. similarly, for the 800-node scenario, the minimum distance is about 460 ft when the cluster size is 9, whereas it is about 685 ft with a cluster size of 5 and 535 ft with a cluster size of 10. therefore, the clustering approach is effective in reducing the total distances when cs is equal to 6 for 400 nodes and 500 nodes, and 7, 8, and 9 for 600 nodes, 700 nodes, and 800 nodes, respectively. figure 4 displays the number of clusters as the cluster size changes. it can be observed that the number of clusters drops when the cluster size increases.
however, we are interested not only in minimizing the number of clusters, but also in minimizing the total distances between the clusterheads and the cluster members, to achieve positioning accuracy and maximize the lifetime of the network. for instance, with 400 nodes, the optimum minimum distance is about 200 ft when the cluster size is 6, and with 800 nodes, the optimum minimum distance is about 460 ft when the cluster size is 9. therefore, the optimum value of the cluster size is 6 for 400 nodes and 500 nodes, and 7, 8, and 9 for 600 nodes, 700 nodes, and 800 nodes, respectively. figure 5 demonstrates the model's total distance when the fixed cost per clusterhead f is equal to 10^e, where e = 0, 1, 2, …, 6. for 400 nodes, the optimal (minimum) total distance is 200 ft, which is obtained when f is equal to 100 (e = 2). for the case of 800 nodes, the optimal total distance is 460 ft, which is also obtained when f is equal to 100. these numbers indicate that the clustering approach is well-suited for large-scale monitoring applications. figure 6 illustrates the optimal number of clusters when the value of the fixed cost per clusterhead f is equal to 10^e, where e = 0, 1, 2, …, 6. for 400 nodes, the optimal (minimum) number of clusters is 72 clusters, which is obtained when e = 2, i.e., f = 100. for the case of 800 nodes, the optimal number of clusters is 94 clusters, which is also obtained when f = 100. therefore, the best value of f for both terms of the optimization function in eq. (4) to work effectively is 100. in this section, we formulate the energy consumption of the proposed clustering approach and the traditional approach analytically. in the beginning, we define the following parameters: r: rfid maximum data rate (bps); p_a: rfid active power (w); p_i: rfid idle power (w); t_tag^a = l/r: rfid tag active time in seconds, where l is the data length in bits.
we define the total energy consumption for the traditional approach as follows. for the traditional approach, t_tag^a = t_round. given the current advancement in rfid technology, we can confidently assume that the collision rate is very low; hence, b_i ≈ 1. then, for the proposed approach, we define the following specific parameters: e_ch: total energy consumption per clusterhead; e_total: total energy consumption. given the current advancement in rfid technology, we assume the collision rate to be minimal; hence, b_i ≈ 1 and t_tag^i(i) ≈ t_round − t_tag^a, for all i. besides, in order not to miss any data, the clusterhead is set on for the whole round period; hence, t_rfrr^a = t_round and t_rfrr^i = 0, and equation (9) can be rewritten accordingly. (fig. 3: the average total distance when changing cs, for 400 to 800 nodes. fig. 4: number of clusters when changing cs, for 400 to 800 nodes. fig. 5: the total distance when changing f, for 400 to 800 nodes.) we have implemented the proposed system for monitoring the health parameters using cisco packet tracer 7.0, since it supports iot, rfid, and many other functions. figure 7 shows the smart node components as built using cisco packet tracer. the smart node consists of an rfrr, a bs, an rfid tag, and the microcontroller. the rfrr is a standard rfid reader with a limited range. we program the rfrr to perform two tasks: the first task is reading data from the attached body sensors and storing the data into its tag. the second task is reading the data from other smart nodes within its transmission range and storing it into its tag. the body sensor is responsible for collecting body-sensed data such as temperature and heartbeat. the rfid tag works as data storage. on the other hand, the microcontroller (mcu) is used to monitor, verify, and process the smart node readings. the data transmitted between smart nodes and rfid readers has three fields: a unique smart node id assigned to each user (1 byte), the sensed data (1 byte), and the timestamp, which records the time at which the data was collected (2 bytes).
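the energy model above (every tag active for the whole round in the traditional approach; only the clusterhead active for the whole round in the proposed approach, with members active only for their own tag time t_tag^a = l/r) can be sketched numerically. the power values below are placeholders, not the sparkfun figures from table 2; only the 250 kbps rate, the 4-byte frame, and the one-minute round come from the paper.

```python
L_BITS = 4 * 8          # frame length: 4-byte frame (id, data, timestamp)
R_BPS = 250_000         # data rate, bps (250 kbps, as in the simulation setup)
P_ACTIVE = 0.5          # rfid active power, W (placeholder value)
P_IDLE = 0.001          # rfid idle power, W (placeholder value)
T_ROUND = 60.0          # round period, s (clusters are re-formed every minute)

def energy_traditional(n_nodes: int) -> float:
    """Every tag stays active for the whole round (t_tag_active = t_round)."""
    return n_nodes * P_ACTIVE * T_ROUND

def energy_clustered(n_nodes: int, cluster_size: int) -> float:
    """Each clusterhead stays on for the whole round; members are active only
    for their own tag time t_a = L/R and idle for the rest of the round."""
    n_heads = -(-n_nodes // cluster_size)        # ceiling division
    n_members = n_nodes - n_heads
    t_active = L_BITS / R_BPS
    e_heads = n_heads * P_ACTIVE * T_ROUND
    e_members = n_members * (P_ACTIVE * t_active
                             + P_IDLE * (T_ROUND - t_active))
    return e_heads + e_members

# with cluster size 6, only ~1/6 of the radios stay fully active,
# so the clustered scheme consumes a small fraction of the traditional energy
ratio = energy_clustered(400, 6) / energy_traditional(400)
```

the ratio captures why the gap between the traditional approach and the clustered/optimal solutions widens as n grows: the traditional cost scales with every node at full duty cycle, while the clustered cost scales mainly with the number of clusterheads.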
furthermore, to protect the collected data from potential attacks, we apply rivest-shamir-adleman (rsa) algorithms [37] . figure 8 shows the components of the rfid reader and its connectivity with the back-end server. the rfid readers are responsible for collecting the data from smart nodes and delivering it to the back-end server. the transmission range of the rfid reader is much greater than that of the rfrr. upon reading the smart node tag data, it sends that data directly to the back-end server wirelessly, carried by udp packets. the rivest-shamir-adleman (rsa) algorithms are also applied to the data transmitted from smart nodes to primary readers. using the above setup, the packet delay and the number of delivered packets have been calculated for the traditional approach and the clustering approach. in the traditional approach, every node sends its packets directly to an rfid reader. in the clustering approach, every node sends its packets to its clusterhead, and the clusterhead forwards them to an rfid reader. each node sends ten packets every minute, and the simulation has been run for 10 min to achieve a 95% confidence interval. the average delay per packet is calculated using eq. (10), where n is the number of delivered packets, r_t is the receiving time, and s_t is the sending time: average delay per packet = (1/n) Σ (r_t − s_t). table 1 shows a sample of the collected data at the back-end server before and after implementing the rsa algorithm. the smart node appends the timestamp to the sensed data and stores the information in its tag through the rfrr. as stated before, the data transmitted between smart nodes and rfid readers has three fields, namely, the unique smart node id, the sensed data, and the timestamp at which the data was collected. figure 9 illustrates the average transmission delay per packet for different numbers of nodes.
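the delay metric of eq. (10) is a straightforward average over the delivered packets; a minimal sketch (function and variable names are illustrative):

```python
def average_delay(send_times, receive_times):
    """Eq. (10): average delay = (1/n) * sum(r_t - s_t) over the n
    delivered packets, given their send and receive timestamps."""
    if len(send_times) != len(receive_times) or not send_times:
        raise ValueError("need one receive time per delivered packet")
    n = len(send_times)
    return sum(r - s for s, r in zip(send_times, receive_times)) / n

# three packets sent at t = 0, 1, 2 s and received 0.2, 0.4, 0.3 s later
delay = average_delay([0.0, 1.0, 2.0], [0.2, 1.4, 2.3])  # 0.3 s on average
```

note that only delivered packets enter the average; lost packets affect the delivery-count metric of fig. 10 instead.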
we can notice that the traditional approach's delay per packet is almost fixed regardless of the number of available smart nodes. this behavior can be attributed to the fact that each node meets the rfid readers for forwarding its packets with equal probability. on the other hand, when the clustering approach is employed, the delay drops significantly; for example, when n = 30, the packet delay drops by 63%. the higher the number of smart nodes, the lower the packet delay; this happens because when the number of smart nodes increases in the same area, the density increases, as well as the number of clusterheads. therefore, the probability that a regular node meets a clusterhead increases, which reduces the delay in delivering the collected data to the primary reader and then to the back-end server. (fig. 6: number of clusters when changing f, for 400 to 800 nodes. fig. 7: the smart node components as built in packet tracer.) figure 10 displays the number of delivered packets for different numbers of nodes. in the clustering approach, the system delivers exactly 300 packets, which is the total number of packets generated by all smart nodes. on the other hand, in the traditional approach, the system suffers packet loss (e.g., 20% loss for n = 30) due to the increase in channel access congestion as the number of nodes increases. next, we study the energy consumption of the traditional approach, the optimal approach, and the proposed clustering approach. in the traditional approach, every node sends its packets directly to an rfid reader. in the clustering approach, as explained in sect. 3.2, every node reads the tag particulars (battery level) of all nodes in its range. the node with the highest battery level is then chosen as the clusterhead for this group of nodes. then, the clusterhead broadcasts a message to all nodes within its range announcing itself as clusterhead and inviting them to join its group.
Then, a node accepting this clusterhead's offer sends an acknowledgment message; this is important to avoid duplicate association with multiple clusterheads. Once the cluster is formed, the clusterhead remains active, and the cluster members remain in sleep mode. The clusterhead reads other smart nodes and stores their data into its local tag. Each cluster member switches to active mode every 10 s to store its data into its own local tag. Finally, the clusterhead sends the data to an RFID reader and then to the backend server for further processing and management. This process is repeated every minute; new clusters are formed, and new clusterheads are selected along with their children. This technique guarantees fair load distribution among multiple devices to attain the maximum lifetime of the network and avoid draining the battery of any individual smart node. The relative performance of the three methods has been evaluated using MATLAB. It is assumed that each node can send data traffic at a rate of 250 kbps, and it can send frames with sizes up to 4 bytes (one byte for the ID tag number, one byte for the data (heartbeat), and two bytes for the timestamp and sequence number). Table 2 shows the RFID hardware energy consumption parameters, as specified by SparkFun [40]. In order to achieve a 95% confidence interval, each simulation experiment was repeated 10 times using different random topologies. For each simulation run, the total energy consumption per round was calculated for different numbers of nodes (n = 400, 500, …, 800). Figure 11 and Table 3 show the average total energy consumption for the traditional approach, the clustering algorithm, and the optimal GAMS solution of the integer programming model. Figure 11 shows that the clustering solution's total energy consumption is close to the minimum total consumption obtained by the optimal GAMS solution.
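The battery-based clusterhead election described above can be sketched as follows. Node and neighborhood structures are illustrative assumptions; the real protocol exchanges this information via tag reads and broadcasts:

```python
def elect_clusterheads(batteries, in_range):
    """batteries: {node_id: battery_level}; in_range: {node_id: set of neighbor ids}.
    Each node adopts as clusterhead the node with the highest battery level
    among itself and its neighbors (ties broken by node id for determinism)."""
    assignment = {}
    for node, neighbors in in_range.items():
        candidates = {node} | neighbors
        head = max(candidates, key=lambda n: (batteries[n], n))
        assignment[node] = head
    return assignment

# Three nodes in mutual range: node 3 has the highest battery, so it heads the cluster.
batteries = {1: 60, 2: 75, 3: 90}
neighbors = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
heads = elect_clusterheads(batteries, neighbors)
```

Re-running the election each round with updated battery levels gives the periodic clusterhead rotation that spreads the energy load across nodes.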
The clustering algorithm's total energy becomes closer to the optimal value as the number of nodes increases. This result is clear from Table 3, which shows a difference of 8% between the clustering algorithm's performance and the optimal GAMS solution when the number of nodes is equal to 400, but only a difference of 3.26% when the number of nodes is equal to 800. This feature shows that the proposed clustering algorithm can produce high-quality, near-optimum solutions for large-scale problems. As shown in Table 3, the traditional approach's energy consumption is 455.14% higher than the optimal consumption specified by GAMS when the number of nodes is equal to 400, and 741.07% higher when the number of nodes is equal to 800. Hence, the traditional approach (without clustering) is not a practical solution for large-scale systems.

In this section, we evaluate the performance of the proposed approach using a small-scale prototype. We begin by describing the experimental setup and then discuss the experimental results. Figure 12 shows the smart node components in our prototype testbed. The smart node consists of the RFRR, the BS, the RFID tag, and the microcontroller. The RFRR is a standard RFID reader with a limited range, which can read up to two feet with an onboard antenna according to the SparkFun specification [40]. We program the RFRR to perform two tasks. The first task is reading the heartbeat and muscle sensed data from the BS (via the pulse sensor and the muscle sensor, respectively) and storing this data into its tag. The second task is reading the data from other smart nodes within its transmission range and storing it into its tag. The BS is responsible for collecting the body-sensed data, such as heartbeat and muscle data. The RFID tag works as a packet memory buffer for data storage. The Arduino board is a microcontroller that is used to monitor, verify, and process smart node readings.
The transmitted data between smart nodes and RFID readers has three fields: the smart node ID, the sensed data, and the sequence number indicating when the data was recorded. For each node, three packets of data need to be published so that other nodes can get its information. Therefore, we need only four bytes per data entry: node ID (1 byte), heart rate information (1 byte), and the sequence number (2 bytes). The sequence number helps in discovering how recent the carried information is, and helps other nodes decide whether to record newly read data or discard it. Each RFID tag has a 64-byte capacity; the first 48 bytes are divided into chunks of 4 bytes, each used to store information of one node, which sums to a total of 12 data slots. The remaining 16 bytes are used for authentication. The first data slot is reserved for the node's own tag. Other data slots are initially marked as available; that is, they do not contain data about other nodes and are ready to be utilized for that purpose. Figure 13 shows the flowchart that presents the process of handling new data. When new data arrives and is to be stored, the controller checks whether a slot containing data for the same ID exists. If so, the slot is updated if its sequence number is less than the new sequence number; otherwise, the new data is discarded. If the controller does not find a previous record for that ID, it stores the data in an available slot; if no slot is available, the data is lost. We implement two levels of security algorithms to ensure the integrity of the arriving data, as well as to authenticate the source of data in our scheme. When a node writes the 48 data bytes into its tag, the data is signed with a 16-byte signature, which is used for authentication. To obtain the signature, the controller calculates the MD5 128-bit hash value of the 48 data bytes. Then, the obtained hash is encrypted with the AES 128-bit shared key.
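The slot-handling flow of Fig. 13 can be sketched as follows. The 12-slot layout follows the tag description above; the field names and the omission of the reserved own-tag slot are simplifications for illustration:

```python
SLOTS = 12  # 48 bytes of the tag, in 4-byte chunks

def handle_new_data(slots, node_id, seq, data):
    """slots: list of dicts {'id', 'seq', 'data'} or None (available slot).
    Update a matching slot only if the incoming sequence number is newer;
    otherwise take a free slot, or drop the data when the tag is full."""
    for slot in slots:
        if slot is not None and slot['id'] == node_id:
            if slot['seq'] < seq:                    # newer reading: update in place
                slot['seq'], slot['data'] = seq, data
            return slots                             # older or equal: discard
    for i, slot in enumerate(slots):
        if slot is None:                             # no record yet: take a free slot
            slots[i] = {'id': node_id, 'seq': seq, 'data': data}
            return slots
    return slots                                     # tag full: data is lost

tag = [None] * SLOTS
handle_new_data(tag, node_id=2, seq=1, data=70)      # first reading from node 2
handle_new_data(tag, node_id=2, seq=3, data=72)      # newer reading replaces it
handle_new_data(tag, node_id=2, seq=2, data=71)      # stale reading is discarded
```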
The result is the signature, and it is stored on the tag. To verify a newly read tag, the controller computes the hash of the new data (but not the signature), encrypts it with the shared key, and compares the result with the signature. The new data is valid if the result and the signature match. Otherwise, the source is considered an invalid node, and its data is discarded. The experimental prototype consists of three smart nodes (1, 2, 3) and one primary RFID reader, as shown in Fig. 14. Each smart node consists of an RFID tag, a microcontroller, a pulse sensor, and an RFRR, a regular RFID reader with a limited range that can read up to two feet with an onboard antenna. The primary RFID reader is an RFID reader attached to an external antenna to increase its transmission range. In this prototype, node 3, which has the highest battery level, plays the role of the clusterhead, and node 1 and node 2 play the role of the cluster members. Node 3 reads the tag information of node 1 and node 2. Then, the primary RFID reader receives all packets of node 1, node 2, and node 3 from node 3 when it moves into the primary RFID reader's range. Then, the RFID reader sends the collected information to the backend server for data processing. Figure 15 shows a sample of the collected data of the pulse sensor, which includes the beats per minute (BPM), the live heartbeat or interbeat interval (IBI), and the analog signal (AS) on the serial monitor. Each row in Fig. 15 includes BPM, IBI, and AS. For instance, the first row has a BPM of 78, an IBI of 1670, and an AS of 491. The typical beats-per-minute reading of the pulse sensor should be between 60 and 100; otherwise, it is considered an emergency case. It can be observed from Figs. 16 and 17 that a valid foreign tag #1 is read and updated, and a valid foreign tag #2 is read and then updated on the serial monitor, respectively.
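The sign-then-verify procedure described above (MD5 hash of the 48 data bytes, encrypted with the shared key into a 16-byte signature) can be sketched as follows. The Python standard library has no AES, so a keyed XOR stands in for AES-128 purely for illustration; a real node would use an actual AES implementation, and the key here is an arbitrary placeholder:

```python
import hashlib

KEY = bytes(range(16))  # 128-bit shared key (illustrative placeholder)

def _keyed_transform(block16, key=KEY):
    """Stand-in for AES-128 encryption of a 16-byte block (illustration only)."""
    return bytes(b ^ k for b, k in zip(block16, key))

def sign(data48):
    digest = hashlib.md5(data48).digest()   # MD5 128-bit hash of the 48 data bytes
    return _keyed_transform(digest)         # encrypted hash = 16-byte signature

def verify(data48, signature):
    """Recompute the hash, apply the keyed transform, and compare to the signature."""
    return _keyed_transform(hashlib.md5(data48).digest()) == signature

data = bytes(48)           # dummy 48-byte tag payload
sig = sign(data)           # 16-byte signature stored in the last 16 tag bytes
```

With this scheme a tampered payload fails verification, since its recomputed encrypted hash no longer matches the stored signature.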
Figures 16 and 17 show that tag #1 and tag #2 are valid. Figure 18 shows the captured data packets for an invalid foreign tag. In this example, the authentication process executed by the controller reported that tag number four is invalid: the controller computes the hash of the new data, encrypts it with the shared key, and compares the result with the signature; since the result and the signature do not match, tag four is considered an invalid node and its data is discarded.

In this paper, we presented a novel technique for IoT healthcare monitoring applications based on an RFID clustering scheme. The proposed scheme integrates RFID with wireless sensor systems to gather information efficiently, aiming at monitoring the health of people at large events and venues such as festivals, malls, airports, and train stations. The developed system is composed of clusters of wearable smart nodes. The smart node is composed of an RFID tag, a reduced-function RFID reader, and body sensors. The clusters are reconstructed periodically based on specific criteria, such as the battery level. These clusters collect data from their members and, when they come across RFID readers, deliver the collected data to these readers. In contrast, using the traditional approaches, only the nodes in the range of the RFID readers can send their tag data to the RFID readers. Hence, this causes several performance problems, such as long delay, dropped packets, missing data, and channel access congestion. The proposed clustering approach overcomes all these problems.
It demonstrated outstanding performance in reducing packet transmission delay and inter-node interference, as well as in improving energy utilization. The experimental results support the above performance. The proposed approach can lend itself easily to monitoring and collecting the health information of the population continuously, especially in the current pandemic. As future research directions, we are planning to integrate the smart nodes with other sensors to support full health care applications and to test the new application in large-scale scenarios. There is also a need to improve the clustering algorithm to guarantee a high level of service quality for the deployed health applications.

Conflict of interest: The authors declare that they have no conflict of interest. Ethical approval: The study only includes humans roaming a large hall to test the connectivity of the established networks.

References:
- A survey on the Internet of Things security
- Handbook: fundamentals and applications in contactless smart cards, radio frequency identification and near field communication
- Efficient data collection for large-scale mobile monitoring applications
- Influence of thermal boundary conditions on the double-diffusive process in a binary mixture
- Engineering design process
- An object-oriented finite element implementation of large deformation frictional contact problems and applications
- X-analysis integration (XAI) technology
- Virginia technical report
- Preventing deaths due to hemorrhage
- Taxonomy and challenges of the integration of RFID and wireless sensor networks
- NeuralWISP: a wirelessly powered neural interface with 1-m range
- A capacitive touch interface for passive RFID tags
- Building a smart hospital using RFID technologies
- Wireless localization network for patient tracking
- Empirical analysis and ranging using environment and mobility adaptive RSSI filter for patient localization during disaster management
- The research of network architecture in warehouse management system based on RFID and WSN integration
- Bringing IoT and cloud computing towards pervasive healthcare
- RFID technology for IoT-based personal healthcare in smart spaces
- A health-IoT platform based on the integration of intelligent packaging, unobtrusive bio-sensor, and intelligent medicine box
- An IoT-aware architecture for smart healthcare systems
- BSN-Care: a secure IoT-based modern healthcare system using body sensor network
- IFHDS: intelligent framework for securing healthcare big data
- Effective data collection in multi-application sharing wireless sensor networks
- A hybrid approach of RFID and WSN system for efficient data collection
- An analysis on optimal cluster ratio in cluster-based wireless sensor networks
- Wireless regulation and monitoring system for emergency ad-hoc networks using nodes
- Concurrent data collection trees for IoT applications
- A cooperation-based routing algorithm in mobile opportunistic networks
- A data prediction model based on extended cosine distance for maximizing network lifetime of WSN
- CRPD: a novel clustering routing protocol for dynamic wireless sensor networks
- Secure data transmission in hybrid radio frequency identification with wireless sensor networks
- Real-time healthcare monitoring system using smartphones
- Energy-efficient data collection scheme based on mobile edge computing in WSNs
- Iterative clustering for energy-efficient large-scale tracking systems
- An asynchronous clustering and mobile data gathering schema based on timer mechanism in wireless sensor networks
- Optimum bilevel hierarchical clustering for wireless mobile tracking systems
- Modeling and representation to support design-analysis integration
- Crowd analysis: a survey. Machine Vision and Applications
- Data-driven crowd analysis in videos
- GAMS specifications
- The SparkFun specification

key: cord-349724-yq4dphmb
authors: Santos, Hugo; Alencar, Derian; Meneguette, Rodolfo; Rosário, Denis; Nobre, Jéferson; Both, Cristiano; Cerqueira, Eduardo; Braun, Torsten
title: A Multi-Tier Fog Content Orchestrator Mechanism with Quality of Experience Support
date: 2020-05-06
journal: nan
doi: 10.1016/j.comnet.2020.107288
sha:
doc_id: 349724
cord_uid: cord-349724-yq4dphmb

Abstract: Video-on-demand (VoD) services create a demand for content orchestrator mechanisms to support quality of experience (QoE). Fog computing brings benefits for enhancing the QoE of VoD services by caching the content closer to the user in a multi-tier fog architecture, considering the available resources to improve QoE. In this context, it is mandatory to consider network, fog node, and user metrics to properly choose an appropriate fog node to distribute videos with QoE support. In this article, we introduce a content orchestrator mechanism, called Fog4Video, which chooses an appropriate fog node from which to download video content. The mechanism considers the available bandwidth, delay, and cost, besides the QoE metrics for VoD, namely the number of stalls and the stall duration, to deploy VoD services in the opportune fog node. Decision-making acknowledges periodical QoE reports from the clients to assess the video streaming from each fog node. These values serve as inputs for a real-time analytic hierarchy process method to compute the influence factor of each parameter and the QoE improvement potential of the fog node. Fog4Video is executed in fog nodes organized in multiple tiers, having different characteristics to provide VoD services.
Simulation results demonstrate that Fog4Video transmits adapted videos with 30% higher QoE and up to 24% lower monetary cost compared to other content request mechanisms.

to take advantage of VoD cached closer to avoid overloaded conditions of the network and video servers [8]. In this context, new services must be created to offer QoE-aware VoD services to mobile users, while optimizing the usage of heterogeneous network resources. For instance, VoD providers can run part of the VoD services in the cloud in a more cost-effective fashion, but with higher latency, for a stationary user with low QoE requirements. However, VoD providers migrate part of the VoD content to a fog node, granting low latency at a high monetary cost, for a mobile user with high QoE requirements. Hence, an efficient content orchestrator mechanism must periodically monitor network and user requirements in order to select the most suitable node in the multi-tier fog environment for a user to access the VoD content, providing QoE assurance and efficient usage of network resources. The content orchestrator works in two phases: (i) analysis and (ii) decision-making & execution. The analysis phase collects information from multi-tier fog nodes and QoE from the mobile user to understand their behaviors and availabilities, in order to keep the QoE high while optimizing the usage of network and cloud resources. The decision-making & execution phase finds the best node in the multi-tier fog environment to distribute the VoD service for each user request by using a multi-criteria technique. This phase is also responsible for forwarding a content request to such a fog node. Some investigations deal with content orchestrator mechanisms for VoD services [9, 10, 11, 12, 13, 14, 15]. However, network conditions continuously change over time. Previous works are limited to selecting a fog node only at streaming startup and often neglect QoE information.
It is mandatory for such a mechanism to efficiently and periodically classify the best fog nodes to disseminate the content. This classification must consider different metrics to understand the performance of fog nodes and users in order to make efficient decisions. In this sense, it must consider the trade-offs between network conditions, QoE, and cost for decision making. In this article, we introduce a content orchestrator mechanism for VoD streaming in a multi-tier fog computing environment, called Fog4Video [16]. We consider a network infrastructure that enables cooperation between multi-tier fog and cloud nodes to meet user needs for VoD services. In this sense, we introduce a multi-tier fog architecture for VoD services with layered components used to implement Fog4Video in two phases. For the analysis phase, Fog4Video collects information about available bandwidth, delay, stall duration, number of stall events, and monetary cost from the network, the user client, and the fog node. For the decision-making & execution phase, Fog4Video considers the analytic hierarchy process (AHP) method to assign different degrees of importance to each criterion. Therefore, Fog4Video selects a suitable fog node to stream the VoD content cached on it, improving the QoE of delivered videos. Simulation results demonstrate the efficiency of Fog4Video in transmitting VoD compared to other content orchestrator mechanisms. The number of stall events is reduced by up to 70%, and the stall duration is reduced by up to 65%. This behavior is an essential achievement of Fog4Video, since stall duration and stall events have a more negative impact on user perception than bitrate switching [17]. Fog4Video also improved the average bitrate by up to 35% and reduced the monetary cost by up to 24% compared to other content orchestrator mechanisms. The remainder of this article is organized as follows.
Section 2 overviews the state of the art on content orchestrator mechanisms for VoD services. Section 3 describes Fog4Video, which is implemented in a multi-tier fog architecture. Section 4 details the simulation methodology and the evaluation results of the Fog4Video mechanism. Finally, Section 5 presents the concluding remarks and future works.

In this section, we introduce recent research on multi-tier fog computing systems that addresses the challenges in terms of quality of service (QoS) and QoE for VoD services. We also identify gaps in the literature, leading to the design of a content orchestrator mechanism in multi-layered and distributed fog systems. Byers et al. [6] propose a multi-tier architecture for several use cases, such as transportation, smart cities, and residential customers. The closer availability of content enhances the delivery of video content in specific regions, improving the network by reducing the network load and absolute latency. Khattak et al. [18] present a vehicular fog computing architecture for infotainment applications and evaluate it in terms of cache size, cache hit ratio, and energy of the vehicular fog nodes. Rosário et al. [19] describe the operational impacts and advantages of video content migration in a multi-tier architecture with QoE support. Iotti et al. [20] analyze the effect of proactive caching in fog nodes at the network edge to avoid redundant traffic. They classify and identify the manageable traffic based on DNS queries and replies. However, these works mostly provide optimizations that lower latency but lack coordination of the content orchestrator toward more efficient fog nodes in cases of poor QoE delivery. Caching [21, 22] and adaptive bitrate (ABR) [23] schemes aim at optimizing the QoE of delivered videos. Caching schemes enable users to access popular content from caches placed near the user [22].
ABR adapts the video bit rate according to different network, device, QoE, and user characteristics [23, 12]. However, these video services are restricted to the cloud and the edge, which can significantly increase the deployment cost compared to a multi-tier approach. In this context, it is essential to explore how to combine the existing video services in each tier to improve QoE by dynamically steering the video content orchestrator to serving fog nodes according to the available network, processing, storage, and cost of network devices [24, 25]. Several works introduced content orchestrator mechanisms that act exclusively at request arrival. Tang et al. [9] redirect user requests to multiple destination servers to minimize a cost function considering service response times, computing costs, and routing costs. Siavoshani et al. [13] balance the load of storage resources and communication costs, managing the server redirection process based on cache size limitations and proximity to the server. Chunlin et al. [15] aim to minimize service time, power consumption, and costs for the service provider, leveraging multiple QoS parameters to select video service provisioning. However, these works are restricted to optimizing QoS, not necessarily improving QoE for VoD services. Moreover, they neglect overloaded network paths after the selection of the service provider. Xiao et al. [10] formulate geo-distributed and cloud-based dynamic content request redirection and resource provisioning as a stochastic optimization problem, reducing the cost of renting cloud resources and improving QoE by correlating it with the QoS metric of delay. Zhang et al. [11] propose content placement and request dispatching for cloud-based video distribution services with a Markov decision process, aiming to maximize the profits of the video service provider while considering cost and also correlating QoS with QoE.
However, it is essential to consider QoE metrics for VoD directly, since they provide more accurate information about the user's visual perception, which is lost when merely correlating network QoS metrics with QoE. Based on our analysis of the state of the art, we conclude that VoD services deployed in a multi-tier fog architecture improve the QoE by efficiently orchestrating fog node resources, while reducing delay and the amount of data uploaded to and downloaded from the cloud in a cost-efficient fashion. Hence, it is essential to manage content requests during the entire streaming session to avoid potential playback abandonment by users experiencing poor QoE, by streaming video from fog nodes with proper connectivity and resources. However, existing mechanisms still fail to provide an efficient content orchestrator mechanism based on network, fog node, and user information to support VoD distribution with QoE support. Table 1 summarizes the main characteristics of previous works intended to provide VoD distribution.

In this section, we present the proposed Fog4Video content orchestrator. It aims to choose an appropriate fog node to distribute content throughout the VoD streaming session, considering network, fog node, and user information for decision-making. Fog4Video performs content orchestration in a hierarchical network infrastructure, where multi-tier fog nodes can cache the content and also provide VoD services, improving the QoE of the VoD service. The hierarchical design accounts for performance, location, and deployment cost at each layer. Based on such an architecture, Fog4Video classifies the connectivity and resources of each available fog node into a multi-criteria rank, where it considers the AHP method to assign different degrees of importance to each criterion to provide better QoE for each user. In the following, we introduce the multi-tier fog architecture, and we describe the Fog4Video mechanism in detail as well as its deployment in such an architecture.
Fog nodes can be deployed anywhere in a network, organized in tiers between the mobile devices (at the bottom) and the cloud (at the top) [19], as can be seen in Figure 1. The client module comprises the mobile devices consuming VoD content. The multi-tier fog module can be any device in the radio access network (RAN), e.g., a base station (BS) or access point (AP), providing multimedia services to a few dozen or hundreds of mobile devices. Based on the service demand, the network path, and the need to sustain appropriate QoE, a replica of a tier can take place in the network, such as at a baseband unit (BBU) or an Internet service provider (ISP). Besides, mobile devices can become fog nodes to relay the video content via device-to-device (D2D) wireless communication for mobile devices with high and similar traffic demands. On top of such a multi-tier architecture, there is the cloud module, capable of providing the VoD service in a centralized fashion, keeping the entire database in a cloud computing datacenter. Figure 1 shows the modules and components of the multi-tier fog architecture, which relies on two types of nodes: centralized cloud computing and distributed fog computing. Both work collaboratively to provide VoD services for client applications, aiming to improve the QoE. Cloud nodes perform control functions, while fog nodes execute the cache and VoD streaming services. The client application requests and displays the VoD content to users. In the following, we introduce the functionality of each component in the modules. The client module consists of a client agent that manages communication between the video player and the client QoE/QoS under test. The client agent plays the role of an interface between the cloud and client modules, synchronizing control and data flow in both directions. Moreover, this module manages the migration of video streaming services to the fog node. The video player component downloads and renders the video content on the mobile device screen.
While displaying video content, the client QoE/QoS under test collects and reports QoE and QoS measurements, such as playback start time, stall duration, mean opinion score (MOS), throughput, round-trip time (RTT), packet loss, and others, to characterize the QoE of the VoD session. The multi-tier fog module includes a fog agent connecting the streaming unit, transcoding unit, cache unit, and fog QoE/QoS under test components. The fog agent also plays the role of an interface between the cloud and multi-tier fog modules; it synchronizes control and data flow exchange in both directions and manages communication among the internal modules. The transcoding unit runs on a fog node to adapt the video codec, bit rate, or resolution according to the network conditions, device capabilities, or QoE [23]. However, it can run only on a tier with sufficient resources available, since it requires more processing, data exchange, and memory capabilities. The cache unit stores redundant copies of given video content close to the user. The cloud module consists of a cloud controller connecting the orchestrator, video database, request service, and cloud QoE/QoS under test. The orchestrator coupled to the controller steers decision-making on management and operation tasks based on the QoS/QoE reports from the clients. For example, the orchestrator decides about fog deployment and service migration, evaluates available resources, and considers a specific content orchestrator mechanism to define where, what, and when a client must download the video from a different streaming unit. It takes input from the cloud QoE/QoS under test, VoD requirements, and high-level management information, such as network-wide policies or service level agreements (SLAs). The video database stores the VoD content for traditional DASH in the service provider, while the request service coupled to the controller distributes the content to the video player at the client device.
Moreover, it controls from which tier the video player must download the content, including when the download shifts from one fog node to another. The cloud QoE/QoS under test built into the controller collects QoE and QoS measurements and replies to the demands made by the orchestrator. The cloud controller manages communication among the internal modules, synchronizes control and data flow exchange, and sends decisions taken by the orchestrator to the fog nodes. These decisions generate control flows to the fog and client agents. Hence, the controller of each tier can start the VoD service procedures to optimize the QoE of delivered videos. In this architecture, the video player requests a video from the request service at the cloud. Next, the request service sends back the media presentation descriptor (MPD), which is an XML file that contains information about the available video chunks, as happens in DASH. It also includes other metadata needed by clients to choose between the available video chunks. In a VoD scenario, the video is divided into multiple chunks, and each chunk can be requested at a different bit rate representation to avoid buffer underflow, preventing stalling under varying network conditions. In this sense, the video player requests the next chunk at an appropriate bit rate based on the available transmission resources. In other words, the bit rate increases with improved network conditions and decreases in case of buffer underflow. The client QoE/QoS under test periodically sends QoE feedback, i.e., the number of stalls and the stall duration values, to the orchestrator unit. Stalls are interruptions of the video playback. Based on these values, Fog4Video, implemented in the orchestrator, considers network, fog node, and user information to perform real-time content orchestration, i.e., it chooses an appropriate streaming unit from a given tier for the client to download the video.
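The buffer-driven bitrate choice described above can be sketched as a simple step-up/step-down rule. The bitrate ladder and the buffer thresholds are illustrative assumptions, not the adaptation logic specified by the paper:

```python
BITRATES = [500, 1000, 2000, 4000]  # kbps representations listed in the MPD

def next_bitrate(current, buffer_s, low=5.0, high=15.0):
    """Pick the bitrate of the next chunk from the current buffer level (seconds).
    Step down near underflow, step up when the buffer is healthy."""
    i = BITRATES.index(current)
    if buffer_s < low and i > 0:                    # risk of underflow: step down
        return BITRATES[i - 1]
    if buffer_s > high and i < len(BITRATES) - 1:   # healthy buffer: step up
        return BITRATES[i + 1]
    return current                                  # otherwise keep the current rate

# A nearly empty buffer forces a downgrade; a full buffer allows an upgrade.
low_choice = next_bitrate(1000, 2.0)
high_choice = next_bitrate(1000, 20.0)
```

Real DASH players weigh measured throughput as well as buffer occupancy; the sketch keeps only the buffer signal to match the underflow discussion above.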
The orchestrator can also decide to keep downloading from the current fog node streaming unit as decided previously. Therefore, Fog4Video performs load balancing among fog nodes to better meet user needs, i.e., avoiding streaming unit overload while improving VoD distribution. Fog4Video distinguishes two phases, namely analysis and decision-making & execution. In the analysis phase, it collects the metrics needed for decision-making, i.e., available bandwidth, delay, number of stalls, stall duration, and the cost to deploy VoD services in a given tier. Afterwards, this information is evaluated in the decision-making & execution phase, which determines the best streaming unit from which to download the video. Finally, the cloud controller sends the decisions taken by the orchestrator to the fog nodes and clients. The cloud QoE/QoS under test, fog QoE/QoS under test, and client QoE/QoS under test collect information from the cloud, the fog nodes, and the user, respectively, to understand their performance and make the best decision. Specifically, Fog4Video receives the QoS characteristics, i.e., available bandwidth and delay, collected by the cloud QoE/QoS under test and fog QoE/QoS under test, since these values impact the QoE of VoD services. The VoD service uses a TCP stream, which is highly affected by high latency in the best-effort Internet [29]. Like TCP itself, TCP-based video adaptation algorithms depend on quick feedback from the clients. Therefore, the delay is essential to provide a more accurate response regarding the characteristics of the next chunk to be sent. The orchestrator gives preference to idle and more cost-efficient fog nodes before collecting QoS reports from the clients. This phase also considers QoE metrics, i.e., the number of stall events and the stall duration, collected by the client QoE/QoS under test. Specifically, the video player buffers the downloaded chunks before playing them out.
as soon as the buffer is empty, video playback cannot continue, since there is insufficient data available in the buffer [2]. the interruption lasts until a complete chunk has again been filled into the buffer. these interruptions are called stall events, and their duration is called the stall duration. these two well-known objective metrics are the most crucial factors for qoe, since they directly impact the continuity of the vod session [2]. for instance, users who experience more interruptions tend to watch the video for a shorter duration and are likely to be dissatisfied in the case of four or more interruptions [30]. furthermore, viewers prefer a single but long stall event over several short stall events. hence, not only the number of interruptions but also their duration affects qoe [31]. fog4video also considers the cost cost_k to process data in a given fog node k, which depends on the amount of cpu time p_k that the streaming unit or transcoding unit uses for processing and the monetary cost per hour h_k. in this sense, a video content orchestrator should consider the trade-off between the increased cost and the qoe of delivered vod services. in general, large-scale deployments of more centralized fog nodes tend to offer resources in a more cost-effective manner per processing unit than more geographically distributed ones. the value cost_k to stream the vod service in a fog node is computed as follows. eq. 2 computes the overall cost c_k of processing the video chunks to deploy vod services in a given fog node. it depends on the cost cost_k to process the bitrate representation r of a given video v and the binary variable α^r_{v,k}. a true value of α^r_{v,k} indicates the transcoding of a chunk b^r_v in fog node k. fog4video checks the resource availability of the fog nodes, i.e., the number of available computation resources, to adapt the video content according to the bitrate requested by the clients from a specific fog node. table 2 summarizes the notation.
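the cost computation just described can be sketched as follows. since eqs. 1 and 2 are referenced but not reproduced in the text, the exact forms used here (cost_k = p_k · h_k, and a sum over the transcoded (video, representation) pairs weighted by the binary variable α) are assumptions based on the surrounding description, and all names are illustrative.

```python
def chunk_cost(p_k, h_k):
    # assumed per-chunk processing cost at fog node k: cpu time p_k (hours)
    # used by the streaming/transcoding unit, times the monetary cost per hour h_k
    return p_k * h_k

def node_cost(p_k, h_k, alpha):
    # assumed form of eq. 2: sum the per-chunk cost over every (video, representation)
    # pair actually transcoded at fog node k; alpha[(v, r)] is the binary variable
    return sum(chunk_cost(p_k, h_k) * a for a in alpha.values())
```

for example, with p_k = 0.5 cpu-hours and h_k = 2.0 per hour, two transcoded representations give a node cost of 2.0.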
table 2 (notation): α^r_{v,k} is a binary variable showing whether the k-th fog node transcoded the r-th representation of the v-th video; cost_k is the cost to process data in the k-th fog node; c_k is the cost of video service provisioning in the k-th fog node; a_k is the resource demand in the k-th fog node; l is the candidate fog node list. at this phase, the fog4video mechanism implemented in the orchestrator at the cloud is responsible for selecting the best streaming unit of a given fog node from which the client should download the video. in the first step, the mechanism creates a list l of candidate fog nodes by checking the resource availability of each fog node to compute the content adaptation needed to deploy the vod service, as shown in algorithm 1 (computing the fog node candidate list: push each request b^{r,t}_v for a candidate fog node k into the list l). fog4video receives a chunk request at bitrate b^{r,t}_v at a given time t. then, fog4video checks the resource demand a_k at time t−1, based on b^{r,t−1}_v and α^{r,t−1}_{v,k}, for all fog nodes and videos. fog4video evaluates whether the resource availability t_k can support the current resource demand a_k before inserting the fog node k into l. from the list of candidate fog nodes, we consider network, fog node, and user metrics for decision making, which have different degrees of importance. in this context, fog4video uses ahp [32] to compute the influence factor for each parameter. specifically, ahp is a multi-criteria decision-making scheme capable of balancing inputs with different degrees of importance. ahp combines qualitative and quantitative elements in the analysis, allowing the system to find an ideal solution considering several metrics in the decision-making process. ahp performs pairwise comparisons between the numerical values of each parameter and their relative degrees of importance, adjusting their weights at runtime. as a result, a higher weight means higher importance for the corresponding criterion.
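algorithm 1 can be sketched as follows. the pseudocode is only partially reproduced in the text, so the exact admission test (the demand already placed on a node plus the requested bitrate must fit within the node's transcoding capacity t_k) and all names here are assumptions.

```python
def candidate_fog_nodes(requested_bitrate, nodes):
    # nodes: mapping k -> {"capacity": T_k, "demand": A_k (resources in use at t-1)}
    # a fog node k is pushed into the candidate list L when it can still
    # accommodate the requested bitrate representation
    L = []
    for k, info in nodes.items():
        if info["demand"] + requested_bitrate <= info["capacity"]:
            L.append(k)
    return L
```

for instance, with a 5 mbps request, a tier-3 node at 8/10 mbps is rejected while a tier-2 node at 5/20 mbps is accepted.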
the pairs must not contradict each other, e.g., if metric i is twice as important as metric j, then j is half as important as i. we consider seven importance levels to compare each pair of parameters, indicating how essential one parameter is compared to the others, as shown in table 3: 1, i is as important as j; 2, i is slightly more important than j; 3, i is more important than j; 4, i is much more important than j; 1/2, i is slightly less important than j; 1/3, i is less important than j; 1/4, i is much less important than j. we consider a comparison matrix m of size n × n, with rows and columns representing the metrics considered for decision-making, to represent all pairwise comparisons. the variable n denotes the number of elements compared, as shown in eq. 3. each value m_{i,j} in the matrix expresses how important the i-th element is compared to the j-th element. the degree of importance levels depends on subjective judgement related to abandonment rates of vod due to poor qoe. we set the values to achieve higher improvements in terms of qoe, while also considering the other metrics, unless stated otherwise. for the metrics used by fog4video, we denote the number of stalls by f, the stall duration by e, the delay by d, and the cost for the deployment of vod by c_k. the comparison matrix m indicates which parameters have higher priority than others, as shown in eq. 4. for instance, in the first row we see that the number of stalls f is twice as significant as the stall duration e in the second row and three times as important as the delay d in the third row. it is essential to highlight that if one criterion is considered twice as relevant as another, then the other is 1/2 as important as the first. note that the main diagonal of the matrix must always contain the value 1, as it compares a metric with itself. ahp measures the influence factor i_{i,k} by applying the pairwise comparisons to the data on each fog node k.
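the reciprocal comparison matrix of eq. 4 can be sketched as follows. the helper name and input layout are illustrative; only the upper-triangle judgements are supplied, and reciprocity (j is 1/2 as important as i when i is twice as important as j) plus the unit diagonal are filled in automatically, so the matrix cannot contradict itself.

```python
from fractions import Fraction

def comparison_matrix(metrics, judgements):
    # judgements: {(i, j): importance of metric i over metric j}, given once per pair;
    # the reciprocal entry m[j][i] = 1 / m[i][j] and the diagonal of ones are derived
    n = len(metrics)
    idx = {m: p for p, m in enumerate(metrics)}
    M = [[Fraction(1)] * n for _ in range(n)]
    for (i, j), v in judgements.items():
        M[idx[i]][idx[j]] = Fraction(v)
        M[idx[j]][idx[i]] = 1 / Fraction(v)
    return M
```

as a usage example matching the first row of eq. 4, the number of stalls f is declared twice as significant as the stall duration e and three times as important as the delay d; the entries e-vs-f and d-vs-f then come out as 1/2 and 1/3.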
the influence factor is given by the sum of the products of the current value of each metric p_{i,k}, i.e., f, e, d, c_k, with the relative importance of the other metrics, as shown in eq. 5. for example, if the values in p_k are f = 1, e = 2, d = 15, and c_k = 1, the influence factor of the delay metric would be 15 × (1 × (1/3) + 2 × (2/3) + 15 × 1 + 1 × 2), based on the third row of eq. 4. the influence factor of each metric serves as input for the score s_k of the current conditions in each fog node k, which is given in eq. 6. as each video may have different qoe requirements, we consider a weight matrix w to give different priorities to each video stream in v. in this sense, each column in w holds the weights given to a video, as each type of video has different characteristics and needs specific management. in this case, different weights were assigned to each video type, as shown in eq. 7. the decision matrix dm combines each video weight in w with the score s_k of each fog node, based on eq. 8. the parameters in the matrix dm show significant variation, which lowers the accuracy of the decision analysis. to decrease the discrepancy between the values of dm, we normalize every parameter of dm using the arithmetic average dm_k of the values of column k. the calculation takes the difference between a given tier and the average over all tiers, parameter by parameter, as shown in eq. 9. in the end, we have the normalized matrix η_{v,k} with the same dimensions as dm. afterwards, we measure the euclidean distance ξ between the attributes of the fog node chosen at t−1 and the current conditions of the other fog nodes within the overlapping regions η_{v−1,k−1}, based on eq. 10. we then select the fog node with the highest ξ value. based on this higher value of ξ, the fog4video mechanism recognizes the potential of a fog node to stream video content to the client while meeting qoe requirements and considering the cost.
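the final selection step (eq. 10) can be sketched as follows. the vector layout and names are assumptions, since eq. 10 is not reproduced in the text: each candidate carries a vector of normalized attributes, the euclidean distance ξ to the attributes of the node chosen at t−1 is computed, and the candidate with the largest ξ wins.

```python
import math

def select_fog_node(candidates, previous_attrs):
    # candidates: mapping k -> normalized attribute vector (one entry per metric);
    # previous_attrs: normalized attributes of the fog node chosen at time t-1
    best_node, best_xi = None, -1.0
    for k, attrs in candidates.items():
        xi = math.dist(previous_attrs, attrs)  # euclidean distance (assumed form of eq. 10)
        if xi > best_xi:
            best_node, best_xi = k, xi
    return best_node
```

with previous attributes at the origin, a candidate at (3, 4) (distance 5) beats one at (0.1, 0.2).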
in this sense, the fog4video mechanism informs its decision, via the cloud controller, to the fog nodes and the client, detailing from which streaming unit the client must request the given chunk. for the multi-tier scenario, each fog node embeds a set of modules and components to assist the content orchestrator. fog4video defines one phase for analysis and another for decision-making & execution to support dynamic content orchestration in real-time. in the latter phase, the ahp method balances the multi-criteria inputs and executes the decision-making. this section describes the evaluation methodology, including the scenario description, the simulation parameters, and the metrics used to evaluate the qoe of videos delivered by different content orchestrator mechanisms. we define the scenario and simulation parameters in section 4.1. we discuss the results and findings of the proposal in section 4.2. we implemented fog4video and the scenario shown in figure 2 in the ns-3.29 simulator, and the implementation is available for download at [16]. ns-3.29 implements the protocol stack for communication between the mobile device and the network infrastructure to reach the vod service provider. for the wired infrastructure, we considered the partial topology of the fibre testbed [19] to set the delays of the long-haul communication between fog nodes and the cloud. we distributed the multi-tier fog nodes into three fog tiers and one cloud level. in tier 3, there is an edge server with a wi-fi ap in a local datacenter with a resource availability of 10 mbps for transcoding. in tiers 1 and 2, there is a regional datacenter each, with a resource availability of 20 mbps for transcoding. in the cloud, there is a powerful datacenter with a resource availability of 100 mbps for transcoding. figure 2 shows the delay of the long-haul communication between the cloud and the tiers.
for the wireless infrastructure, we consider wi-fi 802.11n aps with channel bonding of 40 mhz in the center of a square area of 50 m^2, providing access to 40 randomly distributed mobile devices. the mobile devices followed a linear continuous video request rate of 10 requests per second. by default, the video streaming initiates from the cloud node while the orchestrator becomes aware of the qoe/qos metrics of each client, such as stalling and delay. the adaptation algorithm is rate-based: the bit rate starts from the lowest value and, to smooth transitions between quality levels, switches one level at a time following a conservative bit rate switching profile. the video player buffers at least 2000 ms, which is the duration of a chunk. therefore, as soon as the buffer has no content to render (i.e., a stalling event), it has to re-buffer a complete chunk before playing the video again. we considered the big buck bunny, sunflower version video downloaded from the video library [33]. the video player at the client requests a video at a given time. precisely, we use a high definition video with a duration of 600 seconds, configured with 30 frames per second, and encoded into eight commonly used bitrates of 400, 650, 1000, 1500, 2250, 3400, 4700 and 6000 kbps [19], as shown in table 4. qoe metrics overcome the limitation of qos metrics in capturing aspects of vod related to human perception [34]. in this way, we apply well-known qoe metrics for vod services, namely bit rate, bit rate switch events, and the number and duration of stalls [2]. due to the conservative behavior of the adaptation algorithm, we compute the initial bitrate over the first 20 chunks (since the first chunk always starts with the lowest bitrate), the final bitrate over the last 20 chunks, and the average bitrate over all chunks. depending on the content orchestrator, a fog node chosen with the best qoe and cost improvement potential can reply to most of the requests.
we consider the jain fairness index f [35] to express the concentration of requests and measure request fairness between fog nodes for each mechanism. the index calculation is given in eq. 11, where x_i denotes the number of requests at the i-th fog node. we also evaluate the cost c_k to deploy vod services in a given fog node k, which is computed based on eq. 2. the cost c_k for a given fog node k depends on the amount of time the streaming unit or transcoding unit uses resources to process a chunk, incurring a monetary cost m_k per hour of usage. in this way, we computed the monetary cost of cpu time per hour based on the amazon web services cost of ownership calculator. the cost of cpu time per hour decreases proportionally when renting a higher number of cpu cores in the same aws region. we considered the deployment into four regions and three amazon instance setups; the monetary cost of each cpu core per hour, as well as memory and storage, are shown in table 5. we conducted 33 simulations for each of the three different fog content orchestrator mechanisms, namely, random, greedy, and fog4video. then, we analyzed their impact on delivering vod content with qoe support and provide a 95% confidence interval. all mechanisms take the resource availability of the fog nodes into account. the random strategy chooses among all fog nodes with equal probability. the greedy mechanism selects the fog node with the smallest delay. fog4video evaluates the collected metrics to choose the best serving fog node, as explained in section 3. figure 3 shows the number of clients per tier for each individual chunk, i.e., tier 3, tier 2, tier 1, and cloud. this analysis provides information about the fog node selection behavior of each content orchestrator and from which tier the chunks were requested along the video playback. by analyzing the results, it is possible to conclude that all the clients start requesting the video from the cloud.
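eq. 11 is the standard jain fairness index, (Σ x_i)² / (n · Σ x_i²); a minimal sketch, with an illustrative function name:

```python
def jain_fairness(requests):
    # requests: x_i = number of requests served by each fog node;
    # the index is 1 for a perfectly even distribution and 1/n when a
    # single node serves everything
    n = len(requests)
    return sum(requests) ** 2 / (n * sum(x * x for x in requests))
```

for example, four nodes serving 10 requests each score 1.0, while one node serving all 40 scores 0.25.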
afterwards, each mechanism selects the streaming unit from which the client downloads the video in different ways. for instance, the random strategy selects each tier with a 25% probability, since it is one tier among four candidates. on the other hand, the greedy mechanism prefers nodes with lower latency but picks the remaining tiers as dictated by the resource availability in each tier. finally, fog4video selects the appropriate streaming unit based on network, fog node, user information, and cost, leading to higher use of tier 2 and the cloud. tier 2 has the same cost as tier 1, but the delay is better in tier 2, leading to a more significant share. tier 3 usage grows gradually before the 15th chunk because the bitrates of all clients are small in the beginning. when the bitrate grows, the clients move to other tiers capable of adapting the content as requested. in the last 15 chunks, the downloads of some clients finish, and other clients start to request from more cost-effective tiers. figure 4 depicts the costs c_k of vod service deployment for the different content orchestrator mechanisms. by analyzing the costs, it is possible to conclude that fog4video reduces the cost by up to 24.04% and 16.32% compared to greedy and random, respectively. fog4video provides lower costs because, unlike the other mechanisms, it selects closer and more expensive tiers only when poor qoe is detected. the greedy mechanism decides for the nodes with the lowest delays, which represent a closer distance between fog nodes and clients; however, this decision incurs higher costs. the random strategy has lower costs because of the lower number of clients requesting from the more expensive tiers.
(figure 4: cost to deploy vod services.) figure 5 shows the initial, final, and average bitrates received by the clients downloading the video via the different content orchestrator mechanisms, i.e., fog4video, greedy, and random. the video starts with the lowest bitrate, i.e., 400 kbps, regardless of the mechanism. we can also see that fog4video delivered a final bitrate up to 19.3% and 27.9% higher than greedy and random, respectively. the higher bitrate occurs because fog4video chooses the best streaming unit based on metrics like delay combined with qoe. the lower delay allows the adaptation algorithm to better predict the network conditions between the client and the fog node, with more frequent and updated information. moreover, fog4video provides more fairness between the clients, giving a better opportunity to clients with worse qoe indicators, as shown in figure 6. this fairness gives the clients more room to increase their bitrate and thus to reach a better bitrate than the clients using the greedy and random mechanisms. however, ahp needs around 15 chunks to evaluate the performance of each tier, resulting in a lower initial bitrate for fog4video. finally, the average bitrate delivered by fog4video is 30.91% and 35.05% higher than that provided by the greedy and random mechanisms, respectively. the average is higher because fog4video needs only a short period to adapt and converge. in this case, the clients can request better bitrates earlier than those using greedy and random, giving them a better overall bitrate. (figure 5: bitrate.) figure 6 shows jain's fairness index for the distribution of clients over fog nodes, computed by eq. 11. the index shows the concentration of requests produced by each mechanism. the random mechanism is the fairest because the probability of choosing a tier is equal for all of them. however, this performance does not translate into better qoe or cost-effective results.
in this sense, fog4video offers the best tradeoff between application performance and fairness, achieving a high score on the fairness index while cost-effectively improving qoe. fog4video decides to allocate requests to the fog nodes with the most potential to enhance qoe, and the fairness holds because of the usage of cheaper fog nodes. moreover, the greedy mechanism has a worse performance because it concentrates the requests on the closest servers, even though this does not necessarily translate into better qoe. figure 7 shows the number of stalls and their duration per client for videos delivered by each content orchestrator mechanism. fog4video reduces the number of stall events by 71.44% and 71.18% compared to the greedy and random mechanisms, respectively. moreover, fog4video reduces the duration of stall events by 72.45% and 65.23% compared to the greedy and random mechanisms, respectively. (figure 6: fairness index.) these metrics have a significant influence on qoe, where high values would most likely make the viewer leave the video service. an interruption is a direct consequence of buffer starvation at the player, which is caused by poor network conditions, i.e., a long delay, between the client and the streaming unit. by analyzing the results, we can see that fog4video delivered videos to the 40 clients with less than one stall per client. for example, around 16 clients experienced a single stall during video playback, which lasted about 0.68 seconds. the reduced number of stalls happens because fog4video proactively selects the best streaming unit based on network, fog node, and user information. the qoe metrics played an essential role in identifying how a fog node can potentially improve the user's satisfaction. on the other hand, the greedy and random mechanisms selected the streaming unit without considering such metrics, which does not avoid overloaded streaming units for video delivery.
in this article, we introduced a multi-tier content orchestrator mechanism, called fog4video, to provide qoe support for vod services. it chooses an appropriate fog node considering network, fog node, and user information. fog4video is executed on fog nodes organized in multiple tiers between the cloud (at the top) and the mobile devices (at the bottom) to provide vod services. fog4video provided better qoe in a vod use case compared to delay-based mechanisms. fog4video considers available bandwidth, delay, number of stalls, stall duration, and the cost to deploy vod services in a given tier. the information from fog nodes and clients served as input for the ahp method to compute the influence factor of each parameter. from this, fog4video properly decided from which fog tier a better vod provision with a lower overall cost can be achieved. from our evaluation analysis, we identified that fog4video delivered videos with up to 30% qoe improvement compared to the other content orchestrator mechanisms. the number of stall events was reduced by up to 70%, and the stall duration by up to 65%. these results are an essential achievement of fog4video, since stall events and stall duration are among the most detrimental factors that affect user perception. fog4video also improved the average bitrate by up to 35% and reduced the monetary cost by up to 24% compared to the other content orchestrator mechanisms. hence, the simulation results show that fog4video assures sufficient qoe and user satisfaction in the vod use case. for future work, the content orchestrator mechanism can consider the migration of content to the fog nodes. in this way, prefetching the most popular videos and their representations would imply storage and transmission costs between the caches and links of the fog nodes. these costs could guide the selection of the best fog nodes to cache more representations of the same video and avoid frequent transcoding.
besides saving transcoding costs, fewer redundant transmissions in the backhaul could provide better qoe. moreover, we can consider additional qos metrics to collect more accurate information about network conditions, such as jitter and packet loss.

references:
[1] visual networking index: forecast and trends
[2] measurement of quality of experience of video-on-demand services: a survey
[3] a hybrid energy-aware video bitrate adaptation algorithm for mobile networks
[4] optical/ip - network operators brace for covid-19 traffic spikes
[5] potentials, trends, and prospects in edge technologies: fog, cloudlet, mobile edge, and micro data centers
[6] architectural imperatives for fog computing: use cases, requirements, and architectural techniques for fog-enabled iot networks
[7] edge caching with mobility prediction in virtualized lte mobile networks
[8] a survey on mobile edge computing: the communication perspective
[9] dynamic request redirection and elastic service scaling in cloud-centric media networks
[10] dynamic request redirection and resource provisioning for cloud-based video services under heterogeneous environment
[11] optimal content placement and request dispatching for cloud-based video distribution services
[12] towards qoe-assured 4k video-on-demand delivery through mobile edge virtualization with adaptive prefetching
[13] storage, communication, and load balancing trade-off in distributed cache networks
[14] ads: adaptive and dynamic scaling mechanism for multimedia conferencing services in the cloud
[15] optimal media service selection scheme for mobile users in mobile cloud
[16] fog4video repository - available after publication acceptance
[17] measuring the quality of experience of http video streaming
[18] integrating fog computing with vanets: a consumer perspective
[19] service migration from cloud to multi-tier fog nodes for multimedia dissemination with qoe support
[20] improving quality of experience in future wireless access networks through fog computing
[21] video delivery in heterogenous crans: architectures and strategies
[22] cache in the air: exploiting content caching and delivery techniques for 5g systems
[23] enhancing mobile video capacity and quality using rate adaptation, ran caching and processing
[24] fog computing may help to save energy in cloud computing
[25] opportunistic mobile sensing in the fog
[26] dynamic service migration in mobile edge-clouds
[27] fog-based transcoding for crowdsourced video livecast
[28] realtime streaming with guaranteed qos over wireless d2d networks
[29] using software defined networking to enhance the delivery of video-on-demand
[30] initial delay vs. interruptions: between the devil and the deep blue sea
[31] the effect of frame freezing and frame skipping on video quality
[32] analytic hierarchy process
[33] big buck bunny, sunflower version, accessed date
[34] quality of experience management in mobile cellular networks: key issues and design challenges
[35] a quantitative measure of fairness and discrimination

author contributions: conceptualization, methodology, software, formal analysis, data curation, writing - original draft; derian alencar: software, validation, investigation, visualization; rodolfo meneguette: methodology, software; denis rosário: conceptualization, writing - review & editing; jéferson nobre: writing - review & editing; cristiano both: writing - review & editing; eduardo cerqueira: resources, supervision; torsten braun: writing - review & editing. his research interests are in the field of fog computing, multi-access edge computing, vehicular networks, video-on-demand, virtual reality and quality of experience. rodolfo meneguette is a professor at the federal technology institute. he received his bachelor's degree in computer science from the paulista university (unip), brazil (ufpa) in brazil, as well as invited researcher at the network research lab at ucla, usa, and the centre for informatics and systems of the university of coimbra, portugal. his publications include 5 edited books, 5 book chapters, 4 patents and more than 180 papers in national/international refereed journals/conferences. he has been serving as a guest editor for 6 special issues of various peer-reviewed scholarly journals. his research involves multimedia, future internet, quality of experience, mobility and ubiquitous computing.

acknowledgments: this study was financed in part by the coordenação de aperfeiçoamento de pessoal de nível superior - brasil (capes) - finance code 001. we also thank the national council for scientific and technological development (cnpq) for the financial support through grant 431474/2016-8. the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

key: cord-269711-tw5armh8 authors: ma, junling; van den driessche, p.; willeboordse, frederick h. title: the importance of contact network topology for the success of vaccination strategies date: 2013-05-21 journal: journal of theoretical biology doi: 10.1016/j.jtbi.2013.01.006 sha: doc_id: 269711 cord_uid: tw5armh8 abstract: the effects of a number of vaccination strategies on the spread of an sir type disease are numerically investigated for several common network topologies including random, scale-free, small world, and meta-random networks. these strategies, namely, prioritized, random, follow links and contact tracing, are compared across networks using extensive simulations with disease parameters relevant for viruses such as pandemic influenza h1n1/09. two scenarios for a network sir model are considered. first, a model with a given transmission rate is studied. second, a model with a given initial growth rate is considered, because the initial growth rate is commonly used to impute the transmission rate from incidence curves and to predict the course of an epidemic. since a vaccine may not be readily available for a new virus, the case of a delay in the start of vaccination is also considered in addition to the case of no delay.
it is found that network topology can have a larger impact on the spread of the disease than the choice of vaccination strategy. simulations also show that the network structure has a large effect on both the course of an epidemic and the determination of the transmission rate from the initial growth rate. the effect of delay in the vaccination start time varies tremendously with network topology. results show that, without knowledge of the network topology, predictions on the peak and the final size of an epidemic cannot be made solely based on the initial exponential growth rate or transmission rate. this demonstrates the importance of understanding the topology of realistic contact networks when evaluating vaccination strategies.

the importance of contact network topology for the success of vaccination strategies

for many viral diseases, vaccination forms the cornerstone in managing their spread, and the question naturally arises as to which vaccination strategy is, given practical constraints, the most effective in stopping the disease spread. for evaluating the effectiveness of a vaccination strategy, it is necessary to have as precise a model as possible for the disease dynamics. the widely studied key reference models for infectious disease epidemics are the homogeneous mixing models, where any member of the population can infect or be infected by any other member of the population; see, for example, anderson and may (1991) and brauer (2008). the advantage of a homogeneous mixing model is that it lends itself relatively well to analysis and therefore is a good starting point. due to the homogeneity assumption, these models predict that the fraction of the population that needs to be vaccinated to curtail an epidemic is equal to 1 − 1/r_0, where r_0 is the basic reproduction number (the average number of secondary infections caused by a typical infectious individual in a fully susceptible population).
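the homogeneous-mixing vaccination threshold 1 − 1/r_0 stated above can be computed directly; a minimal sketch, with an illustrative function name:

```python
def critical_vaccination_fraction(r0):
    # homogeneous mixing models predict that vaccinating a fraction 1 - 1/R0
    # of the population curtails the epidemic; if R0 <= 1 the disease dies out
    # on its own and no vaccination is needed
    if r0 <= 1:
        return 0.0
    return 1 - 1 / r0
```

for example, with r_0 = 2 half the population must be vaccinated, and with r_0 = 4 three quarters.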
however, the homogeneous mixing assumption poorly reflects the actual interactions within a population, since, for example, school children and office co-workers spend significant amounts of time in close proximity and therefore are much more likely to infect each other than an elderly person who mostly stays at home. consequently, efforts have been made to incorporate the network structure into models, where individuals are represented by nodes and contacts are represented by edges. in the context of the severe acute respiratory syndrome (sars), it was shown by meyers et al. (2005) that the incorporation of contact networks may yield different epidemic outcomes even for the same basic reproduction number r_0. for pandemic influenza h1n1/09, pourbohloul et al. (2009) and davoudi et al. (2012) used network theory to obtain a real-time estimate for r_0. numerical simulations have shown that different networks can yield distinct disease spread patterns; see, for example, bansal et al. (2007), miller et al. (2012), and section 7.6 in keeling and rohani (2008). to illustrate this difference for the networks and parameters we use, the effect of different networks on disease dynamics is shown in fig. 1. descriptions of these networks are given in section 2 and appendix b. at the current stage, most theoretical network infectious disease models incorporate idealized (from a real-world perspective) random network structures such as regular (all nodes have the same degree), erdős–rényi, or scale-free random networks, where clustering and spatial structures are absent. for example, volz (2008) used a generating function formalism (an alternate derivation with a simpler system of equations was recently found by miller, 2011), while we used the degree distribution in the effective degree model presented in lindquist et al. (2011). in these models, the degree distribution is the key network characteristic for disease dynamics.
from recent efforts on incorporating degree correlation and clustering (such as households and offices) into epidemic models (ma et al., 2013; volz et al., 2011; moreno et al., 2003), it has been found that these may significantly affect the disease dynamics for networks with identical degree distributions. fig. 2 shows disease dynamics on networks with identical degree distributions and disease parameters, but with different network topologies. clearly, reliable predictions of the epidemic process that only use the degree distribution are not possible without knowledge of the network topology. such predictions need to be checked by considering other topological properties of the network. network models allow more precise modeling of control measures that depend on the contact structure of the population, such as priority-based vaccination and contact tracing. for example, shaban et al. (2008) consider a random graph with a pre-specified degree distribution to investigate vaccination models using contact tracing. kiss et al. (2006) compared the efficacy of contact tracing on random and scale-free networks and found that, for transmission rates greater than a certain threshold, the final epidemic size is smaller on a scale-free network than on a corresponding random network, while they considered the effects of degree correlations in kiss et al. (2008). cohen et al. (2003) (see also madar et al., 2004) considered different vaccination strategies on scale-free networks and found that acquaintance immunization is remarkably effective. miller and hyman (2007) considered several vaccination strategies on a simulation of the population of portland, oregon, usa, and found it most effective to vaccinate nodes with the most unvaccinated susceptible contacts, although they noted that this strategy may not be practical because it requires considerable computational resources and information about the network. bansal et al.
(2006) took a contact network built from data for vancouver, bc, canada, considered two vaccination strategies, namely mortality- and morbidity-based, investigated the detrimental effect of vaccination delays, and found that, on realistic contact networks, vaccination strategies based on detailed network topology information generally outperform random vaccination. however, in most cases, contact network topologies are not readily available. thus, how different network topologies affect various vaccination strategies remains of considerable interest.

to address this question, we explore two scenarios to compare the percentage reduction in the final size of epidemics achieved by vaccination across various network topologies. first, various network topologies are considered with the disease parameters held constant, assuming that these have been independently estimated. second, different network topologies are fitted to the observed incidence curve (the number of new infections each day), so that their disease parameters differ yet they all line up with the same initial exponential growth phase of the epidemic. vaccines are likely lacking at the outbreak of an emerging infectious disease (as seen in the 2009 h1n1 pandemic; conway et al., 2011), and thus can only be given after the disease is already widespread. we investigate numerically whether network topologies affect the effectiveness of vaccination strategies started with a delay after the disease is widespread; for example, a 40 day delay as in the second wave of the 2009 influenza pandemic in british columbia, canada (office of the provincial health officer, 2010). details of our numerical simulations are given in appendix a.

this paper is structured as follows. in section 2, a brief overview of the networks and vaccination strategies is given (more details are provided in appendices b and c).
in section 3, we investigate the scenario where the transmission rate is fixed, while in section 4 we investigate the scenario where the growth rate of the incidence curve is fixed. to this end, we compute the incidence curves and the reductions in final sizes (the total number of infections during the course of the epidemic) due to vaccination. for the homogeneous mixing model, these scenarios are identical (ma and earn, 2006), but, as will be shown, they are completely different once topology is taken into account. we end with conclusions in section 5.

[fig. 1 caption: on all networks, the average degree is 5, the population size is 200,000, the transmission rate is 0.06, the recovery rate is 0.2, and the initial number of infectious individuals is set to 100. both graphs represent the same data, but the left graph has a semi-log scale (highlighting the growth phase) while the right graph has a linear scale (highlighting the peak).]

[fig. 2 caption: disease dynamics on networks with identical disease parameters and degree distribution. the network topologies are the random, meta-random, and near neighbor networks; see appendix b for details of the construction of these networks.]

detailed network topologies for human populations are far from known. however, this detailed knowledge may not be required when the main objective is to assess the impact that topology has on the spread of a disease and on the effects of vaccination. it may be sufficient to consider a number of representative network topologies that, at least to some extent, can be found in the actual population. here, we consider the four topologies listed in table 1, which we now briefly describe. in the random network, nodes are connected with equal probability, yielding a poisson degree distribution. in a scale-free network, a small number of nodes have a very large number of links and a large number of nodes have a small number of links, such that the degree distribution follows a power law.
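as a concrete illustration of the random network just described, the basic erdős-rényi construction links every pair of nodes with equal probability p = ⟨k⟩/(n - 1), which yields an approximately poisson degree distribution. this sketch is simplified relative to the appendix-b procedure (which additionally avoids orphaned nodes and caps the degree at m), and the names are ours:

```python
import random

def erdos_renyi(n, p, rng):
    """Link every pair of nodes independently with probability p."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

rng = random.Random(0)
n, k_avg = 2000, 5.0
adj = erdos_renyi(n, k_avg / (n - 1), rng)
degrees = [len(neighbors) for neighbors in adj]
mean_degree = sum(degrees) / n   # close to the target <k> = 5
```

by construction the graph is undirected with no self loops or multiple links, matching the conventions stated for all networks in the paper.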
small world (sw) networks are constructed by adding links between randomly chosen nodes on networks in which nodes are connected to their nearest neighbors. the last network considered is what we term a meta-random network, where random networks of various sizes are connected by a small number of interlinks. all networks are undirected with no self loops or multiple links. the degree histograms of the networks are shown in table 2, and the details of their construction are given in appendix b.

the vaccination strategies considered are summarized in table 3. in the random strategy, an eligible node is randomly chosen and vaccinated. in the prioritized strategy, nodes with the highest degrees are vaccinated first. in the follow links strategy, inspired by notions from social networks, a randomly chosen susceptible node is vaccinated, then all its neighbors, then its neighbors' neighbors, and so on. finally, in contact tracing, the neighbors of infectious nodes are vaccinated. for all the strategies, vaccination is voluntary and quantity limited; that is, only susceptibles who do not refuse vaccination are vaccinated, and each day only a certain number of doses is available. in the case of (relatively) new viral diseases, the supply of vaccines will almost certainly be constrained, as was the case for the pandemic influenza h1n1/09 virus. also, in the case of mass vaccinations, there will be resource limitations on how many doses can be administered per day. the report (office of the provincial health officer, 2010) states that the vaccination program was prioritized and that it took 3 weeks before the general population had access to vaccination. thus we assume that a vaccination program can be completed in 4-6 weeks, or about 40 days; for a population of 200,000, this means that a maximum of 5000 doses a day can be used.
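the dose budget above follows from simple arithmetic: covering a population of 200,000 in about 40 days requires 5000 doses per day. a small helper (hypothetical, not from the paper) makes the relation explicit:

```python
def days_to_cover(population, daily_cap):
    """Days needed to dispense one dose to everyone at daily_cap doses/day."""
    days, dispensed = 0, 0
    while dispensed < population:
        # Dispense up to the daily cap, never more than the remaining people.
        dispensed += min(daily_cap, population - dispensed)
        days += 1
    return days
```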
for each strategy, for each time unit, first a group of eligible nodes is identified and then up to the maximum number of doses is dispensed among the eligible nodes according to the strategy chosen. more details of the vaccination strategies and their motivations are given in appendix c.

to study the effect of delayed availability of vaccines during an emerging infectious disease, we compare the effect of vaccination programs starting on the first day of the epidemic with those starting on later days, ranging from 5 to 150 days after the start of the epidemic, with an emphasis on the 40 day delay that occurred in british columbia, canada, during the influenza h1n1/2009 pandemic. when a node is vaccinated, the vaccination is considered to be ineffective in 30% of the cases (bansal et al., 2006); in such cases, the vaccine provides no immunity at all. for the 70% of the nodes for which the vaccine is effective, a two week span to reach full immunity is assumed (clark et al., 2009). during these two weeks, we assume that the immunity increases linearly from 0 at the time of vaccination to 100% after 14 days.

the effect of vaccination strategies has been studied (see, for example, conway et al., 2011) using disease parameter values estimated in the literature; however, network topologies were not the focus of these studies. in section 3, the effect of vaccination strategies on various network topologies is compared with a fixed per-link transmission rate. the per-link transmission rate b is difficult to obtain directly and is usually derived as a secondary quantity. to determine b, we pick the basic reproduction number r 0 = 1.5 and the recovery rate g = 0.2, which are close to those of the influenza a h1n1/09 virus; see, for example, pourbohloul et al. (2009) and tuite et al. (2010).
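the vaccine model described above can be sketched as a simple function (the function name and signature are ours): a dose is ineffective with probability 0.3 and confers no immunity at all, while an effective dose builds immunity linearly from 0 to 100% over 14 days:

```python
def immunity(days_since_dose, dose_effective, ramp_days=14):
    """Immunity level in [0, 1] for a vaccinated node.

    dose_effective is False for the ~30% of doses that never take effect;
    effective doses build immunity linearly over ramp_days (14 days here).
    """
    if not dose_effective:
        return 0.0
    return min(days_since_dose / ramp_days, 1.0)
```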
in the case of the homogeneous mixing sir model, the basic reproduction number is given by r 0 = t/g, where t is the per-node transmission rate. our parameter values yield t = 0.3. for networks, t = b⟨k⟩. with the assumption that the average degree ⟨k⟩ = 5, the above gives the per-link transmission rate b = 0.06. the key parameters are summarized in table 4. in section 3, we use this transmission rate to compare the incidence curves for the networks in table 1 under the vaccination strategies in table 3.

[table 1 caption: illustration of the different types of networks used in this paper (columns: scale-free, small world, meta-random).]

[table 2 caption: degree histograms of the networks in table 1 with 200,000 nodes (columns: scale-free, small world, meta-random).]

some of the most readily available data in an epidemic are the numbers of reported new cases per day. these generally display exponential growth in the initial phase of an epidemic, and a suitable model therefore needs to match this initial growth pattern. exponential growth rates are commonly used to estimate disease parameters (chowell et al., 2007; lipsitch et al., 2003). in section 4, we consider the effects of various network topologies on the effectiveness of vaccination strategies for epidemics with a fixed exponential growth rate. the basic reproduction number r 0 = 1.5 and the recovery rate g = 0.2 yield an exponential growth rate l = t - g = 0.1 for the homogeneous mixing sir model. we tune the transmission rate for each network topology to give this initial growth rate.

in this section, the effectiveness of vaccination strategies on various network topologies is investigated for a given set of parameters, identical for all the simulations. the values of the disease parameters are chosen based on what is known about influenza h1n1/09. qualitatively, these chosen parameters should provide substantial insight into the effects topology has on the spread of a disease.
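the parameter chain used here can be checked directly: r 0 = t/g gives the per-node rate t, t = b⟨k⟩ gives the per-link rate b, and l = t - g is the homogeneous mixing growth rate used in section 4. the variable names below are ours:

```python
r0 = 1.5          # basic reproduction number
g = 0.2           # recovery rate (mean infectious period of 5 days)
k_avg = 5         # average degree <k>

t = r0 * g        # per-node transmission rate, from r0 = t / g
b = t / k_avg     # per-link transmission rate, from t = b * <k>
l = t - g         # initial exponential growth rate, l = (r0 - 1) * g
```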
unless indicated otherwise, the parameter values listed in table 4 are used. the effects of the vaccination strategies summarized in table 3, when applied without delay, are shown in fig. 3. for reference, fig. 1 shows the incidence curves with no vaccination. since the disease dies out in the small world network (see fig. 1), vaccination is not needed in this network for the parameter values taken. especially in the cases of the random and meta-random networks, the effects of vaccination are drastic, while for the scale-free network they are still considerable. what is particularly notable when comparing the various outcomes is that topology has as great an impact on the epidemic as the vaccination strategy, if not a greater one.

besides the incidence curves, the final sizes of epidemics and the effect vaccination has on these are also of great importance. table 5 shows the final sizes and the reductions in final sizes for the various networks on which the disease can survive (for the chosen parameter values) under the vaccination strategies, for the cases where there is no delay in the vaccination. fig. 4 and table 6 show the incidence curves and the reductions in final sizes for the same parameters as used in fig. 3 and table 5, but with a delay of 40 days in the vaccination. as can be expected for the given parameters, a delay has the biggest effect on the scale-free network: in that case, the epidemic is already past its peak and vaccinations have only a minor effect. for the random and meta-random networks, the effect of vaccination remains considerable.

[table 3 caption: illustration of vaccination strategies. susceptible nodes are depicted by triangles, infectious nodes by squares, and vaccinated nodes by circles. the average degree in these illustrations has been reduced to aid clarity. the starting point for contact tracing is labeled a, while the starting point for the follow links strategy is labeled b. the number of doses dispensed in this illustration is 3.]
[table caption: random, follow links, contact tracing; the vaccination strategies in table 3 for the network topologies in table 1, given a fixed transmission rate b. there is no delay in the vaccination, and parameters are equal to those used in fig. 1.]

to further investigate the effects of delay in the case of random vaccination, we compute reductions in final sizes for delays of 5, 10, 15, ..., 150 days on random, scale-free, and meta-random networks. fig. 5 shows that, not surprisingly, these reductions diminish with longer delays. however, the reductions are strongly network dependent: on a scale-free network, the reduction becomes negligible as the delay approaches the epidemic peak time, while on random and meta-random networks, the reduction is about 40% when the delay equals the epidemic peak time. this section clearly shows that, given a certain transmission rate b, the effectiveness of a vaccination strategy is impossible to predict without reliable data on the network topology of the population.

next, we consider the case where, instead of the transmission rate, the initial growth rate is given. we line up the incidence curves on various network topologies with the growth rate l predicted by a homogeneous mixing sir model with basic reproduction number r 0 = 1.5 and recovery rate g = 0.2 (in this case, the exponential growth rate is l = (r 0 - 1)g = 0.1). table 7 summarizes the transmission rates that yield this exponential growth rate on the corresponding network topologies. the initial number of infectious individuals for the model on each network topology needs to be adjusted as well, so that the curves line up along the homogeneous mixing sir incidence curve for 25 days. as can be seen from the table, the variations in the parameters are indeed very large, with the transmission rate for the small world network being nearly 8 times that for the scale-free network. the incidence curves corresponding to the parameters in table 7 are shown in fig. 6.
as can clearly be seen, for these parameters the curves overlap very well for the first 25 days, confirming the desired identical initial growth rates. however, it is also clear that the curves diverge strongly later on, with the epidemic on the small world network being the most severe. these results show that the spread of an epidemic cannot be predicted on the basis of a good estimate of the growth rate alone. in addition, comparing figs. 1 and 6, a higher transmission rate yields a much larger final size and a longer epidemic on the meta-random network.

the effects of the various vaccination strategies for the case of a given growth rate are shown in fig. 7. given the large differences in the transmission rates, it may be expected that the final sizes show significant differences as well. this is indeed the case, as can be seen in table 8, which shows the percentage reduction in final sizes for the various vaccination strategies. with no vaccination, the final size on the small world network is more than 3 times that on the scale-free network, but for all except the follow links vaccination strategy the percentage reduction on the small world network is greater. the effects of a 40-day delay in the start of the vaccination are shown in fig. 8 and table 9; besides the delay, all the parameters are identical to those in fig. 7 and table 8. the delay has the largest effect on the final sizes of the small world network, increasing them by a factor of 20-30 except in the follow links case. on a scale-free network, the delay renders all vaccination strategies nearly ineffective. these results confirm the importance of network topology in disease spread even when the incidence curves have identical initial growth: the initial stages of an epidemic are insufficient to estimate the effectiveness of a vaccination strategy in reducing the peak or final size of an epidemic.
the relative importance of network topology for the predictability of incidence curves was investigated. this was done by considering whether the effectiveness of several vaccination strategies is impacted by topology, and whether the growth in the daily incidences has a network-topology-independent relation with the disease transmission rate. it was found that, without fairly detailed knowledge of the network topology, initial data cannot predict epidemic progression. this is so both for a given transmission rate b and for a given growth rate l.

for a fixed transmission rate, and thus a fixed per-link transmission probability, a disease spreading on networks with a fixed average degree spreads fastest on scale-free networks, because high degree nodes have a very high probability of becoming infected early as the epidemic progresses. in turn, once a high degree node is infected, on average it passes the infection on to a large number of neighbors. the random and meta-random networks show identical initial growth rates because they have the same local network topology.

[fig. 6 caption: incidence curves on the networks in table 1 without vaccination for the case where the initial growth rate is given. the transmission rates and initial numbers of infections for the various network topologies are given in table 7, while the remaining parameters are the same as in fig. 1.]

[fig. 7 caption: the effects of the vaccination strategies for different topologies when the initial growth rate is given. the transmission rates b are as indicated in table 7, while the remaining parameters are identical to those in fig. 6.]

on different network topologies, diseases respond differently to parameter changes. for example, on the random network, a higher transmission rate yields a much shorter epidemic, whereas on the meta-random network it yields a longer one with a more drastic increase in final size. these differences are caused by the spatial structures in the meta-random network.
considering that a meta-random network is a random network of random networks, it is likely that the meta-random network represents a general population better than a random network does. for a fixed exponential growth rate, the transmission rate needed on the scale-free network to yield the given initial growth rate is the smallest, being about half that of the random and meta-random networks. hence, the per-link transmission probability is lowest on the scale-free network, which in turn yields a small epidemic final size.

for different network topologies, we quantified the effect of delay in the start of vaccination. we found that the effectiveness of vaccination strategies decreases with delay at a rate strongly dependent on network topology. this emphasizes the importance of knowledge of the topology for formulating a practical vaccination schedule. with respect to policy, the results presented seem to warrant a significant effort to obtain a better understanding of how the members of a population are actually linked together in a social network. consequently, policy advice based on rough estimates of the network structure should be viewed with caution.

this work is partially supported by nserc discovery grants (jm, pvdd) and mprime (pvdd). we thank the anonymous reviewers for their constructive comments.

the nodes in the network are labeled by their infectious status: susceptible, infectious, vaccinated, immune, refusing vaccination (but susceptible), or vaccinated but susceptible (the vaccine is not working). the stochastic simulation is initialized by first labeling all the nodes as susceptible and then randomly labeling i 0 nodes as infectious. then, before the simulation starts, 50% of the susceptible nodes are labeled as refusing vaccination but susceptible. during the simulation, when a node is vaccinated, the vaccine has a probability of 30% of being ineffective.
if it is not effective, the node remains fully susceptible but will not be vaccinated again. if it is effective, then the immunity is built up linearly over a certain period of time, taken as 2 weeks. we assume that infected persons generally recover in about 5 days, giving a recovery rate g = 0.2. the initial number of infectious individuals i 0 is set to 100 unless otherwise stated, to reduce the number of runs where the disease dies out due to statistical fluctuations. all simulation results presented in sections 4 and 5 are averages of 100 runs, each with a newly randomly generated network of the chosen topology. the parameters in the simulations are shown in table 4. the population size n was chosen to be sufficiently large to be representative of a medium size town and set to n = 200,000, while the average degree is taken as ⟨k⟩ = 5 with a maximum degree m = 100 (having a maximum degree only affects the scale-free network, since the probability of a node having degree m is practically zero for the other network types).

when considering a large group of people, a good first approximation is that the links between these people are random. although it is clear that this cannot accurately represent the population, since it lacks, for example, the clustering and spatial aggregation found in such common contexts as schools and work places, it may be that if the population is big enough, most if not all nonrandom effects average out. furthermore, random networks lend themselves relatively well to analysis, so that a number of interesting (and testable) properties can be derived. as is usually the case, the random network employed here originates from the concepts first presented rigorously by erdős and rényi (1959). our random networks are generated as follows: (1) we begin by creating n unlinked nodes. (2) in order to avoid orphaned nodes, without loss of generality, every node is first linked to another uniformly randomly chosen node that is not a neighbor.
(3) two nodes that are not neighbors and not already linked are uniformly randomly selected. if the degree d of both nodes is less than the maximum degree m, a link is established. if one of the nodes has the maximum degree m, a new pair of nodes is uniformly randomly selected. (4) step 3 is repeated n⟨k⟩ - n times.

when considering certain activities in a population, such as the publishing of scientific work or sexual contact, it has been found that the links are often well described by a scale-free network structure, where the relationship between the degree and the number of nodes that have this degree follows a negative power law; see, for example, the review paper by albert and barabási (2002). scale-free networks can easily be constructed with the help of preferential attachment; that is to say, the network is built up step by step and new nodes attach to existing nodes with a probability proportional to the degree of the existing nodes. our network is constructed with the help of preferential attachment, but two modifications are made in order to render the scale-free network more comparable with the other networks investigated here. first, the maximum degree is limited to m, not by restricting the degree from the outset, but by first creating a scale-free network and then pruning all the nodes with a degree larger than m. second, the number of links attached to each new node is either two or three, depending on a probability that is set such that, after pruning, the average degree is very close to that of the random network (i.e. ⟨k⟩ = 5). our scale-free network is generated as follows: (1) start with three fully connected nodes and set the total number of links l = 3. (2) create a new node; with a probability of 0.3, add 2 links, otherwise add 3 links. for each of these additional links, find a node to link to as outlined in step 3.
(3) loop through the list of nodes and create a link with probability d/(2l), where d is the degree of the currently considered target node. (4) increase l by 2 or 3 depending on the choice in step 2. (5) repeat steps 2 and 3 n - 3 times. (6) prune nodes with degree greater than m.

small world networks are characterized by the combination of a relatively large number of local links with a small number of non-local links; consequently, there is in principle a very large number of possible small world networks. one of the simplest ways to create a small world network is to first place nodes sequentially on a circle and couple them to their neighbors, similar to the way many coupled map lattices are constructed (willeboordse, 2006), and to then create some random short cuts. this is essentially how the small world network used here is generated. the only modification is that the coupling range (i.e. the number of neighbors linked to) is randomly varied between 2 and 3 in order to obtain an average degree equal to that of the random network (i.e. ⟨k⟩ = 5). we also use periodic boundary conditions, which as such are not necessary for a small world network but are commonly used. the motivation for studying small world networks is that small groups of people in a population are often (almost) fully linked (such as family members or co-workers), with some connections to other groups of people. our small world network is generated as follows: (1) create n new unlinked nodes with index i = 1 ... n. (2) with a probability of 0.55, link to the neighboring and second neighboring nodes (i.e. create links i↔i-1, i↔i+1, i↔i-2, i↔i+2); otherwise, also link up to the third neighboring nodes (i.e. create links i↔i-1, i↔i+1, i↔i-2, i↔i+2, i↔i-3, i↔i+3). periodic boundary conditions are used (i.e. the left nearest neighbor of node 1 is node n, while the right nearest neighbor of node n is node 1). (3) create the 'large world' network by repeating step 2 for each node.
(4) with a probability of 0.05, add a link to a uniformly randomly chosen node, excluding self and nodes already linked to. (5) create the small world network by carrying out step 4 for each node.

in the random network, the probability for an arbitrary node to be linked to any other arbitrary node is constant and there is no clear notion of locality. in the small world network, on the other hand, tightly integrated local connections are supplemented by links to other parts of the network. to model a situation in between, where randomly linked local populations (such as the populations of villages in a region) are randomly linked to each other (for example, some members of the population of one village are linked to some members of some other villages), we consider a meta-random network. when the number of shortcuts is increased, a meta-random network transitions to a random network. it can be argued that, among the networks investigated here, the meta-random network is the most representative of the population of a state, province or country. our meta-random network is generated as follows: (1) create n new unlinked nodes with index i = 1 ... n. (2) group the nodes into 100 randomly sized clusters with a minimum size of 20 nodes (the minimum size was chosen to be larger than ⟨k⟩, which equals five throughout, to exclude fully linked graphs). this is done by randomly choosing 99 values in the range from 1 to n to serve as cluster boundaries, with the restriction that a cluster cannot be smaller than the minimum size. (3) for each cluster, create an erdős-rényi type random network. (4) for each node, with a probability of 0.01, create a link to a uniformly randomly chosen node of a uniformly randomly chosen cluster, excluding its own cluster.

the network described in this subsection is a near neighbor network and is therefore mostly local. nevertheless, there are some shortcuts, although shortcuts to very distant parts of the network are not very likely.
it could therefore be called a medium world network (situated between small and large world networks). the key feature of this network is that, despite being mostly local, its degree distribution is identical to that of the random network. our near neighbor network is generated as follows: (1) create n new unlinked nodes with index i = 1 ... n. (2) for each node, set a target degree by randomly choosing a degree with probability equal to that of the degree distribution of the random network. (3) if the node has reached its target degree, continue with the next node; if not, continue with step 4. (4) with a probability of 0.5, create a link to a node with a smaller index; otherwise create a link to a node with a larger index (using periodic boundary conditions). (5) starting at the nearest neighbor by index and continuing by decreasing (smaller indices) or increasing (larger indices) the index one by one, while skipping nodes already linked to, search for the nearest node that has not yet reached its target degree and create a link with this node. (6) create the network by repeating steps 3-5 for each node.

for all the strategies, vaccination is voluntary and quantity limited; that is to say, only susceptibles who do not refuse vaccination are vaccinated, and each day only a certain number of doses is available. for each strategy, for each time unit, first a group of eligible nodes is identified and then up to the maximum number of doses is dispensed among the eligible nodes according to the strategy chosen.

in the prioritized strategy, nodes with the highest degrees are vaccinated first. the motivation for this strategy is that high degree nodes on average can be assumed to transmit a disease more often than low degree nodes. numerically, the prioritized vaccination strategy is implemented as follows: (1) for each time unit, start at the highest degree (i.e.
consider nodes with degree d = m) and repeat the steps below until either the number of doses per time step or the total number of available doses is reached. (2) count the number of susceptible nodes with degree d. (3) if the number of susceptible nodes with degree d is zero, set d = d - 1 and return to step 2. (4) if the number of susceptible nodes with degree d is smaller than or equal to the number of available doses, vaccinate all these nodes, then set d = d - 1 and continue with step 2; otherwise continue with step 5. (5) if the number of susceptible nodes with degree d is greater than the number of currently available doses, randomly choose nodes with degree d to vaccinate until the available number of doses is used up. (6) when all the doses are used up, end the vaccination for the current time unit and continue when the next time unit arrives.

in practice, prioritizing on the basis of certain target groups, such as health care workers or people at high risk of complications, can be difficult. prioritizing on the basis of the number of links is even more difficult: how would such individuals be identified? one of the easiest vaccination strategies to implement is random vaccination. numerically, the random vaccination strategy is implemented as follows: (1) for each time unit, count the total number of susceptible nodes. (2) if the total number of susceptible nodes is smaller than or equal to the number of doses per unit time, vaccinate all the susceptible nodes; otherwise do step 3. (3) if the total number of susceptible nodes is larger than the number of doses per unit time, randomly vaccinate susceptible nodes until all the available doses are used up.

one way to reduce the spread of a disease is to split the population into many isolated groups. this could be done by vaccinating nodes with links to different groups.
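the random vaccination procedure above can be sketched as a single-day step (a simplified sketch with names of our choosing; voluntary refusal and the daily dose cap are handled as in the text):

```python
import random

def random_vaccination_day(susceptible, refusing, doses_per_day, rng):
    """One time unit of the random strategy: vaccinate uniformly chosen
    eligible nodes (susceptible and not refusing) up to the daily cap."""
    eligible = [node for node in sorted(susceptible) if node not in refusing]
    if len(eligible) <= doses_per_day:
        return set(eligible)          # enough doses for everyone eligible
    return set(rng.sample(eligible, doses_per_day))

rng = random.Random(1)
susceptible = set(range(100))
refusing = set(range(0, 100, 2))      # 50% refuse vaccination
vaccinated = random_vaccination_day(susceptible, refusing, 30, rng)
```

in a full simulation this step would be repeated each time unit, with newly infected or vaccinated nodes removed from the susceptible set.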
however, given the network types studied here, breaking links between groups is not really feasible, since, besides the random cluster network, there is no clear group structure in the other networks. another approach is the follow links strategy, inspired by notions from social networks, where an attempt is made to split the population by vaccinating the neighbors, the neighbors' neighbors, and so on, of a randomly chosen susceptible node. numerically, the follow links strategy is implemented as follows: (1) count the total number of susceptible nodes. (2) if the total number of susceptible nodes is smaller than or equal to the number of doses per unit time, vaccinate all the susceptible nodes. (3) if the total number of susceptible nodes is greater than the number of available doses per unit time, first randomly choose a susceptible node, label it as the current node, and vaccinate it. (4) vaccinate all the susceptible neighbors of the current node. (5) randomly choose one of the neighbors of the current node. (6) set the current node to the node chosen in step 5. (7) continue with steps 4-6 until all the doses are used up or no available susceptible neighbor can be found. (8) if no available susceptible neighbor can be found in step 7, randomly choose a susceptible node from the population and continue with step 4.

contact tracing was successfully used in combating the sars virus. in that case, everyone who had been in contact with an infectious individual was isolated to prevent further spread of the disease. de facto, this kind of isolation boils down to removing links, rendering the infectious node degree 0, a scenario not considered here. here, contact tracing tries to isolate an infectious node by vaccinating all its susceptible neighbors. numerically, the contact tracing strategy is implemented as follows: (1) count the total number of susceptible nodes.
(2) if the total number of susceptible nodes is smaller than or equal to the number of doses per unit time, vaccinate all the susceptible nodes. (3) count only those susceptible nodes that have an infectious neighbor. (4) if the number of susceptible nodes neighboring an infectious node is smaller than or equal to the number of doses per unit time, vaccinate all of these nodes. (5) if the number of susceptible nodes neighboring an infectious node is greater than the number of available doses, repeat step 6 until all the doses are used up. (6) randomly choose an infectious node that has susceptible neighbors and vaccinate its neighbors until all the doses are used up.
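the contact-tracing steps above can be sketched as follows; again a minimal sketch under the assumed adjacency-dict representation, with illustrative names.

```python
import random

def contact_tracing_vaccination(neighbors, susceptible, infectious, doses):
    """vaccinate susceptible nodes that neighbor an infectious node."""
    if len(susceptible) <= doses:                        # step 2
        return set(susceptible)
    at_risk = {v for v in susceptible
               if any(u in infectious for u in neighbors[v])}  # step 3
    if len(at_risk) <= doses:                            # step 4
        return at_risk
    vaccinated = set()
    while doses > 0:                                     # steps 5-6
        candidates = sorted({u for u in infectious
                             if any(v in at_risk - vaccinated
                                    for v in neighbors[u])})
        if not candidates:
            break
        node = random.choice(candidates)
        for v in neighbors[node]:                        # vaccinate its neighbors
            if doses == 0:
                break
            if v in at_risk and v not in vaccinated:
                vaccinated.add(v)
                doses -= 1
    return vaccinated
```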
key: cord-319291-6l688krc authors: hung, chun-min; huang, yueh-min; chang, ming-shi title: alignment using genetic programming with causal trees for identification of protein functions date: 2006-09-01 journal: nonlinear anal theory methods appl doi: 10.1016/j.na.2005.09.048 sha: doc_id: 319291 cord_uid: 6l688krc

a hybrid evolutionary model is used to propose a hierarchical homology of protein sequences to identify protein functions systematically.
the proposed model offers considerable potential, given the inconsistency of existing methods for predicting novel proteins. because some novel proteins might align without meaningful conserved domains, maximizing the score of a sequence alignment is not the best criterion for predicting protein functions. this work presents a decision model that can minimize the cost of making a decision when predicting protein functions using the hierarchical homologies. in particular, the model has three characteristics: (i) it is a hybrid evolutionary model with multiple fitness functions that uses genetic programming to predict protein functions on a distantly related protein family, (ii) it incorporates modified robust point matching to accurately compare all feature points using the moment invariant and thin-plate spline theorems, and (iii) the hierarchical homologies, which support a novel protein sequence in the form of a causal tree, can effectively demonstrate the relationships between proteins. this work describes comparisons of nucleocapsid proteins from the putative polyprotein sars virus and other coronaviruses in other hosts using the model.

identification of protein function is the main theme of the post-genome era. recently, biomedical scientists have been striving to study proteomics and to explore the therapeutic potential of genes for curing disease. thus, a powerful and integrated methodology is required for predicting novel protein function. during the past decade, many algorithms applied to computational biology have been used to solve several of the above problems, with the most frequently used methods including dynamic programming [1], hidden markov models [2,3], the bayesian theorem, and probabilistic modeling [4-6]. the journal of molecular biology contains an introduction to hidden markov models for biological sequences written by krogh [7].
as a statistical model, the hidden markov model (hmm) is appropriate for many tasks in molecular biology. a profile-like hmm architecture can be used to search sequence databases for new family members, and it can also produce multiple alignments of protein families. moreover, the latest review by tsakonas and dounias [8] indicates that artificial-intelligence computational methodologies have been widely applied in bioinformatics, for example genetic algorithms [9], genetic programming [10], neural networks [11,12], fuzzy logic [13,14], and some classification and clustering techniques used in data mining [15]. the design of automatic models for comparing protein sequences that are homologous to a known family or super-family has become increasingly important. one standard approach to this problem uses 'profiles' of position-specific scoring measures estimated using multiple sequence alignment (msa) [16]. in fact, a potential member of a distantly related protein family cannot be identified with the other members of the same family using only the primary protein structures. consequently, a new protein with numerous unknown functions, such as the sars virus, cannot be accurately predicted by training a learning model. this study applies a hybrid methodology based on genetic programming with a causal tree [4,28,31] model to predict protein function. the causal tree [28,31], with its cause-effect form, is proposed for locally comparing protein sequences across each pairwise alignment of the msa. the rationale is that comparisons of distantly related sequences may carry more sensitive and accurate messages regarding protein function if given a segmented alignment using a more comprehensive depiction of the protein family.
additionally, the probabilistic scoring measures, similar to profile-to-profile comparisons [17], can be used to achieve one objective of genetic programming, namely aligning the hierarchical homologies based on profile comparisons. the resulting output provides a function-related representation for further linking of protein functions. furthermore, the proposed model can generate a tree-structured output with a special aligned format rather than a traditional local or global alignment format such as that of clustalw [23]. traditionally, a local alignment based on the smith-waterman algorithm [7] has been used to compare sequences with gaps under various scoring systems. among such scoring systems, many methods evaluate the similarity of two profile columns, for example the ffas method [18] and prof sim [19]. the ffas method calculates correlation coefficients and thus scores the 'dot-product' of the amino acid frequencies in the two columns. meanwhile, the prof sim method utilizes a single similarity score combining the divergence score and the significance score of the probability distributions in the two columns. recently, the compass method [17,20] generated the occurrence probability of a target residue in one profile column based on the other column, given the target residue frequencies, to construct the local profile-to-profile alignment. however, the proposed model was designed to detect the potential functions of a novel protein with a hierarchical homology structure. the hybrid model, namely alignment using genetic programming with causal trees (agct), is a heuristic evolutionary method that contains three basic components: (i) genetic programming with an inner-exchanged individual strategy, (ii) causal trees [4,28,31] with probabilistic reasoning, and (iii) construction of hierarchical homologies with local block-to-block alignment using the methods of moment invariants and robust point matching (rpm) [24].
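the ffas-style dot-product column score mentioned above can be sketched as follows; this is only an illustration of the idea of scoring two profile columns by their amino-acid frequency vectors, not the full ffas scoring scheme, which includes additional normalizations.

```python
def column_score(col_a, col_b):
    """dot-product similarity of two profile columns, each given as a
    dict mapping amino acid -> frequency (ffas-style, simplified)."""
    return sum(col_a.get(aa, 0.0) * col_b.get(aa, 0.0)
               for aa in set(col_a) | set(col_b))
```

identical columns score highest, and columns with disjoint residue usage score zero.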
locally, a standard construction of an alignment requires at least a two-step design: the first step is a scoring evaluation of the similarity between two given positions, while the second step is an alignment extension that considers gap penalties and an extension algorithm. the agct model applies two modified steps to an alignment with this standard construction. the first step changes the compared objects from positions to blocks, whose initial extents are based on a particular signal wave, while the second step iteratively adjusts the boundaries by trimming local fragments rather than using the extension algorithm. indeed, in blast [21] and its successors [22], effective local extension is displayed for the cases of sequence-to-sequence and sequence-to-profile comparison. similarly, in agct, the approach to local block-to-block comparison, modified from the profile-to-profile alignment described in [17], shifts the leftmost and rightmost boundaries relative to the parent segments when high local alignment scores are achieved. notably, a detailed and refined alignment procedure such as the smith-waterman algorithm [7] is also used in part of the proposed model to confirm a detected meaningful fragment: a fragment of short length reduces the computation time, and only some individuals in the population have to be compared, making the combination of heuristic and exhaustive search models feasible. the proposed agct model is an evolutionary method based on a probabilistic inference model. the proposed model also incorporates a modified rpm [24] method and a local sequence alignment into the probabilistic model. meanwhile, the model uses the moment invariant theorem [32] for pattern recognition to extract the feature-nodes from fragmented signal curves. the rpm compares feature points to precisely construct a soft match between two sets of points.
moreover, the modification of rpm changes the original transformation between matrices into a transformation between a matrix and a tree. to achieve this matrix-to-tree transformation, the proposed constraints, namely rooted one-way winner-take-all (ro-wta), resembling the two-way winner-take-all (wta) constraints [27], are also used. the proposed ro-wta method initially selects a unique root node by applying wta to the slack column of the matrix, which handles the null point correspondences. wta is then applied to all rows to ensure that each row produces only one corresponding feature-node in the causal tree; finally, at most three nodes, chosen by the three highest values in the same column, are used as the branches of the parent node. moreover, the tree-to-matrix transformation simply visits the tree in preorder to derive a set of pairwise points. finally, eq. (28) is applied to yield a correspondence matrix for optimizing the rpm.

a traditional msa method for protein classification compares more than two sequences to provide aligned sequences for the observation of conserved domains. however, in a distantly related protein family, the data provide neither meaningful knowledge of biochemical function nor a consistent outcome when different msa algorithms are used; clearly, msa comparisons are at their worst when applied to distantly related protein family members. in this model, a small protein fragment transformed into a feature-node becomes a compared target point with spatially localized features. individual feature-nodes, simplified into six moment invariants [32], are matched with each other for the signal registration of 2-d curves [25,30]. the 2-d curve is depicted using a block-to-block method that transforms a fragmented sequence of protein residues into a signal curve by iteratively accumulating a block of amino acid properties.
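the ro-wta construction described above can be sketched as follows; the matrix layout (rows as child candidates, the last column as the slack column for the root) follows the description in the text, while the function name, the row-ordering heuristic, and the data structures are illustrative assumptions.

```python
def ro_wta(m):
    """rooted one-way winner-take-all on a correspondence matrix.

    m: list of n rows, each with n + 1 entries; column n is the slack
    column used to pick the unique root. returns (root, parent), where
    parent[i] is the parent chosen for node i, capped at three children
    per parent as in the model.
    """
    n = len(m)
    # root: winner of the slack column (no parent points to it)
    root = max(range(n), key=lambda i: m[i][n])
    parent, children = {}, {j: [] for j in range(n)}
    # process rows with stronger best scores first so they claim branches
    for i in sorted(range(n), key=lambda i: -max(m[i][:n])):
        if i == root:
            continue
        # wta over the row, skipping self-correspondence and full parents
        cols = [j for j in range(n) if j != i and len(children[j]) < 3]
        if not cols:
            continue
        j = max(cols, key=lambda j: m[i][j])
        parent[i] = j
        children[j].append(i)
    return root, parent
```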
here, the signal curves are plotted by accumulating hydrophobicity along the protein sequence. because the problem has been transformed into one of comparing feature-nodes (points), the comparison of spatially localized features for protein function can be considered an optimization problem of point matching for a 2-d shape. in fact, this optimization problem is difficult; in particular, the mapping must also account for rigid and non-rigid deformations. a hybrid technique combining genetic programming and the causal tree is ideal for solving such difficult problems. genetic programming has emerged from evolutionary based systems [26,40-42], and the causal tree is derived from probabilistic reasoning in intelligent systems [4]. since genetic programming can avoid local minima of an energy function, the model is appropriate for recognizing a potential pattern that would never be found via direct comparisons. this model uses a popular technique for synthesizing a special signal indicating specific protein functions by accumulating the properties of a group of protein residues. comparing these special signals with each other provides a wider window for viewing the world of the protein. essentially, the signals must be decomposed for comparison using a method that chooses a suitable waveform without destroying the protein signals. in the implementation, a technique of wavelet reconstruction for signals [29] is used to depict a noiseless skeleton of the signal curve. next, the reconstructed curve is differentiated with respect to the position of each protein residue, and the result is set equal to zero; this technique easily identifies positions with locally minimal accumulated properties. subsequently, to prevent the generation of fragments shorter than 40 residues, the method discards some of the neighboring positions around the minima.
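the accumulation step can be sketched as a sliding-window sum of per-residue hydrophobicity; the kyte-doolittle values shown for a few residues are standard published values, but the window size, the residue subset, and the strict-minimum rule are illustrative assumptions rather than the paper's exact settings.

```python
# kyte-doolittle hydrophobicity for a few residues (standard values)
KD = {'A': 1.8, 'R': -4.5, 'N': -3.5, 'L': 3.8, 'I': 4.5,
      'G': -0.4, 'F': 2.8, 'K': -3.9, 'S': -0.8, 'V': 4.2}

def signal_curve(seq, window=3):
    """accumulate hydrophobicity of the surrounding residues at each
    position, turning a protein fragment into a signal curve."""
    half = window // 2
    curve = []
    for i in range(len(seq)):
        block = seq[max(0, i - half): i + half + 1]
        curve.append(sum(KD.get(aa, 0.0) for aa in block))
    return curve

def local_minima(curve):
    """positions where the accumulated property is a strict local minimum."""
    return [i for i in range(1, len(curve) - 1)
            if curve[i - 1] > curve[i] < curve[i + 1]]
```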
next, all of the fragmented signals provide an initial dataset as input to the evolutionary computation. although the local minima cannot determine the true boundaries of the signals at this initial step, the subsequent local alignment of the fragmented proteins using the smith-waterman algorithm [7] iteratively regularizes the boundaries until suboptimal boundaries are found. the model must then further extract signal features from the fragmented input signals after decomposing the original signals. technically, vector features can be associated with the signal features; however, the vector value representing a signal must remain invariant when physical linear transforms deform the signal curve (shape), such as translation, rescaling, stretching, and rotation. once the problem is transformed into the image and signal processing domains, the rpm method is easily applied to protein function identification. conceptually, the proposed model includes two stages of feature matching. the first stage obtains the initial signal features for the signal registration [25,30]. during the second stage, an inner-exchanged genetic programming repeatedly refines the rpm parameters. this work utilizes the theorem of two-dimensional moment invariants [32] for planar geometric figures, derived from the theorem of moment spaces [33], to extract features of the protein fragments. accumulating the hydrophobicity of the surrounding residues at each position of the protein enables the protein fragments to generate signal curves. each signal curve is then reduced to a fixed-length vector, denoting a feature-node, using moment invariants for matching the nodes. next, one of the energy functions in the model introduces the energy function of a correspondence algorithm for non-rigid mapping [24] derived from the tps theorem [34]. finally, genetic programming with the inner-exchanged subpopulation strategy refines the comparison results.
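the 40-residue rule mentioned above (discarding minima that would create short fragments) can be sketched as follows; the exact discard rule used in the paper is not spelled out, so this greedy left-to-right filter is an assumption.

```python
def fragment_boundaries(minima, seq_len, min_len=40):
    """keep only cut positions that leave every fragment at least
    min_len residues long, discarding minima too close to the previous
    retained cut or to the sequence end (illustrative rule)."""
    cuts = []
    last = 0
    for p in minima:
        if p - last >= min_len and seq_len - p >= min_len:
            cuts.append(p)
            last = p
    return cuts
```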
the theorem of moment invariants, which is extensively used in computer science, was first proposed in [32] for recognizing visual patterns. in the visual field, an essential step is to extract patterns for recognizing visual objects from the original objects; the extracted patterns must be independent of position, size and orientation. likewise, the model uses the moment invariants of the signal curves to substantially reduce the computational complexity. because the flexibility of the moment invariants used for recognizing protein function allows the model to find more conserved domains, these domains should attain more biological meaning through this moment-invariant based comparison. the following subsections detail how the model obtains a feature vector, denoting a feature-node in a causal tree, by using the moment invariant theorem for signal recognition.

riemann integrals [32] define the two-dimensional (k + l)th order moments for a density distribution f(x, y) as ṁ_{k,l} = ∫∫ x^k y^l f(x, y) dx dy (eq. (1)), where f(x, y) denotes a piecewise continuous bounded function that has nonzero values only on a finite part of the xy plane. conversely, f(x, y) determines the unique double moment sequence {ṁ_{k,l}} of moments of every order. by directly integrating the double moment sequence, eq. (2) expresses a central moment µ_{k,l} derived from the ordinary moments: µ_{k,l} = ∫∫ (x − x̄)^k (y − ȳ)^l f(x, y) dx dy, where x̄ = ṁ_{1,0}/ṁ_{0,0} and ȳ = ṁ_{0,1}/ṁ_{0,0} are the expected values of x and y, respectively. the point (x̄, ȳ) is termed the center of gravity, or centroid, of x and y. for example, the first four orders give µ_{0,0} = ṁ_{0,0} ≡ µ, µ_{1,0} = µ_{0,1} = 0, µ_{2,0} = ṁ_{2,0} − µx̄², µ_{1,1} = ṁ_{1,1} − µx̄ȳ, µ_{0,2} = ṁ_{0,2} − µȳ², µ_{3,0} = ṁ_{3,0} − 3ṁ_{2,0}x̄ + 2µx̄³, µ_{2,1} = ṁ_{2,1} − ṁ_{2,0}ȳ − 2ṁ_{1,1}x̄ + 2µx̄²ȳ, µ_{1,2} = ṁ_{1,2} − ṁ_{0,2}x̄ − 2ṁ_{1,1}ȳ + 2µx̄ȳ², µ_{0,3} = ṁ_{0,3} − 3ṁ_{0,2}ȳ + 2µȳ³.
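the discrete analogues of eqs. (1) and (2) can be sketched as follows for a 2-d density given as a list of rows; the function names are illustrative. the test confirms the identities quoted above, namely µ_{1,0} = µ_{0,1} = 0 and µ_{2,0} = ṁ_{2,0} − µx̄².

```python
def raw_moment(img, k, l):
    """discrete analogue of eq. (1): sum of x^k * y^l * f(x, y)."""
    return sum((x ** k) * (y ** l) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))

def central_moment(img, k, l):
    """discrete analogue of eq. (2), taken about the centroid."""
    m00 = raw_moment(img, 0, 0)
    xb = raw_moment(img, 1, 0) / m00
    yb = raw_moment(img, 0, 1) / m00
    return sum(((x - xb) ** k) * ((y - yb) ** l) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))
```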
in a similar mathematical sense, a measure of area for a two-valued image b(m, n) with (k + l)th order moments approximately yields eq. (4) from eq. (1); the corresponding central moment is expressed as eq. (5), where m_{0,0}, m̄ and n̄ are the values for the area, length and width, respectively. a pattern of the two-valued image, which must be independent of position, size, and rotation, is derived from similitude moment invariants. likewise, the function b(m, n) with respect to a pair of fixed axes can represent the signal curve of protein sequences, and the two-dimensional central moments µ_{k,l} express image patterns. additionally, numerous properties of the second central moment are equivalent to those of a covariance matrix in probability theory; therefore, the covariance matrix for the second central moments is expressed as eq. (6). by diagonalization, eq. (7) is obtained, where each column vector of the matrix e = [e_{1,1} e_{1,2}; e_{2,1} e_{2,2}] is an eigenvector of the matrix u, and the eigenvalues of u determine λ = diag(λ_1, λ_2). the values λ_max and λ_min, corresponding to (λ_1, λ_2) and (λ_2, λ_1), are given in eqs. (8) and (9), respectively, and eq. (10) gives the direction angle of the image area. from the above theoretical derivations, eqs. (1)-(10) easily formularize some basic properties of the pattern. notably, an added restriction such as µ_{2,0} > µ_{0,2} determines a unique angle θ. clearly, the discrimination of patterns improves when higher-order moments are used. the higher-order moments with respect to the principal axes can also easily be determined using the method of the principal axes described in [32]. because µ_{0,0} = m_{0,0} can be used to measure the area, a proportionality factor √α, a divisor, should be used to further normalize the central moment in eq. (5) by dividing the variables of the function in eq. (2) by this factor; eq. (2) then adopts the form f(x/√α, y/√α).
as a result, the normalized central moment ν_{k,l} equals the original central moment of f(x, y) divided by √α^(k+l+2). because µ_{0,0} is able to measure area, α = µ_{0,0}; eq. (11) then defines the normalized central moment. an identification of the pattern independent of position, size and orientation can then suitably be formularized as moment invariants for any signal curve using eq. (11). to improve pattern recognition capability, absolute moment invariants should further be obtained by using multiple moment invariants. the normalized central moments ν_{k,l} of the second and third orders generate the six absolute and orthogonal moment invariants listed in eq. (12). while the skew invariant listed in the original study [32] is useful for distinguishing mirror images, the model discards this invariant since a skew orthogonal invariant is not necessary for signal recognition here. the six moment invariants in eq. (12) are not only independent of position, size and orientation, but can also present a feature-node for a fragmented protein sequence. finally, a six-valued vector ω = (ω_1, ω_2, ω_3, ω_4, ω_5, ω_6)^t denotes a feature-node for further convenient processing.

suppose that a feature-node ω_a = (ω_{a1}, ω_{a2}, ω_{a3}, ω_{a4}, ω_{a5}, ω_{a6}) represents a signal curve from a fragmented protein corresponding to one point in 2-d space. assume a set of vectors ψ = {ω_a | a = 1, 2, ..., k} and another set x = {x_a | a = 1, 2, ..., n}. since a vector ω_a with six variables would increase the problem complexity, a simple linear transform is used to obtain a scalar, so that the point-matching problem can be solved using the tps of two variables. the problem of fragmented sequence mapping can then simply be transformed into a new problem.
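the six invariants of eq. (12) correspond to the first six of hu's classical moment invariants [32]; a sketch computing them from a 2-d density follows (the density layout and function name are illustrative). the test checks the translation invariance claimed in the text by shifting the density on a zero-padded grid.

```python
def hu_six(img):
    """first six hu moment invariants of a 2-d density (list of rows)."""
    pts = [(x, y, v) for y, row in enumerate(img) for x, v in enumerate(row)]
    m00 = sum(v for _, _, v in pts)
    xb = sum(x * v for x, _, v in pts) / m00
    yb = sum(y * v for _, y, v in pts) / m00
    def eta(p, q):  # normalized central moment, cf. eq. (11)
        mu = sum(((x - xb) ** p) * ((y - yb) ** q) * v for x, y, v in pts)
        return mu / m00 ** (1 + (p + q) / 2)
    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        (n30 + n12) ** 2 + (n21 + n03) ** 2,
        (n30 - 3 * n12) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
        + (3 * n21 - n03) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2),
        (n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
        + 4 * n11 * (n30 + n12) * (n21 + n03),
    ]
```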
the new problem, involving the two feature-node sets x and ψ, fits a mapping function f(ω_a) using tps while simultaneously minimizing the tps energy function. typically, two types of deformation of a compared object exist: rigid (affine) and non-rigid (non-affine). rigid deformation includes parallel translation and rotation, while non-rigid deformation includes shrinkage, dilation, and distortion. by minimizing the first (error-measurement) term of eq. (13), the feature-node set x is mapped as closely as possible to the other feature-node set ψ. generally, an infinite number of mappings f can minimize the first term, since the mapping is non-rigid. the energy function of eq. (13) uniquely determines a minimizing function f, specified by the two parameter matrices ã and w̃ in eq. (14), if and only if λ has a fixed value. here ã represents a (d + 1)-dimensional affine matrix, and the tps kernel (with c being a constant for any ω [34]) comprises the information regarding the internal structural relationships of the feature-node set; a non-rigid warp comes into play when the warping coefficient matrix w̃ is considered in eq. (14). the second term of eq. (13), which is responsible for regularizing the mapping function between x and ψ, is essentially a smoothness constraint. the lagrange parameter λ in eq. (13) regularizes the warping of the matches of feature-nodes; precise matches appear as λ approaches zero. when the solution for f in eq. (14) is substituted into the tps energy function in eq. (13), and the mercer-hilbert-schmidt theorems of riesz and sz.-nagy [35] are used to prove eq. (15), eq. (16) is obtained, where λ̃_1 ≥ λ̃_2 ≥ ··· ≥ λ̃_k ≥ 0 denote the eigenvalues of the continuous eigenfunctions φ_1, φ_2, ..., φ_k, and w̃^t is the transpose of w̃. here x and ψ are merely concatenated versions of the vector values of the node vectors; each row of the matrix derives from the original vectors.
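the energy described above can be written out explicitly under the standard tps formulation of [24,34]; this is a reconstruction consistent with the description of eqs. (13) and (14) in the text, not a verbatim copy of the paper's equations.

```latex
% eq. (13): fitting error plus bending-energy smoothness term
E_{\mathrm{tps}}(f) = \sum_{a} \bigl\| \psi_a - f(\omega_a) \bigr\|^2
  + \lambda \iint \left[
      \left(\tfrac{\partial^2 f}{\partial x^2}\right)^{\!2}
    + 2\left(\tfrac{\partial^2 f}{\partial x\,\partial y}\right)^{\!2}
    + \left(\tfrac{\partial^2 f}{\partial y^2}\right)^{\!2}
    \right] \mathrm{d}x\,\mathrm{d}y

% eq. (14): minimizer decomposed into an affine part and a warping part
f(\omega) = \omega\,\tilde{A} + \phi(\omega)\,\tilde{W},
\qquad \phi_b(\omega) = \|\omega - \omega_b\|^2 \log \|\omega - \omega_b\|
```

as λ → 0 the spline interpolates the matches exactly, while large λ forces f toward a purely affine map, matching the role of λ described above.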
in the model, an energy function with the affine ã and warping w̃ parameters, derived from eqs. (13)-(16), is used as a fitness function for genetic programming. furthermore, the local border between two regions of protein fragments should be considered an important feature for understanding protein functions. a distinctive characteristic of the tps is therefore that it can always decompose a geometrical transformation into a global affine transformation and a local non-affine warping component. using this characteristic, a functional signal can generally be detected for the same essential protein family even under varying environment parameters. these parameters include the settings of the primary forming signal of the sequences, for example the fragment length in relation to the signal response. alternatively, rotation, translation, and global shear of the signals should only slightly influence the detection of the functional signals based on the geometrical characteristics (pattern) of the features. consequently, the smoothness term w̃ in eq. (16) is applied specifically to the warping components, and this term is penalized when signal deformation does not reflect their essential properties. visually, two compared objects with similar patterns should increase the influence of the tps in interpolating a fitting spline; correspondingly, similar protein fragments compared with one another can be identified by tps matching. the proposed model uses tps methods to compare two feature-node sets under multiple optimization problems. the tps method raises three optimization issues. traditionally, the main issues include the optimization of mapping and of correspondence. generally, the work of correspondence must permute a one-to-one relationship between two feature-node sets, and the work of mapping must fit the two feature-node sets to an optimal function f.
however, this model defines a new issue: the problem must simultaneously optimize the hierarchical heaping structure by constructing a hierarchical relationship among feature-nodes. thus, the work of correspondence in the model must achieve a hierarchical relationship of homologies rather than a one-to-one relationship. alternatively put, the model must also optimize heaping. heaping optimization should meet the following minimum requirements: (i) only a unique root node is generated, (ii) a parent node can be compared with multiple children nodes, (iii) under the limitation on the number of branches, the model must choose only the first several children nodes, in order of homology likelihood, to attach to a parent node, and (iv) the model must place the predicted feature-nodes and the predicting feature-nodes at the external and internal nodes of a causal tree, respectively. these requirements describe the one-to-many query strategy in the homology search model. more precisely, heaping also imposes a descending order of homology probability on the children nodes. nevertheless, the computations become considerably more difficult when simultaneously optimizing the hierarchical correspondence, non-rigid mapping, and heaping.

this subsection formally defines a correspondence matrix for a causal tree via the strategy of a one-to-many query. assume that a sequence {q_1} with unknown functions is used as a query sequence and compared with the fragmented member sequences {q_x}, x = 2, ..., z, to construct an optimal hierarchical structure. the optimal structure should comprise the maximum of homology relationships among these protein fragments of both 'query' and 'member'. moreover, z denotes the number of input sequences classified into the same distantly related protein family. furthermore, let m̃ denote a correspondence matrix representing the correspondence between the two node-sets in a tree.
assume the matrix contains (n + 1) × (k + 1) elements, where each m_{ia} is an element of the matrix whose value is a real number between zero and one. one query sequence q_1 is divided into s_1 fragments. under the constraints for each row of m̃, the summation of m_{ia} in eq. (17) must equal one; a node with the same label thus appears in the tree at most once. the wta method and the sinkhorn balancing theorem [36] can be used to normalize m_{ia} to satisfy the row and column constraints as in refs. [24,27,36]; however, this model uses only the row constraint, modifying wta into ro-wta. alternatively, the correspondence matrix for a causal tree must satisfy the constraint Σ_{a=s_1+1, i≠a}^{n+1} m_{ia} = 1 in eq. (17) to implement a one-to-many relationship. simultaneously, the constraint Σ_{a=s_1+1}^{n+1} m_{aa} = 0 prevents self-correspondence; self-correspondence, which appears when a feature-node points to itself, is invalid. furthermore, the constraints Σ_{i=s_1+1}^{n} m_{i(n+1)} = 1 and m_{i(n+1)} = 0 for all i ≤ s_1 decide the unique root node. from the definition of the correspondence matrix in eq. (17), let k represent the number of parent nodes, with the numbering starting at one and ending at k; the alternative representation begins at s_1 + 1 and ends at n. in fact, the two numbering representations are identical, and both are kept for convenience of representation and implementation. eqs. (18)-(20) define the sets of parent and children nodes in a tree, where j and (l_x, l_y) denote coordinates in one and two dimensions, respectively. moreover, the variable s_p denotes the number of fragments in each protein sequence. next, eqs. (21)-(25) further define a causal tree, denoted ct, using the correspondence matrix m̃. a set of causal trees, similar to those described in [35], can be expressed in terms of its members, with the set denoting the collection of causal trees indexed by i.
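the sinkhorn balancing mentioned above can be sketched as alternating row and column normalizations; the model described in the text keeps only the row step (its one-way constraint), while the full two-way version below is shown for contrast. the function name and iteration count are illustrative.

```python
def sinkhorn(m, iters=200):
    """alternately normalize rows and columns so a positive matrix
    approaches a doubly stochastic one (sinkhorn balancing)."""
    for _ in range(iters):
        m = [[v / sum(row) for v in row] for row in m]           # rows -> 1
        cols = [sum(m[i][j] for i in range(len(m)))
                for j in range(len(m[0]))]
        m = [[m[i][j] / cols[j] for j in range(len(m[0]))]
             for i in range(len(m))]                              # cols -> 1
    return m
```

dropping the column step leaves each row a probability distribution over candidate parents, which is the one-way constraint ro-wta then hardens.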
a causal tree includes the components r_i, i_i and e_i, which denote a root node, internal node-set, and external node-set, respectively. the causal tree is a tree network constructed from the labeled binary random variables, as illustrated in fig. 1. fig. 2 shows an example of internal and external node assignment. of the nine fragments, the five fragments corresponding to internal nodes 5, 6, 7, 8 and 9 represent a set of fragmented members from each member sequence {q_x}, x = 2, ..., z; the external nodes numbered 1, 2, 3, and 4 in fig. 1 correspond to the query sequence {q_1}, in accordance with the query fragments of fig. 2 in the numbering representation. in fact, the internal nodes can be used for probabilistic reasoning [4] to predict protein function. the reasoning process infers along a cost path in the causal tree by orderly assessing conditional probabilities combined together from a root node to an external feature-node. each internal feature-node ω ∈ i indicates a given protein fragment with particular properties; likewise, each external feature-node ω ∈ e indicates a protein fragment of unknown function. for heaping the causal tree, the constraints h̃_0, h̃_1 and h̃_2 are used to allocate the correct locations of feature-nodes in the causal tree. h̃_0 in eq. (22) defines a root node r by applying the wta method to column (n + 1) of the correspondence matrix m̃; alternatively stated, a feature-node is a root node if and only if no parent node points to it. additionally, eqs. (23) and (24) define the correct locations of internal and external nodes, respectively. the node-pairs (a → i) and (a → a') represent a direction from a parent node to a child node. for clarity, fig. 3 illustrates an example of the correspondence matrix for a causal tree. the proposed ro-wta selects the maximum value from the real numbers within the parentheses, only for each row and for the (n + 1) column.
besides ro-wta, the diagonal elements must all be zero to eliminate self-correspondence. eventually, eq. (25) defines a causal tree ct from the entities of a correspondence matrix, where ∩ and ∪ denote set intersection and union, respectively. because the causal tree ct may heap some feature-nodes into a cause-effect form, a set of given causes can be used to infer an unknown effect. the model has thus transformed the tps method for the correspondence of two point-sets into an optimization over the causal tree. next, the model must control the tree growth by adding the term in eq. (26) to the resulting energy function. the value n in eq. (26) denotes the total number of input fragments, i.e., the tree size. by adding eq. (26) to the energy function, the model can attain an approximated tree in accordance with the correspondence matrix. here i denotes the labeled identification of a feature-node: if this feature-node is added to a causal tree j, the value n_i returns 1, and otherwise it returns 0. finally, eq. (27) combines eqs. (16), (17) and (26) to create the final energy function. by minimizing eq. (27), the model can use genetic programming to obtain a globally suboptimal solution with respect to mapping, correspondence, and heaping. the first term in eq. (27) is an error measure for matching the similarity between two feature-nodes. meanwhile, the second term in eq. (27) guards against null matches, and a controlling parameter ζ determines the degree of influence of this term. next, the third term of eq. (27), a barrier function, is an entropy term with an annealing temperature t, in the sense of statistical physics. by controlling the entropy barrier function, the distribution of the random variables m_ia in ct gradually stabilizes.
alternatively, the model can generate the minimum number of states required to push the objective minimum away from the discrete points, such that rules emerge from m_ia. a gradually reduced temperature t can suitably control the thermal fluctuation to yield an optimal solution. the application of this entropy term in statistical physics is briefly described in [37] [38] [39]. subsequently, the fourth term in eq. (27) penalizes the differences between the affine mapping parameter ã and an identity matrix i; such differences produce unphysical reflections that result from flipping the whole plane. the fifth term in eq. (27), a standard tps regularization term, penalizes the local warping coefficients w̃ [24]. using an approximated least-squares approach and the qr-decomposition technique, the model can solve for the parameters (ã, w̃) given a fixed ct. additionally, two parameters λ_1 and λ_2 control the influence of the fourth and fifth terms, respectively. finally, keeping the matrix consistent with a tree during the evolutionary process of genetic programming is difficult; thus, the final term in eq. (27) enforces that m_ia conforms to ro-wta. currently, a parent node can connect at most three children nodes in this model; consequently, the summation of all elements in each column must be at most three. when the model attains an optimal causal tree, probability theory is further used to build a training model for identifying protein function. a given protein family can be queried to obtain answers, for an unknown protein sequence, about the interrelationships among particular protein fragments. unfortunately, the calculation of a probability distribution in bayes networks is computationally intractable (np-complete). however, a causal tree of fixed structure can efficiently compute the probabilistic distribution [31].
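the sharpening effect of the annealing temperature on the entries m_ia can be illustrated with a toy softmax (a sketch only: the paper's actual update minimizes the full energy of eq. (27), not a bare softmax, and the scores here are invented):

```python
import numpy as np

def soft_assign(scores, t):
    """Row-wise softmax at temperature t: a high t spreads probability mass
    (large entropy barrier), while a low t approaches a hard, winner-take-all
    assignment, mirroring the annealing of the entropy term."""
    e = np.exp(scores / t)
    return e / e.sum(axis=1, keepdims=True)

scores = np.array([[1.0, 2.0, 0.5]])
hot = soft_assign(scores, t=10.0)   # early annealing: nearly uniform rows
cold = soft_assign(scores, t=0.05)  # late annealing: nearly one-hot rows
```

as t decreases, the distribution of each row "freezes" onto its largest score, which is the stabilization behavior described above.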
using basic probability theory, assume that p(x_i, x_a) denotes the joint probability distribution of two random variables x_i and x_a, and that p(x_i | x_a) denotes the conditional probability of x_i given x_a; this conditional probability is defined in eq. (29). using the bayes inference described in [4], eqs. (30) and (31) follow easily from eq. (29), where x_i and x_a' are conditionally independent given x_a. subsequently, a well-defined bayes model can be obtained by generalizing beyond simple markov models and considering the probability distribution in the form of general trees. by the expansions of eqs. (30) and (31), a probability distribution can be applied to learning and inference for biological models when the tree network has a specific predetermined structure. during the learning phase, the variables within internal nodes and external nodes represent a set of member states and query states, respectively; both are observable during this phase. these variables are instead hidden during the inference phase, when the structure and probability distributions of a causal tree are temporarily fixed. this process corresponds exactly to the operation of hidden states in a hidden markov model. iterating, the model eventually produces a refined prediction result based on estimation by the fitness function. recalling the example of a causal tree, fig. 1 defines nine binary random variables on the tree. this case produces a joint probability distribution of the nine variables as a product of conditional terms, where each internal node x_i separates the variables above and below it. for example, eq. (31) shows that node x_7 yields the conditional probability p(x_6, x_9, x_5 | x_7) = p(x_6 | x_7) p(x_9 | x_7) p(x_5 | x_7). initially, the root variable x_7 in eq. (32) generates a value using the result of eq. (28). given x_7, the probabilistic model can determine the conditional probabilities p(x_6 | x_7), p(x_9 | x_7), and p(x_5 | x_7).
this model can then generate the values of x_6, x_9, and x_5, respectively, and calculate the remaining variables recursively. finally, the model obtains an estimated value as one of the multiple fitness functions described in the following section. by rearranging the terms in eq. (32), the ordered product of terms is also easily implemented by a preorder tree search. besides the fitness functions of the probabilistic and rpm models mentioned above, the model needs another fitness function to determine the boundaries of protein fragments. eq. (33) gives the normalized measure of a local alignment score, obtained by the smith-waterman algorithm [7] and divided by the sequence length, where δ_ia denotes the local alignment score and ρ_ia represents the sequence length. furthermore, eqs. (34) and (35) determine values of the parameters ζ and λ_2 for adjusting eq. (27). since the temperature t in eq. (27) gradually decreases during the evolution, the values m_ia decrease rapidly and accordingly. from eq. (28), m_ia is best kept in the interval [0.0, 1.0] to protect it from an unbounded drop in value that cannot be computed under realistic computation-capacity limits. thus, eq. (34) can be obtained from the corresponding inequality and the normal distribution, where e is the base of the natural logarithm, providing a rough scope for estimating the resolution of 256 states that sufficiently fits this computing issue, and µ and σ represent the mean and standard deviation of all feature data, respectively. furthermore, hall and titterington [47] proposed a solution for estimating the smoothing parameter λ_2 by eq. (35), where f(λ_2) denotes an influence matrix, i represents an identity matrix, and σ is the standard deviation of all feature data.
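the recursive factorization of the joint distribution over a causal tree, described above, can be sketched as follows (the tree, the prior, and the conditional tables are illustrative toy values, not from the paper):

```python
from itertools import product

def tree_joint(root, children, prior, cond, assign):
    """Joint probability on a causal tree: p(root) times the product of
    p(child | parent) along every parent-to-child edge (cf. eq. (32))."""
    p = prior[assign[root]]
    stack = [root]
    while stack:
        parent = stack.pop()
        for c in children.get(parent, []):
            # cond[(child, parent)][child_state][parent_state]
            p *= cond[(c, parent)][assign[c]][assign[parent]]
            stack.append(c)
    return p

# toy tree: root 7 with children 6 and 5; all variables are binary
children = {7: [6, 5]}
prior = [0.4, 0.6]                          # p(x7 = 0), p(x7 = 1)
cond = {(6, 7): [[0.9, 0.2], [0.1, 0.8]],   # columns sum to one
        (5, 7): [[0.7, 0.3], [0.3, 0.7]]}
p = tree_joint(7, children, prior, cond, {7: 1, 6: 1, 5: 0})  # 0.6 * 0.8 * 0.3
```

summing tree_joint over all assignments returns one, which is a quick sanity check that the factorization defines a valid distribution; the stack-based walk also mirrors the preorder implementation mentioned above.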
a method of qr-decomposition can be used to determine the value of (i − f(λ_2)). the method decomposes the data of feature-nodes, ψ = (q_1 : q_2)(r; 0), into the product of an orthonormal matrix and an upper triangular matrix. finally, eq. (35) is estimated iteratively to find the optimal value of λ_2 with respect to minimizing its error. this section focuses on an implementation design of genetic programming that evolves three related subpopulations with inner-exchanged individuals. in 1992, koza and colleagues pioneered the concept of genetic programming [26, 40-42], which extends the genetic algorithm. genetic programming can automatically create programs for solving general problems. it is based on a grammar-based methodology of evolutionary theory using the darwinian survival principle; therefore, genetic programming can describe highly complicated models for a domain problem even without extensive domain knowledge. first, problem solving requires randomly generating an initial population of individuals. subsequently, a fitness function evaluates the individuals in the population and selects excellent individuals for specific genetic operations. the genetic operations typically include crossover, replication, and mutation; they enable the parent population at a given generation to form a new offspring population at the next generation. genetic programming iterates these evolutionary steps until it obtains an optimal individual or satisfies certain stopping conditions. a genetic-programming representation of a domain problem is composed of several functional programs, which form a rooted tree structure as a solution to the problem. because genetic programming uses a tree form, the model can easily provide a flexible hierarchical representation to estimate the relationships of homologies for predicting protein function.
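the generate-evaluate-select-operate loop described above can be sketched generically (a toy genetic-algorithm-style loop on bit strings, not the paper's tree-based programs; every name and parameter here is illustrative):

```python
import random

def evolve(init, fitness, crossover, mutate, generations=40, pop_size=40):
    """Minimal generational loop: evaluate individuals, select parents by
    small tournaments, apply crossover then mutation, and keep the best
    individual seen so far (lower fitness is better here)."""
    pop = [init() for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            pa = min(random.sample(pop, 3), key=fitness)
            pb = min(random.sample(pop, 3), key=fitness)
            nxt.append(mutate(crossover(pa, pb)))
        pop = nxt
        best = min(pop + [best], key=fitness)
    return best

random.seed(0)
n = 12
best = evolve(
    init=lambda: [random.randint(0, 1) for _ in range(n)],
    fitness=lambda x: n - sum(x),                    # maximize the number of ones
    crossover=lambda a, b: a[:n // 2] + b[n // 2:],  # one-point crossover
    mutate=lambda x: [b ^ (random.random() < 0.05) for b in x],
)
```

the structure (initialize, evaluate, select, operate, iterate) is what carries over to the paper's tree-shaped programs; only the representation and the genetic operators change.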
characteristically, genetic programming provides a mechanism for searching for the fittest individual: the problem solution given by a computer program in the search space of all possible computer programs. the search space comprises terminal-nodes (external nodes) and function-nodes (internal nodes) appropriate for representing the specific problem [26]. when the genetic operations produce an infeasible individual for a causal tree, the model must regularize this infeasible individual using a tournament pick (tp) method. the tp method with a depth-first search (dfs) strategy is proposed to accelerate the regularization of an infeasible individual into a feasible one. the tp+dfs method recursively cures the infeasible nodes residing in the individual: in a preorder visit of the tree, it replaces each infeasible node with a feasible node randomly selected from a tournament that contains numerous feasible nodes. an advantage of the tp+dfs method is that the mutation appears at a suitable time, and the fitness function is estimated simultaneously following the crossover. when all individuals conform to all constraints of a causal tree, a fitness function can be used for individual estimation. to improve prediction speed and accuracy, this model uses a multiple-fitness-function strategy, implemented through an individual-exchange mechanism involving three subpopulations. the model considers the following parameters for developing a co-evolution with the strategy of inner-exchanged individuals in a population. the model exchanges individuals among the three subpopulations 0, 1, and 2, which contain nn, nm, and nk individuals, respectively. the subpopulation sizes should satisfy nn ≫ nm, nk as a trade-off between speed and accuracy.
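the tp+dfs regularization can be sketched as a preorder walk that replaces infeasible node labels (a sketch; the feasibility test, tournament pool, and tree encoding here are placeholders, not the paper's actual data structures):

```python
import random

def repair(tree, feasible, tournament):
    """Preorder depth-first cure: replace each infeasible node label with a
    feasible label drawn at random from the tournament pool, then recurse
    into the children. Trees are (label, [child, ...]) tuples."""
    label, children = tree
    if not feasible(label):
        label = random.choice(tournament)
    return (label, [repair(c, feasible, tournament) for c in children])

tree = ("bad", [("ok", []), ("bad", [("ok", [])])])
fixed = repair(tree, feasible=lambda lab: lab == "ok", tournament=["ok"])
```

because the walk is preorder, a cured parent is in place before its children are examined, matching the root-to-leaf direction of the heaping constraints.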
the selection method used in the model is always the tournament selection strategy rather than the fitness-proportionate selection strategy. additionally, tournament selection (ts) differs from tp in its target level: the former selects individuals from the population as its targets, using a tournament of size 7, whereas the latter selects the programs (nodes) of an individual (tree) in the same manner. regarding the probabilities of the genetic operations, crossover and reproduction are assigned values of 0.9 and 0.1, respectively. to control the tree shape, the tree depth must be 20 or less after performing the crossover operation; furthermore, each internal node has between one and three children connected to it. related applications employing genetic programming can be found in [43] [44] [45]. fig. 4 illustrates the inner-exchange strategy of individuals among the three subpopulations of a population, which achieves the cooperation of the multiple fitness functions. initially, a set of initial feature nodes is fed into subpopulation 0. subsequently, two feature-node sets, produced by random orderings of all feature-nodes, are matched with the tps method to find the best correspondence between them. subpopulation 0 is responsible for comparing the functional signals via heuristic matching. for effective heuristic matching, the model assumes that identical conserved protein domains demonstrate genetically similar functions given analogous functional signal shapes; consequently, a physical deformation of the signal shapes should be preserved in somewhat ancestral contours. subsequently, in fig. 4, the n best individuals of subpopulation 0 emigrate into subpopulation 1 for modifying the correct boundaries of the protein fragments. in subpopulation 1, an exhaustive local alignment of two fragments (nodes) is used to adjust the boundaries of two protein fragments.
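size-7 tournament selection, as used here, can be sketched as follows (the individuals and fitness are toy values; lower fitness is taken as better in this sketch):

```python
import random

def tournament_select(population, fitness, k=7):
    """Size-k tournament: sample k individuals uniformly at random and
    return the fittest of the sample (lower fitness is better here)."""
    return min(random.sample(population, k), key=fitness)

random.seed(1)
pop = list(range(100))  # toy individuals; fitness is the value itself
winner = tournament_select(pop, fitness=lambda x: x, k=7)
```

unlike fitness-proportionate selection, the selection pressure here depends only on the tournament size k, not on the absolute scale of the fitness values, which is one common reason for preferring it.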
meanwhile, a moment-invariant method calculates a new set of feature data for updating the signal data. moreover, the model must realign the protein fragments once the boundaries of the related fragments have changed. next, the m best individuals of subpopulation 1 emigrate into subpopulation 2 to confirm the significant protein fragments by training on real cases; the n best individuals of subpopulation 0 also emigrate into subpopulation 2 for the same purpose. finally, the k best individuals of subpopulation 2 emigrate into subpopulation 0 to restore individuals having specific biological evidence. these restored individuals can improve the quality of the individuals of subpopulation 0 using guidelines from real cases. simultaneously, subpopulation 2 eliminates its worst individuals to prevent them from moving back to subpopulation 0 or 2. when all of the individuals in the three subpopulations have been co-evolved by genetic programming, the final solution is the optimum individual selected from subpopulation 0. specifically, the solution takes the form of a causal tree: each path from the root node to any external node represents a functional heredity process, and such a path can provide an auxiliary guide for biologists to comprehensively understand the unknown function of query sequences. the strategies of exchanged individuals are feasible and essential for improving the final solution quality. more specifically, the model cooperatively evolves the causal trees via the cooperation of multiple fitness functions: eq. (27) is the fitness function of subpopulation 0, eq. (33) is the fitness function of subpopulation 1, and the product of eq. (30) over all nodes is the fitness function of subpopulation 2. fig. 5 describes the statements of the whole algorithm constructing this agct model.
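the migration of the best individuals between subpopulations can be sketched as follows (illustrative only: the paper migrates n, m, and k individuals among three subpopulations, while this sketch shows a single src-to-dst step with toy numbers):

```python
def migrate(src, dst, n, fitness):
    """Copy the n best individuals of src into dst, displacing dst's n worst
    (lower fitness is better in this sketch); src is left unchanged."""
    best = sorted(src, key=fitness)[:n]
    survivors = sorted(dst, key=fitness)[:len(dst) - n]
    return survivors + best

sub0 = [3, 1, 4, 1, 5]
sub1 = [9, 2, 6, 5, 3]
sub1_next = migrate(sub0, sub1, n=2, fitness=lambda x: x)
```

dropping the destination's worst individuals before the immigrants arrive corresponds to the elimination step described above, which keeps poor individuals from circulating between subpopulations.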
the protein nucleocapsid (nsp1) of the coronavirus family from various species was tested to clarify the protein function of nsp1 of the sars coronavirus. of seven input sequences, one is a query sequence and six are member sequences; the model inputs the sequences simultaneously to analyze its performance. for convenience, table 1 lists the sequence numbers, abbreviated names, accessions, definitions, and sources obtained from the ncbi web site for these proteins. these coronaviruses are members of a family of enveloped viruses that replicate in the cytoplasm of animal host cells [46]. sars coronavirus tw1 is a severe acute respiratory syndrome-associated coronavirus known as the tw1 isolate. for simplicity, the model inputs only the sequence of the putative nsp1 protein of tw1 for analysis against the known nsp1 proteins in other hosts. this analysis is nevertheless useful for understanding the novel protein through inference from other proteins with known functions. although traditional msa can provide accurate knowledge of the conserved domains of these proteins, it cannot clearly express homology relationships over the conserved domains. in fact, the proposed model can provide a hierarchical contour of protein fragments, which can help biologists better understand functional heredity relationships based only on the primary structure of proteins. due to space limitations, the matching causal tree obtained for this problem is not shown. this section analyzes the convergence performance of the energy at different evolutionary generations using the agct model. table 2 lists some of the test parameters used for the agct model to obtain a suboptimal solution. the parameters appropriate for the present case are obtained using eqs. (33)-(35); moreover, a reasonable value of λ_2 = 1.8 is used in eq. (27), obtained through analysis using eq. (35).
the analysis attains this reasonable value using generalized cross validation. to analyze the agct model, cases (i) to (v) examine typically varied situations. case (i) is designed to illustrate that an ideal solution can be found prematurely. case (ii) designs a situation with a sufficiently large subpopulation size but a relatively small migration size between the subpopulations. case (iii) designs a situation with a small population size and a small migration size. case (iv) designs a situation with a sufficient population size and a large migration size. finally, case (v) attempts to capture the real circumstances of evolution in subpopulations 0 and 1 when the energy of subpopulation 2 has converged. experimentally, the key parameters are the migration rates between the subpopulations. the parameter rows of table 2 for the remaining cases are:

case ii: 100, 10, 50, 10, 50, 10, 100
case iii: 20, 10, 20, 10, 20, 10, 100
case iv: 100, 20, 50, 30, 50, 30, 200
case v: 10, 2, 10, 2, 8, 1, 200

table 3 lists the test results of the agct model for the five cases. the sensitivity and specificity columns of table 3 display the prediction and discrimination capacities, respectively. since the capacity estimation is based on averaging over all generations, the ratios comparing the heuristic signal match and the exhaustive sequence alignment are quite low. in fact, high prediction capacity can improve the final solution quality, and high discrimination capacity can increase the execution speed. although the model obtains low sensitivity results, it attains relatively high specificity results. in fact, the results of the later generations have higher sensitivity than earlier generations (data not shown). thus, the sensitivity results, which decrease the execution performance during the co-evolution in this model, do not influence the prediction accuracy. the rises and falls depend on a threshold (cut-off) value, because of the trade-off between sensitivity and specificity. thus, eq.
(33) specifies a threshold value of 1.5 to determine whether two protein fragments are homologous. moreover, eq. (27) specifies a threshold value of 0.3 to decide whether two signals are analogous. from the data in table 3, case ii has better sensitivity than the other cases, and is also competitive with other reported results. although fewer individuals are involved in case iii than in case ii, the results in case iii are the best except for this sensitivity result; the following discussion explains this apparent contradiction. although cases ii and iv involve the same numbers of individuals, case iv involves more immigrated or emigrated individuals than case ii, and consequently obtains worse results. this demonstrates that appropriate individual migration is essential for improving solution quality; however, the amount of migration should not be so large as to disturb normal evolution by genetic programming. meanwhile, population size is not particularly important in the present cases, since the rpm method and the smith-waterman algorithm can reliably obtain a locally optimal result in each generation of genetic programming. therefore, even though case iii involves fewer individuals than case ii, case iii can still achieve performance results approaching those of case ii. finally, the mechanism of premature convergence in cases i and iv attained the worst results. clearly, case v improves its results through long-term evolution. accordingly, fig. 6 displays the convergence performance of each subpopulation at fixed temperature. the energy of subpopulation 0 converges at 0.488 and has a rate of convergence different from those of the other two subpopulations; clearly, subpopulation 2 evolves faster than the other two. fig.
7 illustrates the improvement of the energy convergence of subpopulation 0 using five temperature rates (tr) for decreasing the annealing temperature: 1, 0.95, 0.90, 0.85, and 0.80. eventually, the suboptimal solution to the present experiment is obtained by genetic programming via the cooperation of the multiple fitness functions, which continuously adjusts the positions appropriate for splitting the protein sequence. due to space limitations, the starting and ending positions of each protein fragment are not shown. subsequently, because case ii contains more individuals than the others, fig. 8 compares the energy convergences based on case ii to obtain an appropriate tr value; the rate 0.85 is the best choice for rapid and steady convergence. additionally, fig. 9 compares four different tr values (0.97, 0.95, 0.9, and 0.85) for analyzing the energy of subpopulation 0; clearly, case iv is the worst case. next, fig. 10 simultaneously illustrates the energy convergence of the three subpopulations based on case v while varying the rates of temperature decrease; the model reduces the energies of each subpopulation proportionally and consistently. from the generations shown in fig. 10, subpopulations 1 and 2 already begin to converge at around the fortieth generation, and the model then gradually performs the rpm work in subpopulation 0 between the fortieth and eightieth generations. the agct model utilizes a novel tree-structure representation to illustrate the concept of applying functional homologies to predict protein function. this model differs in numerous respects from the traditional msa model; nevertheless, it still incorporates various traditionally popular methods for the high-level application of predicting protein function.
these methods include genetic programming, the wavelet transform, profile comparison, local alignment with the smith-waterman algorithm, the moment-invariant theorem, rpm with tps, bayes inference, and the causal tree model. besides these methods, the model introduces various strategies, including inner-exchanged individuals among subpopulations, cooperation of multiple fitness functions, guided evolution involving real cases, and tp+dfs regularization; the prediction accuracy thus can be improved. this hybrid model is developed to exploit the global search capabilities of genetic programming for predicting the protein functions of a distantly related protein family for which conserved-domain identification is difficult. moreover, the rpm involved can identify more sequence homologies through softened comparisons. essentially, this work contributes a complex and integrated methodology that enables new advances in predicting protein function in the post-genome era. we believe that the model robustness can be increased in the future if biologists generate more experimental data in laboratories. future works will apply this model to investigate issues related to protein interaction.

references (titles only):
- sequence comparison by dynamic programming
- a hidden markov model that finds genes in e. coli dna
- hidden markov models in computational biology: applications to protein modeling
- probabilistic reasoning in intelligent systems: networks of plausible inference
- modeling splice sites with bayes networks
- prediction of the secondary structure of proteins from their amino acid sequence
- identification of common molecular subsequences
- hybrid computational intelligence schemes in complex domains: an extended review
- genetic algorithms in molecular recognition and design
- searching for discrimination rules in protease proteolytic cleavage activity using genetic programming with a min-max scoring function
- logistic regression and artificial neural network classification models: a methodology review
- statistical mechanics beyond the hopfield model: solvable problems in neural network theory
- applying fuzzy logic to medical decision making in the intensive care unit
- increasing the efficiency of fuzzy logic-based gene expression data analysis
- epps: mining the cog database by an extended phylogenetic patterns search
- biological sequence analysis: probabilistic models of proteins and nucleic acids
- probabilistic scoring measures for profile-profile comparison yield more accurate short seed alignments
- comparison of sequence profiles: strategies for structural predictions using sequence information
- within the twilight zone: a sensitive profile-profile comparison tool based on information theory
- compass: a tool for comparison of multiple protein alignments with assessment of statistical significance
- basic local alignment search tool
- gapped blast and psi-blast: a new generation of protein database search programs
- clustal w: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
- a new algorithm for non-rigid point matching
- ant colony system with extremal dynamics for point matching and pose estimation
- genetic programming: on the programming of computers by means of natural selection
- new algorithms for 2-d and 3-d point matching: pose estimation and correspondence
- probabilistic prediction of protein secondary structure using causal networks
- perfect sampling for wavelet reconstruction of signals
- a computational vision approach to image registration
- visual pattern recognition by moment invariants
- moment spaces and inequalities
- spline models for observational data
- functional analysis
- a relationship between arbitrary positive matrices and doubly stochastic matrices
- a new method for mapping optimization problems onto neural networks
- statistical physics algorithms that converge
- a novel optimizing network architecture with applications
- genetic and evolutionary algorithms come of age
- genetic programming iii: darwinian invention and problem solving: book review
- genetic programming and evolutionary generalization
- discovering knowledge from medical databases using evolutionary algorithms
- genetic programming for knowledge discovery in chest-pain diagnosis
- classifying proteins as extracellular using programmatic motifs and genetic programming
- fields virology
- common structure of techniques for choosing smoothing parameters in regression problems

key: cord-168862-3tj63eve authors: porter, mason a.
title: nonlinearity + networks: a 2020 vision date: 2019-11-09 i briefly survey several fascinating topics in networks and nonlinearity. i highlight a few methods and ideas, including several of personal interest, that i anticipate to be especially important during the next several years. these topics include temporal networks (in which the entities and/or their interactions change in time), stochastic and deterministic dynamical processes on networks, adaptive networks (in which a dynamical process on a network is coupled to dynamics of network structure), and network structure and dynamics that include "higher-order" interactions (which involve three or more entities in a network). i draw examples from a variety of scenarios, including contagion dynamics, opinion models, waves, and coupled oscillators. in its broadest form, a network consists of the connectivity patterns and connection strengths in a complex system of interacting entities [121]. the most traditional type of network is a graph g = (v, e) (see fig. 1a), where v is a set of "nodes" (i.e., "vertices") that encode entities and e ⊆ v × v is a set of "edges" (i.e., "links" or "ties") that encode the interactions between those entities. however, recent uses of the term "network" have focused increasingly on connectivity patterns that are more general than graphs [98]: a network's nodes and/or edges (or their associated weights) can change in time [70, 72] (see section 3), nodes and edges can include annotations [26], a network can include multiple types of edges and/or multiple types of nodes [90, 140], it can have associated dynamical processes [142] (see sections 3, 4, and 5), it can include memory [152], connections can occur between an arbitrary number of entities [127, 131] (see section 6), and so on. associated with a graph is an adjacency matrix a with entries a_ij. in the simplest scenario, edges either exist or they don't.
if edges have directions, a_ij = 1 when there is an edge from entity j to entity i and a_ij = 0 when there is no such edge. when a_ij = 1, node i is "adjacent" to node j (because we can reach i directly from j), and the associated edge is "incident" from node j and to node i. the edge from j to i is an "out-edge" of j and an "in-edge" of i. the number of out-edges of a node is its "out-degree", and the number of in-edges of a node is its "in-degree". for an undirected network, a_ij = a_ji, and the number of edges that are attached to a node is the node's "degree". one can assign weights to edges to represent connections with different strengths (e.g., stronger friendships or larger transportation capacity) by defining a function w : e −→ r. in many applications, the weights are nonnegative, although several applications [180] (such as in international relations) incorporate positive, negative, and zero weights. in some applications, nodes can also have self-edges and multi-edges. the spectral properties of adjacency (and other) matrices give important information about their associated graphs [121, 187]. for undirected networks, it is common to exploit the beneficent property that all eigenvalues of symmetric matrices are real. traditional studies of networks consider time-independent structures, but most networks evolve in time. for example, social networks of people and animals change based on their interactions, roads are occasionally closed for repairs and new roads are built, and airline routes change with the seasons and over the years. to study such time-dependent structures, one can analyze "temporal networks"; see [70, 72] for reviews and [73, 74] for edited collections. the key idea of a temporal network is that networks change in time, but there are many ways to model such changes, and the time scales of interactions and other changes play a crucial role in the modeling process.
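with the convention above (a_ij = 1 for an edge from entity j to entity i), in-degrees and out-degrees are simply row and column sums of the adjacency matrix; a toy directed example:

```python
import numpy as np

# a[i, j] = 1 encodes a directed edge from node j to node i
a = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])

in_degree = a.sum(axis=1)   # in-edges of node i: row sums
out_degree = a.sum(axis=0)  # out-edges of node j: column sums
```

here node 0 has two in-edges (from nodes 1 and 2) and node 2 has two out-edges; for an undirected network the matrix would be symmetric, both sums would coincide, and the eigenvalues of a would all be real.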
there are also other important modeling considerations. [fig. 1: (a) an example graph; (b) a discrete-time temporal network; (c) an example of a multilayer network with three layers, in which each layer's state nodes and edges are labeled with different colors, intralayer edges are drawn as solid arcs, interlayer edges are drawn as broken arcs (dashed if they connect corresponding entities and dotted if they connect distinct ones), and arrowheads represent unidirectional edges; (d) a simplicial complex. the networks were drawn with tikz-network (jürgen hackl, https://github.com/hackl/tikz-network), which allows one to draw networks, including multilayer networks, directly in a latex file; panel (b) is inspired by fig. 1 of [72]; panel (d), which is in the public domain, was drawn by wikipedia user cflm001 and is available at https://en.wikipedia.org/wiki/simplicial_complex.] to illustrate potential complications, suppose that an edge in a temporal network represents close physical proximity between two people in a short time window (e.g., with a duration of two minutes). it is relevant to consider whether there is an underlying social network (e.g., the friendship network of mathematics ph.d.
students at ucla) or if the people in the network do not in general have any other relationships with each other (e.g., two people who happen to be visiting a particular museum on the same day). in both scenarios, edges that represent close physical proximity still appear and disappear over time, but indirect connections (i.e., between people who are in the same connected component, but without an edge between them) in a time window may play different roles in the spread of information. moreover, network structure itself is often influenced by a spreading process or other dynamics, as perhaps one arranges a meeting to discuss a topic (e.g., to give me comments on a draft of this chapter). see my discussion of adaptive networks in section 5. for convenience, most work on temporal networks employs discrete time (see fig. 1(b)). discrete time can arise from the natural discreteness of a setting, discretization of continuous activity over different time windows, data measurement that occurs at discrete times, and so on. one way to represent a discrete-time (or discretized-time) temporal network is to use the formalism of "multilayer networks" [90, 140]. one can also use multilayer networks to study networks with multiple types of relations, networks with multiple subsystems, and other complicated networked structures. a multilayer network m (see fig. 1(c)) has a set v of nodes (these are sometimes called "physical nodes", and each of them corresponds to an entity, such as a person) that have instantiations as "state nodes" (i.e., node-layer tuples, which are elements of the set v_m) on layers in l. one layer in the set l is a combination, through the cartesian product l_1 × · · · × l_d, of elementary layers. the number d indicates the number of types of layering; these are called "aspects".
a temporal network with one type of relationship has one type of layering, a time-independent network with multiple types of social relationships also has one type of layering, a multirelational network that changes in time has two types of layering, and so on. the set of state nodes in m is v_m ⊆ v × l_1 × · · · × l_d, and the set of edges is e_m ⊆ v_m × v_m; an edge ((i, α), (j, β)) ∈ e_m indicates that there is an edge from node j on layer β to node i on layer α (and vice versa, if m is undirected). for example, in fig. 1(c), there is a directed intralayer edge from (a, 1) to (b, 1) and an undirected interlayer edge between (a, 1) and (a, 2). the multilayer network in fig. 1(c) has three layers, |v| = 5 physical nodes, d = 1 aspect, |v_m| = 13 state nodes, and |e_m| = 20 edges. to consider weighted edges, one proceeds as in ordinary graphs by defining a function w : e_m −→ r. as in ordinary graphs, one can also incorporate self-edges and multi-edges. multilayer networks can include both intralayer edges (which have the same meaning as in graphs) and interlayer edges. the multilayer network in fig. 1(c) has 4 directed intralayer edges, 10 undirected intralayer edges, and 6 undirected interlayer edges. in most studies thus far of multilayer representations of temporal networks, researchers have included interlayer edges only between state nodes in consecutive layers and only between state nodes that are associated with the same entity (see fig. 1(c)). however, this restriction is not always desirable (see [184] for an example), and one can envision interlayer couplings that incorporate ideas like time horizons and interlayer edge weights that decay over time. for convenience, many researchers have used undirected interlayer edges in multilayer analyses of temporal networks, but it is often desirable for such edges to be directed to reflect the arrow of time [176].
the sequence of network layers, which constitute time layers, can represent a discrete-time temporal network at different time instances or a continuous-time network in which one bins (i.e., aggregates) the network's edges to form a sequence of time windows with interactions in each window. each d-aspect multilayer network with the same number of nodes in each layer has an associated adjacency tensor a of order 2(d + 1). for unweighted multilayer networks, each edge in e m is associated with a 1 entry of a, and the other entries (the "missing" edges) are 0. if a multilayer network does not have the same number of nodes in each layer, one can add empty nodes so that it does, but the edges that are attached to such nodes are "forbidden". there has been some research on tensorial properties of a [35] (and it is worthwhile to undertake further studies of them), but the most common approach for computations is to flatten a into a "supra-adjacency matrix" a m [90, 140] , which is the adjacency matrix of the graph g m that is associated with m. the entries of diagonal blocks of a m correspond to intralayer edges, and the entries of off-diagonal blocks correspond to interlayer edges. following a long line of research in sociology [37] , two important ingredients in the study of networks are examining (1) the importances ("centralities") of nodes, edges, and other small network structures and the relationship of measures of importance to dynamical processes on networks and (2) the large-scale organization of networks [121, 193] . studying central nodes in networks is useful for numerous applications, such as ranking web pages, football teams, or physicists [56] . it can also help reveal the roles of nodes in networks, such as those that experience high traffic or help bridge different parts of a network [121, 193] . mesoscale features can impact network function and dynamics in important ways. 
small subgraphs called "motifs" may appear frequently in some networks [111], perhaps indicating fundamental structures such as feedback loops and other building blocks of global behavior [59]. various types of larger-scale network structures, such as dense "communities" of nodes [47, 145] and core-periphery structures [33, 150], are also sometimes related to dynamical modules (e.g., a set of synchronized neurons) or functional modules (e.g., a set of proteins that are important for a certain regulatory process) [164]. a common way to study large-scale structures is inference using statistical models of random networks, such as through stochastic block models (sbms) [134]. much recent research has generalized the study of large-scale network structure to temporal and multilayer networks [3, 74, 90]. various types of centrality - including betweenness centrality [88, 173], bonacich and katz centrality [65, 102], communicability [64], pagerank [151, 191], and eigenvector centrality [46, 146] - have been generalized to temporal networks using a variety of approaches. such generalizations make it possible to examine how node importances change over time as network structure evolves. in recent work, my collaborators and i used multilayer representations of temporal networks to generalize eigenvector-based centralities to temporal networks [175, 176]. one computes the eigenvector-based centralities of nodes for a time-independent network as the entries of the "dominant" eigenvector, which is associated with the largest positive eigenvalue (by the perron-frobenius theorem, the eigenvalue with the largest magnitude is guaranteed to be positive in these situations) of a centrality matrix c(a). examples include eigenvector centrality (by using c(a) = a) [17], hub and authority scores (by using c(a) = aa^t for hubs and a^t a for authorities) [91], and pagerank [56].
given a discrete-time temporal network in the form of a sequence of adjacency matrices a^(t), where the entry a^(t)_ij denotes a directed edge from entity i to entity j in time layer t, we construct a "supracentrality matrix" c(ω), which couples centrality matrices c(a^(t)) of the individual time layers. we then compute the dominant eigenvector of c(ω), where ω is an interlayer coupling strength. in [175, 176], a key example was the ranking of doctoral programs in the mathematical sciences (using data from the mathematics genealogy project [147]), where an edge from one institution to another arises when someone with a ph.d. from the first institution supervises a ph.d. student at the second institution. by calculating time-dependent centralities, we can study how the rankings of mathematical-sciences doctoral programs change over time and the dependence of such rankings on the value of ω. larger values of ω impose more ranking consistency across time, so centrality trajectories are less volatile for larger ω [175, 176]. multilayer representations of temporal networks have been very insightful in the detection of communities and how they split, merge, and otherwise evolve over time. numerous methods for community detection - including inference via sbms [135], maximization of objective functions (especially "modularity") [117], and methods based on random walks and bottlenecks to their traversal of a network [38, 80] - have been generalized from graphs to multilayer networks. they have yielded insights in a diverse variety of applications, including brain networks [183], granular materials [129], political voting networks [113, 117], disease spreading [158], and ecology and animal behavior [45, 139]. to assist with such applications, there are efforts to develop and analyze multilayer random-network models that incorporate rich and flexible structures [11], such as diverse types of interlayer correlations.
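a minimal sketch of the supracentrality construction (using c(a) = a, i.e., eigenvector centrality, two hypothetical time layers on the same nodes, and uniform diagonal interlayer coupling of strength ω; the matrices are illustrative, not from any data set):

```python
import numpy as np

# two hypothetical undirected time layers on the same 3 nodes.
A_t1 = np.array([[0., 1., 1.],
                 [1., 0., 0.],
                 [1., 0., 0.]])
A_t2 = np.array([[0., 1., 0.],
                 [1., 0., 1.],
                 [0., 1., 0.]])
omega = 0.5  # interlayer coupling strength

# supracentrality matrix C(omega): centrality matrices C(A^(t)) = A^(t)
# on the diagonal blocks, omega-weighted identities coupling each entity
# to itself in the adjacent time layer on the off-diagonal blocks.
I = np.eye(3)
C = np.block([[A_t1,      omega * I],
              [omega * I, A_t2     ]])

# dominant eigenvector (C is symmetric here, so its eigenvalues are real;
# np.linalg.eigh returns eigenvalues in ascending order).
eigvals, eigvecs = np.linalg.eigh(C)
v = np.abs(eigvecs[:, -1])          # joint node-time centralities
joint_centrality = v.reshape(2, 3)  # row t: centralities of nodes in layer t
```

increasing omega couples the layers more strongly, which tends to make each node's centrality more consistent across the two layers, in line with the ranking-consistency behavior described in the text.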
activity-driven (ad) models of temporal networks [136] are a popular family of generative models that encode instantaneous time-dependent descriptions of network dynamics through a function called an "activity potential", which encodes the mechanism to generate connections and characterizes the interactions between entities in a network. an activity potential encapsulates all of the information about the temporal network dynamics of an ad model, making it tractable to study dynamical processes (such as ones from section 4) on networks that are generated by such a model. it is also common to compare the properties of networks that are generated by ad models to those of empirical temporal networks [74]. in the original ad model of perra et al. [136], one considers a network with n entities, which we encode by the nodes. we suppose that node i has an activity rate a_i = ηx_i, which gives the probability per unit time to create new interactions with other nodes. the scaling factor η ensures that the mean number of active nodes per unit time is η⟨x⟩n. we define the activity rates such that x_i ∈ [ε, 1], where ε > 0, and we assign each x_i from a probability distribution f(x) that can either take a desired functional form or be constructed from empirical data. the model uses the following generative process:
• at each discrete time step (of length ∆t), start with a network g_t that consists of n isolated nodes.
• with a probability a_i ∆t that is independent of other nodes, node i is active and generates m edges, each of which attaches to other nodes uniformly (i.e., with the same probability for each node) and independently at random (without replacement). nodes that are not active can still receive edges from active nodes.
• at the next time step t + ∆t, we delete all edges from g_t, so all interactions have a constant duration of ∆t. we then generate new interactions from scratch. this is convenient, as it allows one to apply techniques from markov chains.
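the generative process above can be sketched as follows (the parameter values and the uniform distribution for the x_i are illustrative choices, not those of any particular study):

```python
import random

random.seed(0)

n = 100          # number of nodes
m = 2            # edges generated by each active node
eta = 10.0       # rescaling factor in a_i = eta * x_i
dt = 0.001       # time-step length
epsilon = 1e-3   # lower cutoff for x_i

# draw x_i uniformly from [epsilon, 1] (in general, x_i follows some F(x)).
x = [random.uniform(epsilon, 1.0) for _ in range(n)]
a = [eta * xi for xi in x]  # activity rates

def ad_step():
    """generate the edge set for one time step of the ad model; edges are
    discarded before the next step, so the model has no memory."""
    edges = set()
    for i in range(n):
        if random.random() < a[i] * dt:  # node i activates
            # attach m edges to distinct other nodes, uniformly at random
            targets = random.sample([j for j in range(n) if j != i], m)
            for j in targets:
                edges.add((min(i, j), max(i, j)))  # store undirected edges
    return edges

edges_t = ad_step()
```

calling `ad_step` once per time step yields an independent sequence of sparse instantaneous networks, which is what makes markov-chain techniques applicable.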
because entities in time step t do not have any memory of previous time steps, f(x) encodes the network structure and dynamics. the ad model of perra et al. [136] is overly simplistic, but it is amenable to analysis and has provided a foundation for many more general ad models, including ones that incorporate memory [200]. in section 6.4, i discuss a generalization of ad models to simplicial complexes [137] that allows one to study instantaneous interactions that involve three or more entities in a network. many networked systems evolve continuously in time, but most investigations of time-dependent networks rely on discrete or discretized time. it is important to undertake more analysis of continuous-time temporal networks. researchers have examined continuous-time networks in a variety of scenarios. examples include a compartmental model of biological contagions [185], a generalization of katz centrality to continuous time [65], generalizations of ad models (see section 3.1.3) to continuous time [198, 199], and rankings in competitive sports [115]. in a recent paper [2], my collaborators and i formulated a notion of "tie-decay networks" for studying networks that evolve in continuous time. we distinguished between interactions, which we modeled as discrete contacts, and ties, which encode relationships and their strength as a function of time. for example, perhaps the strength of a tie decays exponentially after the most recent interaction. more realistically, perhaps the decay rate depends on the weight of a tie, with strong ties decaying more slowly than weak ones. one can also use point-process models like hawkes processes [99] to examine similar ideas using a node-centric perspective. suppose that there are n interacting entities, and let b(t) be the n × n time-dependent, real, non-negative matrix whose entries b_ij(t) encode the tie strength between agents i and j at time t. in [2], we made the following simplifying assumptions: 1.
as in [81] , ties decay exponentially when there are no interactions: where α ≥ 0 is the decay rate. 2. if two entities interact at time t = τ, the strength of the tie between them grows instantaneously by 1. see [201] for a comparison of various choices, including those in [2] and [81] , for tie evolution over time. in practice (e.g., in data-driven applications), one obtains b(t) by discretizing time, so let's suppose that there is at most one interaction during each time step of length ∆t. this occurs, for example, in a poisson process. such time discretization is common in the simulation of stochastic dynamical systems, such as in gillespie algorithms [41, 142, 189] . consider an n × n matrix a(t) in which a i j (t) = 1 if node i interacts with node j at time t and a i j (t) = 0 otherwise. for a directed network, a(t) has exactly one nonzero entry during each time step when there is an interaction and no nonzero entries when there isn't one. for an undirected network, because of the symmetric nature of interactions, there are exactly two nonzero entries in time steps that include an interaction. we write equivalently, if interactions between entities occur at times τ ( ) such that 0 ≤ τ (0) < τ (1) < . . . < τ (t ) , then at time t ≥ τ (t ) , we have in [2] , my coauthors and i generalized pagerank [20, 56] to tie-decay networks. one nice feature of their tie-decay pagerank is that it is applicable not just to data sets, but also to data streams, as one updates the pagerank values as new data arrives. by contrast, one problematic feature of many methods that rely on multilayer representations of temporal networks is that one needs to recompute everything for an entire data set upon acquiring new data, rather than updating prior results in a computationally efficient way. a dynamical process can be discrete, continuous, or some mixture of the two; it can also be either deterministic or stochastic. 
it can take the form of one or several coupled ordinary differential equations (odes), partial differential equations (pdes), maps, stochastic differential equations, and so on. a dynamical process requires a rule for updating the states of its dependent variables with respect to one or more independent variables (e.g., time), and one also has (one or a variety of) initial conditions and/or boundary conditions. to formalize a dynamical process on a network, one needs a rule for how to update the states of the nodes and/or edges. the nodes (of one or more types) of a network are connected to each other in nontrivial ways by one or more types of edges. this leads to a natural question: how does nontrivial connectivity between nodes affect dynamical processes on a network [142]? when studying a dynamical process on a network, the network structure encodes which entities (i.e., nodes) of a system interact with each other and which do not. if desired, one can ignore the network structure entirely and just write out a dynamical system. however, keeping track of network structure is often a very useful and insightful form of bookkeeping, which one can exploit to systematically explore how particular structures affect the dynamics of particular dynamical processes. prominent examples of dynamical processes on networks include coupled oscillators [6, 149], games [78], and the spread of diseases [89, 130] and opinions [23, 100]. there is also a large body of research on the control of dynamical processes on networks [103, 116]. most studies of dynamics on networks have focused on extending familiar models - such as compartmental models of biological contagions [89] or kuramoto phase oscillators [149] - by coupling entities using various types of network structures, but it is also important to formulate new dynamical processes from scratch, rather than only studying more complicated generalizations of our favorite models.
when trying to illuminate the effects of network structure on a dynamical process, it is often insightful to provide a baseline comparison by examining the process on a convenient ensemble of random networks [142] . a simple, but illustrative, dynamical process on a network is the watts threshold model (wtm) of a social contagion [100, 142] . it provides a framework for illustrating how network structure can affect state changes, such as the adoption of a product or a behavior, and for exploring which scenarios lead to "virality" (in the form of state changes of a large number of nodes in a network). the original wtm [194] , a binary-state threshold model that resembles bootstrap percolation [24] , has a deterministic update rule, so stochasticity can come only from other sources (see section 4.2). in a binary state model, each node is in one of two states; see [55] for a tabulation of well-known binary-state dynamics on networks. the wtm is a modification of mark granovetter's threshold model for social influence in a fully-mixed population [62] . see [86, 186] for early work on threshold models on networks that developed independently from investigations of the wtm. threshold contagion models have been developed for many scenarios, including contagions with multiple stages [109] , models with adoption latency [124] , models with synergistic interactions [83] , and situations with hipsters (who may prefer to adopt a minority state) [84] . in a binary-state threshold model such as the wtm, each node i has a threshold r i that one draws from some distribution. suppose that r i is constant in time, although one can generalize it to be time-dependent. at any time, each node can be in one of two states: 0 (which represents being inactive, not adopted, not infected, and so on) or 1 (active, adopted, infected, and so on). 
a binary-state model is a drastic oversimplification of reality, but the wtm is able to capture two crucial features of social systems [125]: interdependence (an entity's behavior depends on the behavior of other entities) and heterogeneity (as nodes with different threshold values behave differently). one can assign a seed number or seed fraction of nodes to the active state, and one can choose the initially active nodes either deterministically or randomly. the states of the nodes change in time according to an update rule, which can either be synchronous (such that it is a map) or asynchronous (e.g., as a discretization of continuous time) [142]. in the wtm, the update rule is deterministic, so this choice affects only how long it takes to reach a steady state; it does not affect the steady state itself. with a stochastic update rule, the synchronous and asynchronous versions of ostensibly the "same" model can behave in drastically different ways [43]. in the wtm on an undirected network, to update the state of a node, one compares its fraction s_i/k_i of active neighbors (where s_i is the number of active neighbors and k_i is the degree of node i) to the node's threshold r_i. an inactive node i becomes active (i.e., it switches from state 0 to state 1) if s_i/k_i ≥ r_i; otherwise, it stays inactive. the states of nodes in the wtm are monotonic, in the sense that a node that becomes active remains active forever. this feature is convenient for deriving accurate approximations for the global behavior of the wtm using branching-process approximations [55, 142] or when analyzing the behavior of the wtm using tools such as persistent homology [174]. a dynamical process on a network can take the form of a stochastic process [121, 142].
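the wtm update rule can be sketched as follows (the network, thresholds, and seed are hypothetical):

```python
import numpy as np

# hypothetical undirected network: a path 0-1-2-3, with node thresholds r_i
# and node 0 seeded in the active state.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
r = np.array([0.4, 0.5, 0.5, 0.5])  # thresholds
state = np.array([1, 0, 0, 0])      # 1 = active, 0 = inactive

def wtm_sync_update(A, state, r):
    """synchronous wtm update: an inactive node i activates when the
    fraction s_i / k_i of its active neighbors reaches its threshold r_i;
    active nodes stay active (monotonic dynamics)."""
    k = A.sum(axis=1)  # degrees
    s = A @ state      # numbers of active neighbors
    return np.maximum(state, (s / k >= r).astype(int))

while True:
    new_state = wtm_sync_update(A, state, r)
    if np.array_equal(new_state, state):
        break          # steady state reached
    state = new_state
```

because the update rule is deterministic and monotonic, iterating until the state stops changing yields the steady state; here the activation spreads along the whole path from the seed.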
there are several possible sources of stochasticity: (1) choice of initial condition, (2) choice of which nodes or edges to update (when considering asynchronous updating), (3) the rule for updating nodes or edges, (4) the values of parameters in an update rule, and (5) selection of particular networks from a random-graph ensemble (i.e., a probability distribution on graphs). some or all of these sources of randomness can be present when studying dynamical processes on networks. it is desirable to compare the sample mean of a stochastic process on a network to an ensemble average (i.e., to an expectation over a suitable probability distribution). prominent examples of stochastic processes on networks include percolation [153] , random walks [107] , compartment models of biological contagions [89, 130] , bounded-confidence models with continuous-valued opinions [110] , and other opinion and voter models [23, 100, 142, 148] . compartmental models of biological contagions are a topic of intense interest in network science [89, 121, 130, 142] . a compartment represents a possible state of a node; examples include susceptible, infected, zombified, vaccinated, and recovered. an update rule determines how a node changes its state from one compartment to another. one can formulate models with as many compartments as desired [18] , but investigations of how network structure affects dynamics typically have employed examples with only two or three compartments [89, 130] . researchers have studied various extensions of compartmental models, contagions on multilayer and temporal networks [4, 34, 90] , metapopulation models on networks [30] for simultaneously studying network connectivity and subpopulations with different characteristics, non-markovian contagions on networks for exploring memory effects [188] , and explicit incorporation of individuals with essential societal roles (e.g., health-care workers) [161] . 
as i discuss in section 4.4, one can also examine coupling between biological contagions and the spread of information (e.g., "awareness") [50, 192]. one can also use compartmental models to study phenomena, such as dissemination of ideas on social media [58] and forecasting of political elections [190], that are much different from the spread of diseases. one of the most prominent examples of a compartmental model is a susceptible-infected-recovered (sir) model, which has three compartments. susceptible nodes are healthy and can become infected, and infected nodes can eventually recover. the steady state of the basic sir model on a network is related to a type of bond percolation [63, 68, 87, 181]. there are many variants of sir models and other compartmental models on networks [89]. see [114] for an illustration using susceptible-infected-susceptible (sis) models. suppose that an infection is transmitted from an infected node to a susceptible neighbor at a rate of λ. the probability of a transmission event on one edge between an infected node and a susceptible node in an infinitesimal time interval dt is λ dt. assuming that all infection events are independent, the probability that a susceptible node with s infected neighbors becomes infected (i.e., for a node to transition from the s compartment to the i compartment, which represents both being infected and being infective) during dt is
1 − (1 − λ dt)^s .
if an infected node recovers at a constant rate of µ, the probability that it switches from state i to state r in an infinitesimal time interval dt is µ dt. when there is no source of stochasticity, a dynamical process on a network is "deterministic". a deterministic dynamical system can take the form of a system of coupled maps, odes, pdes, or something else. as with stochastic systems, the network structure encodes which entities of a system interact with each other and which do not.
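a discrete-time stochastic simulation that uses these transition probabilities can be sketched as follows (the network, parameter values, and time step are hypothetical; a small dt makes the discrete-time approximation of the continuous-time rates reasonable):

```python
import random

random.seed(1)

lam = 0.3  # per-edge transmission rate (lambda)
mu = 0.1   # recovery rate
dt = 0.01  # time-step length

def sir_step(states, neighbors):
    """one discrete-time step of a stochastic sir model on a network.
    states[i] is 'S', 'I', or 'R'; neighbors is an adjacency list.
    a susceptible node with s infected neighbors becomes infected with
    probability 1 - (1 - lam * dt) ** s; an infected node recovers with
    probability mu * dt."""
    new = list(states)
    for i, state in enumerate(states):
        if state == 'S':
            s = sum(1 for j in neighbors[i] if states[j] == 'I')
            if random.random() < 1 - (1 - lam * dt) ** s:
                new[i] = 'I'
        elif state == 'I':
            if random.random() < mu * dt:
                new[i] = 'R'
    return new

# hypothetical 4-node path graph with node 0 initially infected.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
states = ['I', 'S', 'S', 'S']
for _ in range(1000):
    states = sir_step(states, neighbors)
```

in the dt → 0 limit, this update scheme recovers the continuous-time rates; one can instead use an event-driven (gillespie-type) simulation to avoid discretization error.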
there are numerous interesting deterministic dynamical systems on networks (just incorporate nontrivial connectivity between entities into your favorite deterministic model), although it is worth noting that some stochastic features (e.g., choosing parameter values from a probability distribution or sampling choices of initial conditions) can arise in these models. for concreteness, let's consider the popular setting of coupled oscillators. each node in a network is associated with an oscillator, and we want to examine how network structure affects the collective behavior of the coupled oscillators. it is common to investigate various forms of synchronization (a type of coherent behavior), such that the rhythms of the oscillators adjust to match each other (or to match a subset of the oscillators) because of their interactions [138]. a variety of methods, such as "master stability functions" [132], have been developed to study the local stability of synchronized states and their generalizations [6, 142], such as cluster synchrony [133]. cluster synchrony, which is related to work on "coupled-cell networks" [59], uses ideas from computational group theory to find synchronized sets of oscillators that are not synchronized with other sets of synchronized oscillators. many studies have also examined other types of states, such as "chimera states" [128], in which some oscillators behave coherently but others behave incoherently. (analogous phenomena sometimes occur in mathematics departments.) a ubiquitous example is coupled kuramoto oscillators on a network [6, 39, 149], which is perhaps the most common setting for exploring and developing new methods for studying coupled oscillators. (in principle, one can then build on these insights in studies of other oscillatory systems, such as in applications in neuroscience [7].) coupled kuramoto oscillators have been used for modeling numerous phenomena, including jetlag [104] and singing in frogs [126].
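as a minimal numerical sketch (not any particular study's setup), one can integrate all-to-all coupled kuramoto oscillators with forward-euler time stepping and monitor the standard order parameter, which quantifies phase coherence:

```python
import cmath
import math
import random

random.seed(42)

n = 10
K = 2.0    # uniform coupling strength; b_ij = K / n is an illustrative choice
dt = 0.01  # forward-euler time step

omega = [random.gauss(0.0, 1.0) for _ in range(n)]               # natural frequencies
theta = [random.uniform(0.0, 2.0 * math.pi) for _ in range(n)]   # initial phases

def kuramoto_step(theta):
    """one forward-euler step of
    dtheta_i/dt = omega_i + (K/n) * sum_j sin(theta_j - theta_i)."""
    return [
        (theta[i] + dt * (omega[i] + (K / n) * sum(
            math.sin(theta[j] - theta[i]) for j in range(n)
        ))) % (2.0 * math.pi)
        for i in range(n)
    ]

for _ in range(2000):
    theta = kuramoto_step(theta)

# order parameter r in [0, 1]: r near 1 indicates phase coherence.
r = abs(sum(cmath.exp(1j * t) for t in theta)) / n
```

replacing the all-to-all coupling with adjacency-matrix entries a_ij restricts the interactions to a network, which is the setting discussed in the text.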
indeed, a "snowbird" (siam) conference on applied dynamical systems would not be complete without at least several dozen talks on the kuramoto model. in the kuramoto model, each node i has an associated phase θ_i(t) ∈ [0, 2π). in the case of "diffusive" coupling between the nodes, the dynamics of the ith node is governed by the equation
dθ_i/dt = ω_i + ∑_j b_ij a_ij f_ij(θ_j − θ_i) ,    (4)
where one typically draws the natural frequency ω_i of node i from some distribution g(ω), the scalar a_ij is an adjacency-matrix entry of an unweighted network, b_ij is the coupling strength on oscillator i from oscillator j (so b_ij a_ij is an element of an adjacency matrix w of a weighted network), and f_ij(y) = sin(y) is the coupling function, which depends only on the phase difference between oscillators i and j because of the diffusive nature of the coupling. once one knows the natural frequencies ω_i, the model (4) is a deterministic dynamical system, although there have been studies of coupled kuramoto oscillators with additional stochastic terms [60]. traditional studies of (4) and its generalizations draw the natural frequencies from some distribution (e.g., a gaussian or a compactly supported distribution), but some studies of so-called "explosive synchronization" (in which there is an abrupt phase transition from incoherent oscillators to synchronized oscillators) have employed deterministic natural frequencies [16, 39]. the properties of the frequency distribution g(ω) have a significant effect on the dynamics of (4). important features of g(ω) include whether it has compact support or not, whether it is symmetric or asymmetric, and whether it is unimodal or not [149, 170]. the model (4) has been generalized in numerous ways. for example, researchers have considered a large variety of coupling functions f_ij (including ones that are not diffusive) and have incorporated an inertia term (a second-derivative term in θ_i) to yield a second-order kuramoto oscillator at each node [149].
the latter generalization is important for studies of coupled oscillators and synchronized dynamics in electric power grids [196]. another noteworthy direction is the analysis of the kuramoto model on "graphons" (see, e.g., [108]), an important type of structure that arises in a suitable limit of large networks. an increasingly prominent topic in network analysis is the examination of how multilayer network structures - multiple system components, multiple types of edges, co-occurrence and coupling of multiple dynamical processes, and so on - affect qualitative and quantitative dynamics [3, 34, 90]. for example, can certain types of multilayer structures induce unexpected instabilities or phase transitions in certain types of dynamical processes? there are two categories of dynamical processes on multilayer networks: (1) a single process can occur on a multilayer network; or (2) processes on different layers of a multilayer network can interact with each other [34]. an important example of the first category is a random walk, where the relative speeds and probabilities of steps within layers versus steps between layers affect the qualitative nature of the dynamics. this, in turn, affects methods (such as community detection [38, 80]) that are based on random walks, as well as anything else in which the diffusion is relevant [22, 36]. two other examples of the first category are the spread of information on social media (for which there are multiple communication channels, such as facebook and twitter) and multimodal transportation systems [51]. for instance, a multilayer network structure can induce congestion even when a system without coupling between layers is decongested in each layer independently [1]. examples of the second category of dynamical process are interactions between multiple strains of a disease and interactions between the spread of disease and the spread of information [49, 50, 192].
many other examples have been studied [3], including coupling between oscillator dynamics on one layer and a biased random walk on another layer (as a model for neuronal oscillations coupled to blood flow) [122]. numerous interesting phenomena can occur when dynamical systems, such as spreading processes, are coupled to each other [192]. for example, the spreading of one disease can facilitate infection by another [157], and the spread of awareness about a disease can inhibit spread of the disease itself (e.g., if people stay home when they are sick) [61]. interacting spreading processes can also exhibit other fascinating dynamics, such as oscillations that are induced by multilayer network structures in a biological contagion with multiple modes of transmission [79] and novel types of phase transitions [34]. a major simplification in most work thus far on dynamical processes on multilayer networks is a tendency to focus on toy models. for example, a typical study of coupled spreading processes may consider a standard (e.g., sir) model on each layer, and it may draw the connectivity pattern of each layer from the same standard random-graph model (e.g., an erdős-rényi model or a configuration model). however, when studying dynamics on multilayer networks, it is particularly important in future work to incorporate heterogeneity in network structure and/or dynamical processes. for instance, diseases spread offline but information spreads both offline and online, so investigations of coupled information and disease spread ought to consider fundamentally different types of network structures for the two processes. network structures also affect the dynamics of pdes on networks [8, 31, 57, 77, 112]. interesting examples include a study of a burgers equation on graphs to investigate how network structure affects the propagation of shocks [112] and investigations of reaction-diffusion equations and turing patterns on networks [8, 94].
the latter studies exploit the rich theory of laplacian dynamics on graphs (and concomitant ideas from spectral graph theory) [107, 187] and examine the addition of nonlinear terms to laplacians on various types of networks (including multilayer ones). a mathematically oriented thread of research on pdes on networks has built on ideas from so-called "quantum graphs" [57, 96] to study wave propagation on networks through the analysis of "metric graphs". metric graphs differ from the usual "combinatorial graphs", which in other contexts are usually called simply "graphs". in metric graphs, in addition to nodes and edges, each edge $e$ has a positive length $\ell_e \in (0, \infty]$. for many experimentally relevant scenarios (e.g., in models of circuits of quantum wires [195] ), there is a natural embedding into space, but metric graphs that are not embedded in space are also appropriate for some applications. as the nomenclature suggests, one can equip a metric graph with a natural metric. if a sequence $\{e_j\}_{j=1}^{m}$ of edges forms a path, the length of the path is $\sum_{j} \ell_{e_j}$. the distance $\rho(v_1, v_2)$ between two nodes, $v_1$ and $v_2$, is the minimum path length between them. we place coordinates along each edge, so we can compute a distance between points $x_1$ and $x_2$ on a metric graph even when those points are not located at nodes. traditionally, one assumes that the infinite ends (which one can construe as "leads" at infinity, as in scattering theory) of infinite edges have degree 1. it is also traditional to assume that there is always a positive distance between distinct nodes and that there are no finite-length paths with infinitely many edges. see [96] for further discussion. to study waves on metric graphs, one needs to define operators, such as the negative second derivative or more general schrödinger operators. this exploits the fact that there are coordinates for all points on the edges - not only at the nodes themselves, as in combinatorial graphs. 
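to make the notion of distances between arbitrary points concrete, here is a minimal sketch (my own construction, with invented edge lengths) that computes the distance between two points lying part-way along edges of a small metric graph, by combining within-edge offsets with shortest node-to-node path lengths from dijkstra's algorithm.

```python
# hypothetical metric graph: each edge has endpoints (u, v) and a length l_e;
# a point is a pair (edge_name, offset from endpoint u).
import heapq

edges = {
    "a": (0, 1, 2.0),
    "b": (1, 2, 1.5),
    "c": (0, 2, 4.0),
}
nodes = {0, 1, 2}

def node_dist(src):
    """dijkstra over the underlying combinatorial graph, weighted by edge lengths."""
    adj = {n: [] for n in nodes}
    for (u, v, l) in edges.values():
        adj[u].append((v, l))
        adj[v].append((u, l))
    dist = {n: float("inf") for n in nodes}
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist[n]:
            continue
        for m, w in adj[n]:
            if d + w < dist[m]:
                dist[m] = d + w
                heapq.heappush(heap, (d + w, m))
    return dist

def point_dist(p1, p2):
    """distance between two points on the metric graph."""
    (e1, x1), (e2, x2) = p1, p2
    u1, v1, l1 = edges[e1]
    u2, v2, l2 = edges[e2]
    best = abs(x1 - x2) if e1 == e2 else float("inf")  # stay on the shared edge
    for a, da in ((u1, x1), (v1, l1 - x1)):            # leave via either endpoint
        d = node_dist(a)
        for b, db in ((u2, x2), (v2, l2 - x2)):
            best = min(best, da + d[b] + db)
    return best
```

for points at nodes, this reduces to the ordinary weighted shortest-path distance.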
when studying waves on metric graphs, it is also necessary to impose boundary conditions at the nodes [96] . many studies of wave propagation on metric graphs have considered generalizations of nonlinear wave equations, such as the cubic nonlinear schrödinger (nls) equation [123] and a nonlinear dirac equation [154] . the overwhelming majority of studies of waves on metric graphs (both linear and nonlinear) have focused on networks with a very small number of nodes, as even small networks yield very interesting dynamics. for example, marzuola and pelinovsky [106] analyzed symmetry-breaking and symmetry-preserving bifurcations of standing waves of the cubic nls equation on a dumbbell graph (with two rings attached to a central line segment and kirchhoff boundary conditions at the nodes). kairzhan et al. [85] studied the spectral stability of half-soliton standing waves of the cubic nls equation on balanced star graphs. sobirov et al. [168] studied scattering and transmission at nodes of sine-gordon solitons on networks (e.g., on a star graph and a small tree). a particularly interesting direction for future work is to study wave dynamics on large metric graphs. this will help extend investigations, as in odes and maps, of how network structures affect dynamics on networks to the realm of linear and nonlinear waves. one can readily formulate wave equations on large metric graphs by specifying relevant boundary conditions and rules at each junction. for example, joly et al. [82] recently examined propagation governed by the standard linear wave equation on fractal trees. because many natural real-life settings are spatially embedded (e.g., wave propagation in granular materials [101, 129] and traffic-flow patterns in cities), it will be particularly valuable to examine wave dynamics on (both synthetic and empirical) spatially-embedded networks [9] . 
therefore, i anticipate that it will be very insightful to undertake studies of wave dynamics on networks such as random geometric graphs, random neighborhood graphs, and other spatial structures. a key question in network analysis is how different types of network structure affect different types of dynamical processes [142] , and the ability to take a limit as model synthetic networks become infinitely large (i.e., a thermodynamic limit) is crucial for obtaining many key theoretical insights. dynamics of networks and dynamics on networks do not occur in isolation; instead, they are coupled to each other. researchers have studied the coevolution of network structure and the states of nodes and/or edges in the context of "adaptive networks" (which are also known as "coevolving networks") [66, 159] . whether it is sensible to study a dynamical process on a time-independent network, a temporal network with frozen (or no) node or edge states, or an adaptive network depends on the relative time scales of the dynamics of network structure and the states of nodes and/or edges of a network. see [142] for a brief discussion. models in the form of adaptive networks provide a promising mechanistic approach to simultaneously explain both structural features (e.g., degree distributions) and temporal features (e.g., burstiness) of empirical data [5] . incorporating adaptation into conventional models can produce extremely interesting and rich dynamics, such as the spontaneous development of extreme states in opinion models [160] . most studies of adaptive networks that include some analysis (i.e., that go beyond numerical computations) have employed rather artificial adaptation rules for adding, removing, and rewiring edges. 
this is relevant for mathematical tractability, but it is important to go beyond these limitations by considering more realistic types of adaptation and coupling between network structure (including multilayer structures, as in [12] ) and the states of nodes and edges. when people are sick, they stay home from work or school. people also form and remove social connections (both online and offline) based on observed opinions and behaviors. to study these ideas using adaptive networks, researchers have coupled models of biological and social contagions with time-dependent networks [100, 142] . an early example of an adaptive network of disease spreading is the susceptible-infected (si) model in gross et al. [67] . in this model, susceptible nodes sometimes rewire their incident edges to "protect themselves". suppose that we have an n-node network with a constant number of undirected edges. each node is either susceptible (i.e., of type s) or infected (i.e., of type i). at each time step, for each edge between nodes of different types (a so-called "discordant edge"), the susceptible node becomes infected with probability λ. for each discordant edge, with some probability κ, the incident susceptible node breaks the edge and rewires to some other susceptible node. this is a "rewire-to-same" mechanism, to use the language from some adaptive opinion models [40, 97] . (in this model, multi-edges and self-edges are not allowed.) during each time step, infected nodes can also recover to become susceptible again. gross et al. [67] studied how the rewiring probability affects the "basic reproductive number", which measures how many secondary infections occur on average for each primary infection [18, 89, 130] . this scalar quantity determines the size of a critical infection probability λ * to maintain a stable epidemic (as determined traditionally using linear stability analysis of an endemic state). 
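a minimal simulation sketch of the rewiring mechanism just described follows. the parameter values, the initial random graph, the synchronous update scheme, and the order of rewiring versus infection within a step are all my own illustrative choices, not specifications from [67].

```python
# illustrative adaptive rewiring contagion: discordant edges either rewire
# (rewire-to-same, with probability kappa) or transmit infection (probability lam);
# infected nodes recover to susceptible. the edge count stays constant.
import random

random.seed(0)
n, lam, kappa, recovery = 50, 0.1, 0.05, 0.02  # assumed parameter values

edges = set()
while len(edges) < 100:                        # simple random initial graph
    u, v = random.sample(range(n), 2)
    edges.add((min(u, v), max(u, v)))

state = {i: "S" for i in range(n)}
for i in random.sample(range(n), 5):           # seed a few infections
    state[i] = "I"

for _ in range(200):
    for (u, v) in list(edges):
        if state[u] == state[v]:
            continue                           # only discordant edges are active
        s, i = (u, v) if state[u] == "S" else (v, u)
        if random.random() < kappa:            # susceptible node rewires to a susceptible node
            candidates = [w for w in range(n)
                          if state[w] == "S" and w != s
                          and (min(s, w), max(s, w)) not in edges]
            if candidates:
                edges.remove((u, v))
                w = random.choice(candidates)
                edges.add((min(s, w), max(s, w)))
                continue
        if random.random() < lam:              # infection along the discordant edge
            state[s] = "I"
    for node in range(n):                      # recovery: I -> S
        if state[node] == "I" and random.random() < recovery:
            state[node] = "S"
```

tracking the infected fraction while sweeping `kappa` gives a crude numerical picture of how rewiring shifts the epidemic threshold.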
a high rewiring rate can significantly increase λ * and thereby significantly reduce the prevalence of a contagion. although results like these are perhaps intuitively clear, other studies of contagions on adaptive networks have yielded potentially actionable (and arguably nonintuitive) insights. for example, scarpino et al. [161] demonstrated using an adaptive compartmental model (along with some empirical evidence) that the spread of a disease can accelerate when individuals with essential societal roles (e.g., health-care workers) become ill and are replaced with healthy individuals. opinion models [23, 142] are another type of model with many interesting adaptive variants, especially in the form of generalizations of classical voter models [148] . voter dynamics were first considered in the 1970s by clifford and sudbury [29] as a model for species competition, and the dynamical process that they introduced was dubbed "the voter model" by holley and liggett shortly thereafter [69] . voter dynamics are fun and are popular to study [148] , although it is questionable whether it is ever possible to genuinely construe voter models as models of voters [44] . holme and newman [71] undertook an early study of a rewire-to-same adaptive voter model. inspired by their research, durrett et al. [40] compared the dynamics from two different types of rewiring in an adaptive voter model. in each variant of their model, one considers an n-node network and supposes that each node is in one of two states. the network structure and the node states coevolve. pick an edge uniformly at random. if this edge is discordant, then with probability 1 − κ, one of its incident nodes adopts the opinion state of the other. 
otherwise, with complementary probability κ, a rewiring action occurs: one removes the discordant edge, and one of the associated nodes attaches to a new node either through a rewire-to-same mechanism (choosing uniformly at random among the nodes with the same opinion state) or through a "rewire-to-random" mechanism (choosing uniformly at random among all nodes). as with the adaptive si model in [67] , self-edges and multi-edges are not allowed. the models in [40] evolve until there are no discordant edges. there are several key questions. does the system reach a consensus (in which all nodes are in the same state)? if so, how long does it take to converge to consensus? if not, how many opinion clusters (each of which is a connected component of the final network, perhaps interpretable as an "echo chamber") are there at steady state? how long does it take to reach this state? the answers and analysis are subtle; they depend on the initial network topology, the initial conditions, and the specific choice of rewiring rule. as with other adaptive network models, researchers have developed some nonrigorous theory (e.g., using mean-field approximations and their generalizations) on adaptive voter models with simplistic rewiring schemes, but they have struggled to extend these ideas to models with more realistic rewiring schemes. there are very few mathematically rigorous results on adaptive voter models, although some do exist under various assumptions on initial network structure and edge density [10] . researchers have generalized adaptive voter models to consider more than two opinion states [163] and more general types of rewiring schemes [105] . as with other adaptive networks, analyzing adaptive opinion models with increasingly diverse types of rewiring schemes (ideally with a move towards increasing realism) is particularly important. 
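the rewire-to-random variant just described can be sketched as follows. the network size, edge count, κ, and the fixed iteration cap (instead of a rigorous check for absorption) are my own illustrative choices.

```python
# illustrative adaptive voter model with rewire-to-random: a random discordant
# edge either rewires (probability kappa) or one endpoint adopts the other's opinion.
import random

random.seed(1)
n, kappa = 30, 0.3                 # assumed parameter values
opinion = {i: random.choice([0, 1]) for i in range(n)}
edges = set()
while len(edges) < 60:
    u, v = random.sample(range(n), 2)
    edges.add((min(u, v), max(u, v)))

def discordant():
    return [e for e in edges if opinion[e[0]] != opinion[e[1]]]

steps = 0
while discordant() and steps < 50000:
    steps += 1
    u, v = random.choice(list(edges))
    if opinion[u] == opinion[v]:
        continue
    if random.random() < kappa:    # rewire-to-random (no self- or multi-edges)
        keep, drop = random.choice([(u, v), (v, u)])
        candidates = [w for w in range(n) if w != keep
                      and (min(keep, w), max(keep, w)) not in edges]
        if candidates:
            edges.remove((min(u, v), max(u, v)))
            w = random.choice(candidates)
            edges.add((min(keep, w), max(keep, w)))
    else:                          # one endpoint adopts the other's opinion
        adopter, source = random.choice([(u, v), (v, u)])
        opinion[adopter] = opinion[source]
```

counting connected components of the final graph, grouped by opinion, gives the number of opinion clusters at absorption.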
in [97] , yacoub kureh and i studied a variant of a voter model with nonlinear rewiring (where the probability that a node rewires or adopts is a function of how well it "fits in" within its neighborhood), including a "rewire-to-none" scheme to model unfriending and unfollowing in online social networks. it is also important to study adaptive opinion models with more realistic types of opinion dynamics. a promising example is adaptive generalizations of bounded-confidence models (see the introduction of [110] for a brief review of bounded-confidence models), which have continuous opinion states, with nodes interacting either with other nodes or with other entities (such as media [21] ) whose opinion is sufficiently close to theirs. a recent numerical study examined an adaptive bounded-confidence model [19] ; this is an important direction for future investigations. it is also interesting to examine how the adaptation of oscillators - including their intrinsic frequencies and/or the network structure that couples them to each other - affects the collective behavior (e.g., synchronization) of a network of oscillators [149] . such ideas are useful for exploring mechanistic models of learning in the brain (e.g., through adaptation of coupling between oscillators to produce a desired limit cycle [171] ). one nice example is by skardal et al. [167] , who examined an adaptive model of coupled kuramoto oscillators as a toy model of learning. first, we write the kuramoto system as $\dot{\theta}_i = \omega_i + \sum_{j=1}^{n} b_{ij} f_{ij}(\theta_j - \theta_i)$, where $f_{ij}$ is a $2\pi$-periodic function of the phase difference between oscillators $i$ and $j$. one way to incorporate adaptation is to define an "order parameter" $r_i$ (which, in its traditional form, quantifies the amount of coherence of the coupled kuramoto oscillators [149] ) for the $i$th oscillator by $r_i = \sum_{j=1}^{n} a_{ij} e^{\mathrm{i} \theta_j}$ and to consider a dynamical system in which the coupling strengths $b_{ij}$ adapt in response to $r_i$; in the resulting model (system (6) of [167] ), $\mathrm{re}(\zeta)$ denotes the real part of a quantity $\zeta$ and $\mathrm{im}(\zeta)$ denotes its imaginary part. 
in the model (6) of [167] , $\lambda_d$ denotes the largest positive eigenvalue of the adjacency matrix $A$, the variable $z_i(t)$ is a time-delayed version of $r_i$ with time parameter $\tau$ (with $\tau \to 0$ implying that $z_i \to r_i$), and $z_i^*$ denotes the complex conjugate of $z_i$. one draws the frequencies $\omega_i$ from some distribution (e.g., a lorentz distribution, as in [167] ), and we recall that $b_{ij}$ is the coupling strength on oscillator $i$ from oscillator $j$. the parameter $T$ gives an adaptation time scale, and $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R}$ are parameters (which one can adjust to study bifurcations). skardal et al. [167] interpreted scenarios with $\beta > 0$ as "hebbian" adaptation (see [27] ) and scenarios with $\beta < 0$ as anti-hebbian adaptation, as they observed that oscillator synchrony is promoted when $\beta > 0$ and inhibited when $\beta < 0$. most studies of networks have focused on networks with pairwise connections, in which each edge (unless it is a self-edge, which connects a node to itself) connects exactly two nodes to each other. however, many interactions - such as playing games, coauthoring papers and other forms of collaboration, and horse races - often occur between three or more entities of a network. to examine such situations, researchers have increasingly studied "higher-order" structures in networks, as they can exert a major influence on dynamical processes. perhaps the simplest way to account for higher-order structures in networks is to generalize from graphs to "hypergraphs" [121] . hypergraphs possess "hyperedges" that encode a connection among an arbitrary number of nodes, such as among all coauthors of a paper. this allows one to make important distinctions, such as between a $k$-clique (in which there are pairwise connections between each pair of nodes in a set of $k$ nodes) and a hyperedge that connects all $k$ of those nodes to each other, without the need for any pairwise connections. 
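the clique-versus-hyperedge distinction can be made concrete with a toy coauthorship example (the data are invented):

```python
# one four-author paper, encoded two ways
from itertools import combinations

authors = {"a", "b", "c", "d"}

# pairwise ("clique") encoding: one edge per pair of coauthors
clique_edges = set(combinations(sorted(authors), 2))

# hypergraph encoding: a single hyperedge containing all coauthors
hyperedge = frozenset(authors)

# the clique projection loses information: the same six edges could also arise
# from six separate two-author papers, whereas the single hyperedge records one
# four-author collaboration unambiguously.
```

this is why projecting a hypergraph onto its pairwise "clique complex" is lossy, whereas the hyperedge representation is not.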
one way to study a hypergraph is as a "bipartite network", in which nodes of a given type can be adjacent only to nodes of another type. for example, a scientist can be adjacent to a paper that they have written [119] , and a legislator can be adjacent to a committee on which they sit [144] . it is important to generalize ideas from graph theory to hypergraphs, such as by developing models of random hypergraphs [25, 26, 52] . another way to study higher-order structures in networks is to use "simplicial complexes" [53, 54, 127] . a simplicial complex is a space that is built from a union of points, edges, triangles, tetrahedra, and higher-dimensional polytopes (see fig. 1d ). simplicial complexes approximate topological spaces and thereby capture some of their properties. a $p$-dimensional simplex (i.e., a $p$-simplex) is a $p$-dimensional polytope that is the convex hull of its $p + 1$ vertices (i.e., nodes). a simplicial complex $K$ is a set of simplices such that (1) every face of a simplex in $K$ is also in $K$ and (2) the intersection of any two simplices $\sigma_1, \sigma_2 \in K$ is a face of both $\sigma_1$ and $\sigma_2$. an increasing sequence $K_1 \subset K_2 \subset \cdots \subset K_L$ of simplicial complexes forms a filtered simplicial complex; each $K_i$ is a subcomplex. as discussed in [127] and references therein, one can examine the homology of each subcomplex. in studying the homology of a topological space, one computes topological invariants that quantify features of different dimensions [53] . one studies the "persistent homology" (ph) of a filtered simplicial complex to quantify the topological structure of a data set (e.g., a point cloud) across multiple scales of such data. the goal of such "topological data analysis" (tda) is to measure the "shape" of data in the form of connected components, "holes" of various dimensionality, and so on [127] . from the perspective of network analysis, this yields insight into types of large-scale structure that complement traditional ones (such as community structure). 
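the closure condition (1) in the definition above can be checked mechanically for abstract simplicial complexes, as in this small sketch (for simplices given as vertex sets, condition (2) then holds automatically, because any intersection is a subset of each simplex and hence a face of each):

```python
# check whether a collection of vertex sets is an abstract simplicial complex
from itertools import combinations

def faces(simplex):
    """all nonempty faces of a simplex (given as a frozenset of vertices)."""
    return {frozenset(f) for r in range(1, len(simplex) + 1)
            for f in combinations(simplex, r)}

def is_simplicial_complex(simplices):
    """true iff every face of every simplex is itself in the collection."""
    s = {frozenset(x) for x in simplices}
    return all(f in s for simplex in s for f in faces(simplex))

# a filled triangle together with all of its faces is a simplicial complex...
tri = [{1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]
# ...but the bare 2-simplex without its edges and vertices is not.
bare = [{1, 2, 3}]
```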
see [178] for a friendly, nontechnical introduction to tda. a natural goal is to generalize ideas from network analysis to simplicial complexes. important efforts include generalizing configuration models of random graphs [48] to random simplicial complexes [15, 32] ; generalizing well-known network growth mechanisms, such as preferential attachment [13] ; and developing geometric notions, like curvature, for networks [156] . an important modeling issue when studying higher-order network data is the question of when it is more appropriate (or convenient) to use the formalisms of hypergraphs or simplicial complexes. the computation of ph has yielded insights on a diverse set of models and applications in network science and complex systems. examples include granular materials [95, 129] , functional brain networks [54, 165] , quantification of "political islands" in voting data [42] , percolation theory [169] , contagion dynamics [174] , swarming and collective behavior [179] , chaotic flows in odes and pdes [197] , diurnal cycles in tropical cyclones [182] , and mathematics education [28] . see the introduction to [127] for pointers to numerous other applications. most uses of simplicial complexes in network science and complex systems have focused on tda (especially the computation of ph) and its applications [127, 131, 155] . in this chapter, however, i focus instead on a somewhat different (and increasingly popular) topic: the generalization of dynamical processes on and of networks to simplicial complexes to study the effects of higher-order interactions on network dynamics. simplicial structures influence the collective behavior of the dynamics of coupled entities on networks (e.g., they can lead to novel bifurcations and phase transitions), and they provide a natural approach to analyze p-entity interaction terms, including for p ≥ 3, in dynamical systems. 
existing work includes research on linear diffusion dynamics (in the form of hodge laplacians, such as in [162] ) and generalizations of a variety of other popular types of dynamical processes on networks. given the ubiquitous study of coupled kuramoto oscillators [149] , a sensible starting point for exploring the impact of simultaneous coupling of three or more oscillators on a system's qualitative dynamics is to study a generalized kuramoto model. for example, to include both two-entity ("two-body") and three-entity interactions in a model of coupled oscillators on networks, we write [172] $\dot{x}_i = f_i(x_i) + \sum_{j=1}^{n} \sum_{k=1}^{n} w_{ijk}(x_i, x_j, x_k)$, where $f_i$ describes the intrinsic dynamics of oscillator $i$ and the three-oscillator interaction term $w_{ijk}$ includes two-oscillator interaction terms $w_{ij}(x_i, x_j)$ as a special case. reference [172] gives an example of $n$ coupled kuramoto oscillators with three-term interactions, in which one draws the coefficients $a_{ij}$, $b_{ij}$, $c_{ijk}$, $\alpha_{1,ij}$, $\alpha_{2,ij}$, $\alpha_{3,ijk}$, $\alpha_{4,ijk}$ from various probability distributions. including three-body interactions leads to a large variety of intricate dynamics, and i anticipate that incorporating the formalism of simplicial complexes will be very helpful for categorizing the possible dynamics. in the last few years, several other researchers have also studied kuramoto models with three-body interactions [92, 93, 166] . a recent study [166] , for example, discovered a continuum of abrupt desynchronization transitions with no counterpart in abrupt synchronization transitions. there have been mathematical studies of coupled oscillators with interactions of three or more entities using methods such as normal-form theory [14] and coupled-cell networks [59] . an important point, as one can see in the above discussion (which does not employ the mathematical formalism of simplicial complexes), is that one does not necessarily need to explicitly use the language of simplicial complexes to study interactions between three or more entities in dynamical systems. 
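a minimal numerical sketch of a kuramoto model with an added three-body coupling term follows. the specific interaction function sin(θ_j + θ_k − 2θ_i), the all-to-all coupling, and all parameter values are my own illustrative choices, not the particular model of [172].

```python
# euler integration of n kuramoto oscillators with pairwise coupling plus an
# illustrative three-body term sin(theta_j + theta_k - 2*theta_i)
import cmath
import math
import random

random.seed(2)
n, dt, k2, k3 = 5, 0.01, 1.0, 0.5                 # assumed parameters
omega = [random.gauss(0.0, 0.1) for _ in range(n)]
theta = [random.uniform(0.0, 2 * math.pi) for _ in range(n)]

def step(theta):
    dtheta = []
    for i in range(n):
        pair = sum(math.sin(theta[j] - theta[i]) for j in range(n))
        triple = sum(math.sin(theta[j] + theta[k] - 2 * theta[i])
                     for j in range(n) for k in range(n))
        dtheta.append(omega[i] + (k2 / n) * pair + (k3 / n ** 2) * triple)
    return [t + dt * d for t, d in zip(theta, dtheta)]

for _ in range(2000):
    theta = step(theta)

# global order parameter r in [0, 1]; values near 1 indicate synchrony
r = abs(sum(cmath.exp(1j * t) for t in theta)) / n
```

sweeping the three-body strength `k3` (including negative values) is a simple way to probe how non-pairwise terms shift the onset of synchronization.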
nevertheless, i anticipate that explicitly incorporating the formalism of simplicial complexes will be useful both for studying coupled oscillators on networks and for other dynamical systems. in upcoming studies, it will be important to determine when this formalism helps illuminate the dynamics of multi-entity interactions in dynamical systems and when simpler approaches suffice. several recent papers have generalized models of social dynamics by incorporating higher-order interactions [75, 76, 118, 137] . for example, perhaps somebody's opinion is influenced by a group discussion of three or more people, so it is relevant to consider opinion updates that are based on higher-order interactions. some of these papers use some of the terminology of simplicial complexes, but it is mostly unclear (except perhaps for [75] ) how the models in them take advantage of the associated mathematical formalism, so arguably it often may be unnecessary to use such language. nevertheless, these models are very interesting and provide promising avenues for further research. petri and barrat [137] generalized activity-driven models to simplicial complexes. such a simplicial activity-driven (sad) model generates time-dependent simplicial complexes, on which it is desirable to study dynamical processes (see section 4), such as opinion dynamics, social contagions, and biological contagions. the simplest version of the sad model is defined as follows. • each node $i$ has an activity rate $a_i$ that we draw independently from a distribution $f(x)$. • at each discrete time step (of length $\Delta t$), we start with $n$ isolated nodes. each node $i$ is active with a probability of $a_i \Delta t$, independently of all other nodes. if it is active, it creates a $(p-1)$-simplex (forming, in network terms, a clique of $p$ nodes) with $p - 1$ other nodes that we choose uniformly and independently at random (without replacement). one can either use a fixed value of $p$ or draw $p$ from some probability distribution. 
• at the next time step, we delete all edges, so all interactions have a constant duration. we then generate new interactions from scratch. this version of the sad model is markovian, and it is desirable to generalize it in various ways (e.g., by incorporating memory or community structure). iacopini et al. [76] recently developed a simplicial contagion model that generalizes an si process on graphs. consider a simplicial complex $K$ with $n$ nodes, and associate each node $i$ with a state $x_i(t) \in \{0, 1\}$ at time $t$. if $x_i(t) = 0$, node $i$ is part of the susceptible class $S$; if $x_i(t) = 1$, it is part of the infected class $I$. the density of infected nodes at time $t$ is $\rho(t) = \frac{1}{n} \sum_{i=1}^{n} x_i(t)$. suppose that there are $d$ parameters $\beta_1, \ldots, \beta_d$ (with $d \in \{1, \ldots, n-1\}$), where $\beta_d$ represents the probability per unit time that a susceptible node $i$ that participates in a $d$-dimensional simplex $\sigma$ is infected from each of the faces of $\sigma$, under the condition that all of the other nodes of the face are infected. that is, $\beta_1$ is the probability per unit time that node $i$ is infected by an adjacent node $j$ via the edge $(i, j)$. similarly, $\beta_2$ is the probability per unit time that node $i$ is infected via the 2-simplex $(i, j, k)$ in which both $j$ and $k$ are infected, and so on. the recovery dynamics, in which an infected node $i$ becomes susceptible again, proceeds as in the sir model that i discussed in section 4.2. one can envision numerous interesting generalizations of this model (e.g., ones that are inspired by ideas that have been investigated in contagion models on graphs). the study of networks is one of the most exciting and rapidly expanding areas of mathematics, and it touches on myriad other disciplines in both its methodology and its applications. network analysis is increasingly prominent in numerous fields of scholarship (both theoretical and applied), it interacts very closely with data science, and it is important for a wealth of applications. 
my focus in this chapter has been a forward-looking presentation of ideas in network analysis. my choices of which ideas to discuss reflect their connections to dynamics and nonlinearity, although i have also mentioned a few other burgeoning areas of network analysis in passing. through its exciting combination of graph theory, dynamical systems, statistical mechanics, probability, linear algebra, scientific computation, data analysis, and many other subjects - and through a comparable diversity of applications across the sciences, engineering, and the humanities - the mathematics and science of networks has plenty to offer researchers for many years.

references

congestion induced by the structure of multiplex networks
tie-decay temporal networks in continuous time and eigenvector-based centralities
multilayer networks in a nutshell
temporal and structural heterogeneities emerging in adaptive temporal networks
synchronization in complex networks
mathematical frameworks for oscillatory network dynamics in neuroscience
turing patterns in multiplex networks
morphogenesis of spatial networks
evolving voter model on dense random graphs
generative benchmark models for mesoscale structure in multilayer networks
birth and stabilization of phase clusters by multiplexing of adaptive networks
network geometry with flavor: from complexity to quantum geometry
chaos in generically coupled phase oscillator networks with nonpairwise interactions
topology of random geometric complexes: a survey
explosive transitions in complex networks' structure and dynamics: percolation and synchronization
factoring and weighting approaches to clique identification
mathematical models in population biology and epidemiology
how does active participation affect consensus: adaptive network model of opinion dynamics and influence maximizing rewiring
anatomy of a large-scale hypertextual web search engine
a model for the influence of media on the ideology of content in online social networks
frequency-based brain networks: from a multiplex network to a full multilayer description
statistical physics of social dynamics
bootstrap percolation on a bethe lattice
configuration models of random hypergraphs
annotated hypergraphs: models and applications
hebbian learning
architecture and evolution of semantic networks in mathematics texts
a model for spatial conflict
reaction-diffusion processes and metapopulation models in heterogeneous networks
multiple-scale theory of topology-driven patterns on directed networks
generalized network structures: the configuration model and the canonical ensemble of simplicial complexes
structure and dynamics of core/periphery networks
the physics of spreading processes in multilayer networks
mathematical formulation of multilayer networks
navigability of interconnected networks under random failures
identifying modular flows on multilayer networks reveals highly overlapping organization in interconnected systems
explosive phenomena in complex networks
graph fission in an evolving voter model
a practical guide to stochastic simulations of reaction-diffusion processes
persistent homology of geospatial data: a case study with voting
limitations of discrete-time approaches to continuous-time contagion dynamics
is the voter model a model for voters?
the use of multilayer network analysis in animal behaviour
on eigenvector-like centralities for temporal networks: discrete vs. continuous time scales
community detection in networks: a user guide
configuring random graph models with fixed degree sequences
nine challenges in incorporating the dynamics of behaviour in infectious diseases models
modelling the influence of human behaviour on the spread of infectious diseases: a review
anatomy and efficiency of urban multimodal mobility
random hypergraphs and their applications
elementary applied topology
two's company, three (or more) is a simplex
binary-state dynamics on complex networks: pair approximation and beyond
quantum graphs: applications to quantum chaos and universal spectral statistics
the structural virality of online diffusion
patterns of synchrony in coupled cell networks with multiple arrows
finite-size effects in a stochastic kuramoto model
dynamical interplay between awareness and epidemic spreading in multiplex networks
threshold models of collective behavior
on the critical behavior of the general epidemic process and dynamical percolation
a matrix iteration for dynamic network summaries
a dynamical systems view of network centrality
adaptive coevolutionary networks: a review
epidemic dynamics on an adaptive network
pathogen mutation modeled by competition between site and bond percolation
ergodic theorems for weakly interacting infinite systems and the voter model
modern temporal network theory: a colloquium
nonequilibrium phase transition in the coevolution of networks and opinions
temporal networks
temporal network theory
an adaptive voter model on simplicial complexes
simplicial models of social contagion
turing instability in reaction-diffusion models on complex networks
games on networks
the large graph limit of a stochastic epidemic model on a dynamic multilayer network
a local perspective on community structure in multilayer networks
structure of growing social networks
wave propagation in fractal trees
synergistic effects in threshold models on networks
hipsters on networks: how a minority group of individuals can lead to an antiestablishment majority
drift of spectrally stable shifted states on star graphs
maximizing the spread of influence through a social network
second look at the spread of epidemics on networks
centrality prediction in dynamic human contact networks
mathematics of epidemics on networks
multilayer networks
authoritative sources in a hyperlinked environment
dynamics of multifrequency oscillator communities
finite-size-induced transitions to synchrony in oscillator ensembles with nonlinear global coupling
pattern formation in multiplex networks
quantifying force networks in particulate systems
quantum graphs: i. some basic structures
fitting in and breaking up: a nonlinear version of coevolving voter models
from networks to optimal higher-order models of complex systems
hawkes processes
complex spreading phenomena in social systems: influence and contagion in real-world social networks
wave mitigation in ordered networks of granular chains
centrality metric for dynamic networks
control principles of complex networks
resynchronization of circadian oscillators and the east-west asymmetry of jet-lag
transitivity reinforcement in the coevolving voter model
ground state on the dumbbell graph
random walks and diffusion on networks
the nonlinear heat equation on dense graphs and graph limits
multi-stage complex contagions
opinion formation and distribution in a bounded-confidence model on various networks
network motifs: simple building blocks of complex networks
portrait of political polarization
six susceptible-infected-susceptible models on scale-free networks
a network-based dynamical ranking system for competitive sports
community structure in time-dependent, multiscale, and multiplex networks
multi-body interactions and non-linear consensus dynamics on networked systems
scientific collaboration networks. i. network construction and fundamental results
network structure from rich but noisy data
collective phenomena emerging from the interactions between dynamical processes in multiplex networks
nonlinear schrödinger equation on graphs: recent results and open problems
complex contagions with timers
a theory of the critical mass. i. interdependence, group heterogeneity, and the production of collective action
interaction mechanisms quantified from dynamical features of frog choruses
a roadmap for the computation of persistent homology
chimera states: coexistence of coherence and incoherence in networks of coupled oscillators
network analysis of particles and grains
epidemic processes in complex networks
topological analysis of data
master stability functions for synchronized coupled systems
cluster synchronization and isolated desynchronization in complex networks with symmetries
bayesian stochastic blockmodeling
modelling sequences and temporal networks with dynamic community structures
activity driven modeling of time varying networks
simplicial activity driven model
the multilayer nature of ecological networks
network analysis and modelling: special issue of
dynamical systems on networks: a tutorial
the role of network analysis in industrial and applied mathematics
a network analysis of committees in the u.s. house of representatives
communities in networks
spectral centrality measures in temporal networks
reality inspired voter models: a mini-review
the kuramoto model in complex networks
core-periphery structure in networks (revisited)
dynamic pagerank using evolving teleportation
memory in network flows and its effects on spreading dynamics and community detection
recent advances in percolation theory and its applications
dynamics of dirac solitons in networks
simplicial complexes and complex systems
comparative analysis of two discretizations of ricci curvature for complex networks
dynamics of interacting diseases
null models for community detection in spatially embedded, temporal networks
modeling complex systems with adaptive networks
social diffusion and global drift on networks
the effect of a prudent adaptive behaviour on disease transmission
random walks on simplicial complexes and the normalized hodge 1-laplacian
multiopinion coevolving voter model with infinitely many phase transitions
the architecture of complexity
the importance of the whole: topological data analysis for the network neuroscientist
abrupt desynchronization and extensive multistability in globally coupled oscillator simplexes
complex macroscopic behavior in systems of phase oscillators with adaptive coupling
sine-gordon solitons in networks: scattering and transmission at vertices
topological data analysis of continuum percolation with disks
from kuramoto to crawford: exploring the onset of synchronization in populations of coupled oscillators
motor primitives in space and time via targeted gain modulation in recurrent cortical networks
multistable attractors in a network of phase oscillators with three-body interactions
analysing information flows and key mediators through temporal centrality metrics
topological data analysis of contagion maps for examining spreading processes on networks
eigenvector-based centrality measures for temporal networks
supracentrality analysis of temporal
networks with directed interlayer coupling tunable eigenvector-based centralities for multiplex and temporal networks topological data analysis: one applied mathematicianõs heartwarming story of struggle, triumph, and ultimately, more struggle topological data analysis of biological aggregation models partitioning signed networks on analytical approaches to epidemics on networks using persistent homology to quantify a diurnal cycle in hurricane felix resolution limits for detecting community changes in multilayer networks analytical computation of the epidemic threshold on temporal networks epidemic threshold in continuoustime evolving networks network models of the diffusion of innovations graph spectra for complex networks non-markovian infection spread dramatically alters the susceptible-infected-susceptible epidemic threshold in networks temporal gillespie algorithm: fast simulation of contagion processes on time-varying networks forecasting elections using compartmental models of infection ranking scientific publications using a model of network traffic coupled disease-behavior dynamics on complex networks: a review social network analysis: methods and applications a simple model of global cascades on random networks braess's paradox in oscillator networks, desynchronization and power outage inferring symbolic dynamics of chaotic flows from persistence continuous-time discrete-distribution theory for activitydriven networks an analytical framework for the study of epidemic models on activity driven networks modeling memory effects in activity-driven networks models of continuous-time networks with tie decay, diffusion, and convection key: cord-017062-dkw2sugl authors: singh, indu; swami, rajan; khan, wahid; sistla, ramakrishna title: delivery systems for lymphatic targeting date: 2013-10-08 journal: focal controlled drug delivery doi: 10.1007/978-1-4614-9434-8_20 sha: doc_id: 17062 cord_uid: dkw2sugl the lymphatic system has a critical role in the immune 
system’s recognition and response to disease, and it is an additional circulatory system throughout the entire body. most solid cancers primarily spread from the main site via the tumour’s surrounding lymphatics before haematological dissemination. targeting drugs to the lymphatic system is quite complicated because of its intricate physiology; therefore, it tends to be an important target for developing novel therapeutics. currently, nanocarriers have encouraged lymphatic targeting, but there are still challenges in delivering drugs and bioactives to specific sites, maintaining the desired action and crossing all the physiological barriers. lymphatic therapy using drug-encapsulated colloidal carriers, especially liposomes and solid lipid nanoparticles, is emerging as a new technology to provide better penetration into the lymphatics where residual disease exists. by optimising the procedure, selecting the proper delivery route and target area and making use of surface engineering tools, a better carrier for the lymphotropic system can be achieved. thus, new methods of delivering drugs and other carriers to lymph nodes are currently under investigation. the lymphatic system was first recognised by gaspare aselli in 1627, and the anatomy of the lymphatic system was almost completely characterised by the early nineteenth century; however, knowledge of the blood circulation continued to grow rapidly in the last century [1]. two different theories have been proposed for the origin of the lymphatic vessels. first, the centrifugal theory of the embryologic origin of the lymphatics was described in the early twentieth century by sabin and later by lewis, postulating that lymphatic endothelial cells (lecs) are derived from the venous endothelium.
later, the centripetal theory of lymphatic development was proposed by huntington and mcclure in 1910, which describes the development of the lymphatic system beginning with lymphangioblasts, mesenchymal progenitor cells arising independently of veins; the venous connection to the lymphatic system then happens later in development [2]. the lymphatic vessels in the embryo originate at mid-gestation and develop after the cardiovascular system is fully established and functional [3]. a dual origin of lymphatic vessels from embryonic veins and mesenchymal lymphangioblasts has also been proposed [4]. recent studies provide strong support for the venous origin of lymphatic vessels [5-8]. the recent discovery of various molecular markers has allowed for more in-depth research of the lymphatic system and its role in health and disease. the lymphatic system has recently been elucidated as playing an active role in cancer metastasis, and knowledge of the active processes involved in lymphatic metastasis provides novel treatment targets for various malignancies. the lymphatic system consists of the lymphatic vessels, lymph nodes, spleen, thymus, peyer's patches and tonsils, which play important roles in immune surveillance and response. the lymphatic system serves as the body's second vascular system in vertebrates and functions co-dependently with the cardiovascular system [9, 10]. it comprises a single irreversible, open-ended transit network without a principal driving force [9] and consists of five main types of conduits: the capillaries, collecting vessels, lymph nodes, trunks and ducts. the lymphatic system originates in the dermis with initial lymphatic vessels and blind-ended lymphatic capillaries that are nearly equivalent in size to, but less abundant than, regular capillaries [9, 11].
lymphatic capillaries consist of a single layer of thin-walled, non-fenestrated lymphatic endothelial cells (lecs), similar to blood capillaries. the lecs, in contrast to blood vessels, have a poorly developed basement membrane and lack tight junctions and adherens junctions. these very porous capillaries act as a gateway for large particles, cells and interstitial fluid. particles as large as 100 nm in diameter can extravasate into the interstitial space, get phagocytosed by macrophages and are ultimately passed on to lymph nodes [11-14]. lymphatic capillary endothelial cells are affixed to the extracellular matrix by elastic anchoring filaments, which prevent vessel collapse under high interstitial pressure. these initial lymphatics, under a positive pressure gradient, distend and create an opening between loosely anchored endothelial cells, allowing the entry of lymph, a protein-rich exudate from the blood capillaries [12, 15, 16]. in initial lymphatic vessels, overlapping endothelial cell-cell contacts prevent fluid reflux back into the interstitial space [17, 18]. after its collection by the lymphatic capillaries, lymph is transported through a system of converging lymphatic vessels of progressively larger size, is filtered through lymph nodes where bacteria and particulate matter are removed and finally goes back to the blood circulation. lymph is received from the initial lymphatic capillaries by deeper collecting vessels that contain valves to maintain unidirectional flow of lymph. these collecting vessels have basement membranes and are surrounded by smooth muscle cells with intrinsic contractile activity that, in combination with contraction of surrounding skeletal muscles and arterial pulsations, propels the lymph to the lymph nodes [19-21]. the collecting lymphatic vessels unite into lymphatic trunks, and the lymph is finally returned to the venous circulation via the thoracic duct into the left subclavian vein [22, 23].
the flow of lymph toward the circulatory system is supported by increases in interstitial pressure as well as contractions of the lymphatic vessels themselves. roughly 25 l of lymphatic fluid enters the cardiovascular system each day [11]. the key functions of the lymphatic system are maintenance of normal tissue fluid balance, absorption of lipids and fat-soluble vitamins from the intestine and the transport of immune cells. lymphatics transport antigen-presenting cells as well as antigens from the interstitium of peripheral tissues to the draining lymph nodes, where they initiate immune responses via b- and t-cells in the lymph nodes [9, 12, 24, 25]. tissue fluid balance is maintained by restoring interstitial fluid to the cardiovascular system [9]. although capillaries have very low permeability to proteins, these molecules as well as other macromolecules and bacteria accumulate in the interstitium; without clearance of these large molecules, significant tissue oedema would result. the lymphatic system offers the mechanism by which these large molecules re-enter the blood circulation [26]. the lymphatic system is the site of many diseases such as tuberculosis (tb), cancer and filariasis [27]. due to the peculiar nature and anatomy of the lymphatic system, localisation of drugs in the lymphatics has been particularly difficult to achieve. the lymphatic system has an active role in cancer metastasis. although many cancers may be treated with surgical resection, microscopic disease may remain and lead to locoregional recurrence. conventional systemic chemotherapy cannot deliver drugs effectively to the lymphatic system without dose-limiting toxicities [28]. the lymphatic system's role in clearing particulate matter from the interstitium and presenting it to lymph nodes has created interest in developing microparticulate systems to target regional lymph nodes.
a molecule's composition is important in determining uptake into the lymphatics and retention within the lymph nodes. colloidal materials, for example, liposomes, activated carbon particles, emulsions, lipids and polymeric particulates, are highly taken up by the lymphatics, which is why these substances are emerging as potential carriers for lymphatic drug targeting [29]. the vast majority of drugs following oral administration are absorbed directly into portal blood, but a number of lipophilic molecules may gain access to the systemic circulation via the lymphatic pathway [30, 31]. intestinal lymphatic transport of lipophilic molecules is significant and presents benefits in a number of situations. the lymphatic system also acts as the primary systemic transport pathway for b- and t-lymphocytes as well as the main route of metastatic spread of a number of solid tumours [36, 37]. therefore, lymphatic absorption of immunomodulatory and anticancer compounds may be more effective [38, 39]. the presence of large numbers of hiv-susceptible immune cells in the lymphoid organs makes antiretroviral drug targeting to these sites of tremendous interest in hiv therapy. this strategy comprises once again targeting nanosystems to immune cell populations, particularly macrophages. evidence further suggests that lymph and lymphoid tissue, and in particular gut-associated lymphoid tissue, play a major role in the development of hiv, and antivirals which target acquired immunodeficiency syndrome (aids) may therefore be more effective when absorbed via the intestinal lymphatics [40, 41]. targeting drugs to the lymphatic system is a tough and challenging task that depends upon the intricate physiology of the lymphatic system. targeting facilitates direct contact of the drug with the specific site, decreasing the dose of the drug and minimising the side effects it causes.
currently, nanocarriers have encouraged lymphatic targeting, but there are still challenges in delivering drugs and bioactives to specific sites, maintaining the desired action and crossing all the physiological barriers. these hurdles could be overcome by the use of modified nanosystems achieved through surface engineering. the development of new methods of lymph node drug delivery stems from the growing awareness of the importance of lymph nodes in cancer prognosis, their significance for vaccine immune stimulation and the recognition that lymph nodes harbour hiv as well as other infectious diseases [47-50]. new methods of delivering drugs and other carriers to lymph nodes are currently under investigation. lymph node dissemination is the primary route of spread of the majority of solid cancers [51]. in regard to cancer metastasis, the status of the lymph node is a major determinant of the patient's prognosis, and the most important factor that determines the appropriate care of the patient is correct lymph node staging [52]. patient survival has been shown to improve with therapeutic interventions that treat metastatic cancer in lymph nodes with either surgery or local radiation therapy [53]. viraemia is an early indication of primary infection with hiv, followed by a specific hiv immune response and a dramatic decline of virus in the plasma [54]. long after this decline in the blood, hiv can be found at high levels in mononuclear cells located in lymph nodes. viral replication in these lymph nodes has been reported to be about 10- to 100-fold higher than in the peripheral blood mononuclear cells [55]. standard oral or intravenous drug delivery to these lymph node mononuclear cells is difficult [56]. even though highly active antiretroviral therapy (haart) can reduce plasma viral loads in hiv-infected patients by 90 %, active virus can still be isolated from lymph nodes after 30 months of haart therapy.
lymph nodes are a key element of the life cycle of several parasitic organisms, including filaria. the lymphatic vessels and lymph nodes of infected patients can carry adult worms. the adult filaria obstruct lymphatic drainage, resulting in swelling of the extremities distal to the infected lymph node; these symptoms of swollen limbs in patients with filarial disease have been termed elephantiasis. eradication of adult worms in lymph nodes is frequently not possible, and a much extended course of medical therapy is commonly required for treatment to be successful [57]. new methods of treating anthrax have become of pressing interest following the recent outbreak of anthrax infections and deaths in the usa as a result of terrorism. in anthrax infection, endospores from bacillus anthracis that gain access to the body are phagocytosed by macrophages and carried to regional lymph nodes, where the endospores germinate inside the macrophages and become vegetative bacteria [58]. in one report, computed tomography of the chest was performed on eight patients infected with inhalational anthrax, and mediastinal lymphadenopathy was found in seven of the eight patients [59]. in another case report, the anthrax bacillus was shown to be rapidly sterilised within the blood stream after initiation of antibiotic therapy; however, viable anthrax bacteria were still present in postmortem mediastinal lymph node specimens [60]. treatment and control of these diseases are hard to accomplish because of the limited access of drugs to mediastinal nodes using common pathways of drug delivery. also, the anatomical location of mediastinal nodes makes them a difficult target for external beam irradiation. newer methods to target antituberculosis drugs to these lymph nodes could possibly decrease the duration of drug therapy.
tb requires lengthy treatment, a minimum of approximately 6 months, probably because of the difficulty of delivering drugs into the tubercular lesions. tb infection is caused by mycobacteria that invade and grow chiefly in phagocytic cells. lymph node tb is the most common form of extrapulmonary tb, accounting for approximately 38.3 % of cases, and is frequently found to spread from the lungs to the lymph nodes. in one study, tb lymph node involvement was distributed as 71 % intrathoracic lymph nodes, 26 % cervical lymph nodes and 3 % axillary lymph nodes [61]. targeted delivery of drugs can be achieved utilising carriers with a specified affinity for the target tissue. there are two approaches to targeting, i.e. passive and active. in passive targeting, most of the carriers accumulate at the target site during continuous systemic circulation to deliver the drug substance, a behaviour that depends highly upon the physicochemical characteristics of the carriers. much effort has also been concentrated on active targeting, which involves delivering drugs more directly to the target site. passive targeting involves the transport of carriers through leaky tumour vasculature into the tumour interstitium and cells by convection or passive diffusion. the nanocarriers and drug then accumulate at the target site by the enhanced permeation and retention (epr) effect [62]. the epr effect is most prominent in cancer targeting; moreover, it is pertinent for almost all fast-growing solid tumours [63]. the epr effect will be most pronounced if nanocarriers can escape immune surveillance and circulate for a long period. very high local concentrations of drug-loaded nanocarriers can be attained at the target site, for example, about 10- to 50-fold higher than in normal tissue within 1-2 days [64].
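the size dependence of passive targeting and the epr effect can be illustrated with a small rule-based sketch. this is a toy model: the function name and the exact size thresholds are illustrative assumptions, not values taken from this chapter.

```python
# toy illustration of passive-targeting behaviour by carrier diameter
# (thresholds are illustrative assumptions, not data from the chapter)

def passive_targeting_fate(diameter_nm: float) -> str:
    """crude sketch: small molecules diffuse away, while larger
    long-circulating carriers (~100 nm) are retained by the epr effect."""
    if diameter_nm < 10:
        return "diffuses away; little epr retention"
    if diameter_nm <= 200:
        return "retained in tumour via the epr effect"
    return "cleared rapidly by the reticuloendothelial system"

for d in (5, 100, 400):
    print(f"{d} nm -> {passive_targeting_fate(d)}")
```

the point of the sketch is only the qualitative ordering: a carrier must be large enough to be retained by leaky tumour vasculature, yet small enough to escape rapid clearance.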
however, passive tumour targeting has some limitations. first, the degree of tumour vascularisation and angiogenesis is important for passive targeting of nanocarriers [65]. second, due to the poor lymphatic drainage in tumours, the interstitial fluid pressure increases, which shapes the relationship between nanocarrier size and the epr effect: larger, long-circulating nanocarriers (100 nm) are better retained in the tumour, whereas smaller molecules easily diffuse away [66]. active targeting is based upon the attachment of targeting ligands to the surface of the nanocarrier for binding to receptors expressed at the target site. the ligand binds specifically to a receptor overexpressed in particular diseased cells or tumour vasculature and not expressed by normal cells. in addition, targeted receptors should be present uniformly on all targeted cells. targeting ligands are either monoclonal antibodies (mabs) and antibody fragments or non-antibody ligands (peptidic or not); these can also be termed ligand-targeted therapeutics [67, 68]. targeting approaches for lymphatic targeting are shown in fig. 20.1. current research is focussed on two types of carriers, namely, colloidal carriers and polymeric carriers. targeting strategies for lymphatics are shown in fig. 20 . much effort has been concentrated on achieving lymphatic targeting of drugs using colloidal carriers. the physicochemical nature of the colloid itself has been shown to be of particular relevance, the main considerations being the size of the colloid and its hydrophobicity. the major purpose of lymphatic targeting is to provide effective anticancer chemotherapy that prevents the metastasis of cancer cells by accumulating the drug in the regional lymph nodes. emulsions are probably the best-known particulate carriers, with a comparatively long history of research, and have been widely used as carriers for lymph targeting. hashida et al.
demonstrated that injection of water-in-oil (w/o) or oil-in-water (o/w) emulsions favoured lymphatic transport of mitomycin c via the intraperitoneal and intramuscular routes, and uptake into the regional lymphatics was reported in the order o/w > w/o > aqueous solution. a nanoparticle-in-oil emulsion system, containing an anti-filarial drug in gelatin nanoparticles, was studied for enhancing lymphatic targeting [69]. a pirarubicin and lipiodol emulsion formulation was developed for treating gastric cancer and metastatic lymph nodes [70, 71]; after endoscopic injection of the pirarubicin-lipiodol emulsion, the drug was retained over 7 days at the injection site and in the regional lymph node. hauss et al. in their study explored the lymphotropic potential of emulsions and self-emulsifying drug delivery systems (sedds). they investigated the effects of a range of lipid-based formulations on the bioavailability and lymphatic transport of ontazolast following oral administration to conscious rats and found that all the lipid formulations increased the bioavailability of ontazolast relative to the control suspension; the sedds promoted more rapid absorption, and maximum lymphatic transport was found with the emulsion [72, 73]. lymphatic delivery of drug-encapsulated liposomal formulations has been investigated extensively in the past decade. liposomes possess ideal features for delivering therapeutic agents to the lymph nodes: their size, which prevents their direct absorption into the blood; the large amount of drugs and other therapeutic agents that liposomes can carry; and their biocompatibility. the utility of liposomes as a carrier for lymphatic delivery was first investigated by segal et al. in 1975 [74]. orally administered drug-incorporated liposomes enter the systemic circulation via the portal vein and intestinal lymphatics.
drugs entering the intestinal lymphatics through the intestinal lumen avoid the liver and first-pass metabolism, as they first migrate to lymphatic vessels and draining lymph nodes before entering the systemic circulation. lymphatic uptake of carriers via the intestinal route increases the bioavailability of a number of drugs. for oral delivery of drug-encapsulated liposomal formulations, intestinal absorbability and stability are the primary formulation concerns. ling et al. evaluated oral delivery of a poorly bioavailable hydrophilic drug, cefotaxime, in three different forms: a liposomal formulation, free drug in aqueous solution and a physical mixture of the drug and empty liposomes [75]. the liposomal formulation of the drug exhibited a 2.7-fold increase in oral bioavailability compared to the aqueous dosage and a 2.3-fold increase relative to the physical mixture. they also reported that the liposomal formulation led to a significant enhancement of the lymphatic localisation of the drug relative to the other two formulations. as a result, liposome systems emerged as useful carriers for poorly bioavailable hydrophilic drugs, promoting their transport in the intestinal lymph as well as their systemic bioavailability. conventional liposomal formulations contain anticancer drugs incorporated in them for intravenous infusion in treating various types of cancer. doxil, a chemotherapeutic formulation of pegylated liposomes of doxorubicin, is widely used as first-line therapy for aids-related kaposi's sarcoma, breast cancer, ovarian cancer and other solid tumours [76-80]. liposomal delivery of the anticancer drug actinomycin d via intratesticular injection has shown greater concentration of the drug in the local lymph nodes. furthermore, a study by hirnle et al. found liposomes to be a better carrier for intralymphatically delivered drugs compared with bleomycin emulsions [81].
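fold increases in oral bioavailability such as those reported for the liposomal cefotaxime formulation are conventionally computed as a ratio of areas under the plasma concentration-time curve (auc). a minimal sketch, with made-up auc values chosen only to reproduce a 2.7-fold ratio (they are not data from the study):

```python
# relative bioavailability as an auc ratio (hypothetical auc values)

def relative_bioavailability(auc_test: float, auc_ref: float) -> float:
    """f_rel = auc of the test formulation / auc of the reference formulation."""
    return auc_test / auc_ref

auc_liposomal, auc_aqueous = 27.0, 10.0  # hypothetical values, in ug*h/ml
print(f"{relative_bioavailability(auc_liposomal, auc_aqueous):.1f}-fold")  # prints 2.7-fold
```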
systemic liposomal chemotherapy is preferred mainly because of its reduced side effects compared to standard therapy and the improved protection of the anticancer drugs from enzymatic degradation in the systemic circulation. effective chemotherapy by the pulmonary route could overcome various shortcomings associated with systemic chemotherapy, such as serious non-targeted toxicities and poor drug penetration into the lymphatic vessels and surrounding lymph nodes, and the first-pass clearance seen with oral delivery, by concentrating drugs in the lungs and draining lymphatics. latimer et al. developed liposomes of paclitaxel and a vitamin e analogue, α-tocopheryloxy acetic acid (α-tea), in an aerosol formulation for treating murine mammary tumours and metastases [82]. similarly, lawson et al. performed a comparative study of the anti-proliferative efficacy of a 9-nitro-camptothecin (9-nc)-encapsulated dilauroylphosphatidylcholine liposomal delivery, α-tea and a combination therapy of 9-nc and α-tea in a metastatic murine mammary tumour model. liposome-encapsulated individual as well as combination treatments were delivered via an aerosol for treating metastases of the lungs and the surrounding lymph nodes. the animals treated with the combination therapy were found to have fewer proliferative cells than the animals treated with 9-nc alone when immunostained with ki-67. the in vivo anticancer efficacy studies demonstrated that the combination treatment greatly hindered tumour progression compared to each treatment alone, leading to a prolonged survival rate [83]. high levels of drugs could be targeted to lymph nodes containing tb using liposomal antituberculosis drug therapy [84]. deep lung lymphatic drainage could also be visualised using 99mtc radioactive marker-incorporated liposomes. in addition, botelho et al. delivered an aerosolised nanoradioliposomal formulation to wild boars and observed their deep lung lymphatic network and surrounding lymph nodes [85].
also, this technique has offered new information on the complicated structure of the lymphatic network and has emerged as a new, non-invasive molecular imaging technique for the diagnosis of early dissemination of lung cancers compared to conventional computed tomography. solid lipid nanoparticles (sln) could be a good formulation strategy for incorporating drugs with poor oral bioavailability due to low solubility in the gi tract or pre-systemic hepatic metabolism (first-pass effect), permitting transport into the systemic circulation through the intestinal lymphatics. bargoni et al. have performed various studies on the absorption and distribution of sln after duodenal administration [86-89]. in one study, 131 i-17-iodoheptadecanoic acid-labelled drug-free sln were delivered into the duodenal lumen of fed rats, and transmission electron microscopy and photon correlation spectroscopy results from the lymph and blood samples verified the transmucosal transport of sln [86]. in a later study of tobramycin-loaded sln after duodenal administration, the improvement in drug absorption and bioavailability was ascribed mostly to the favoured transmucosal transport of sln to the lymph compared to the blood [88]. the same group conducted a study using idarubicin-loaded sln administered via the duodenal route rather than the intravenous route and observed enhancement in drug bioavailability [89]. reddy et al.
prepared etoposide-loaded tripalmitin (etpl) sln radiolabelled with 99mtc and administered the etpl nanoparticles subcutaneously, intraperitoneally and intravenously to mice bearing dalton's lymphoma tumours. twenty-four hours after subcutaneous administration, gamma scintigraphy and radioactivity measurements showed that the etpl sln achieved a clearly higher degree of tumour uptake via the subcutaneous route (8- and 59-fold higher than the intraperitoneal and intravenous routes, respectively) and reduced accumulation in reticuloendothelial system organs [90]. targeted therapies have great potential in small cell lung cancer, considering that intrathoracic lymph node metastasis occurs in approximately 70 % of limited-stage patients and nearly 80 % of extensive-stage patients [91]. in the case of non-small cell lung cancer, extensive lymphatic metastasis is seen in greater than 80 % of stage iv patients [92]. videira et al. compared the biodistribution of inhaled 99mtc-d,l-hexamethylpropyleneamine oxime (hmpao)-radiolabelled sln with that of the free tracer administered through the same route, and gamma scintigraphic results indicated that the radiolabelled sln were primarily cleared from the lungs via the lymphatics [93, 94]. nanocapsules tend to be a most promising approach for lymphatic targeting because distinct qualities can be attained with an easy manufacturing process. nanocapsules coated with hydrophobic polymers can be easily captured by lymphatic cells when administered, because hydrophobic particles are generally recognised as foreign substances. the lymphatic targeting ability of poly(isobutylcyanoacrylate) nanocapsules encapsulating 12-(9-anthroxy) stearic acid upon intramuscular administration was evaluated and compared with three conventional colloidal carriers [69].
an in vivo study in rats showed that the poly(isobutylcyanoacrylate) nanocapsules were retained in the right iliac regional lymph nodes to a greater extent than the other colloidal carriers following intramuscular administration. for effective targeted and sustained delivery of drugs to lymph, several polymeric particles have been designed and studied. the polymers are categorised into two types based on their origin: natural polymers such as dextran, alginate, chitosan, gelatin, pullulan and hyaluronan, and synthetic polymers such as plga, pla and pmma. dextran, a natural polysaccharide, has been used as a carrier for a range of drug molecules due to its outstanding biocompatibility. bhatnagar et al. synthesised cyclosporine a-loaded dextran acetate particles labelled with 99mtc; these particles gradually distributed cyclosporine a throughout the lymph nodes following subcutaneous injection into the footpad of rats [95]. a dextran-conjugated (average molecular weights of 10, 70 and 500 kda) lymphotropic delivery system for mitomycin c has been studied; after intramuscular injection in mice, the mitomycin c-dextran conjugates were retained in regional lymph nodes for nearly 48 h, while the free mitomycin was quickly cleared. hyaluronan, also called hyaluronic acid, is a natural biocompatible polymer that follows lymphatic drainage from the interstitial spaces. cai et al. demonstrated a novel intralymphatic drug delivery method by synthesising a cisplatin-hyaluronic acid conjugate for breast cancer treatment; following subcutaneous injection into the upper mammary fat pad of female rats, most of the carrier localised in the regional nodal tissue compared to the standard cisplatin formulation [96]. poly(lactide-co-glycolide), a synthetic polymer used to prepare biodegradable nanospheres, has been reported to deliver drugs and diagnostic agents to the lymphatic system.
similarly, nanospheres coated with block copolymers of poloxamers and poloxamines and radiolabelled with 111 in-oxine have been used to trace the nanoparticles in vivo. upon s.c. injection, the regional lymph node showed a maximum uptake of 17 % of the administered dose [ 97 ] . dunne et al. synthesised a conjugate of cis-diamminedichloroplatinum(ii) (cddp) and the block copolymer poly(ethylene oxide)-block-poly(lysine) (peo-b-plys) for treating lymph node metastasis. a single animal treatment with 10 wt.% cddp-polymer resulted in limited tumour growth in the draining lymph nodes and prevention of systemic metastasis [ 98 ] . johnston and coworkers designed a biodegradable intrapleural (ipl) implant of paclitaxel consisting of a gelatin sponge impregnated with poly(lactide-co-glycolide)-paclitaxel (plga-ptx) for targeting the thoracic lymphatics. in a rat model, this system exhibited lymphatic targeting capability and sustained drug release properties [ 99 ] . kumanohoso et al. designed a new drug delivery system for bleomycin by loading it into a small cylinder of biodegradable polylactic acid to target lesions. this system showed a significantly higher antitumour effect than bleomycin solution or no treatment [ 100 ] . to treat lesions, a new biodegradable colloidal particulate-based nanocarrier system was designed to target the thoracic lymphatics and lymph nodes. various nano- and microparticles of charcoal, polystyrene and poly(lactide-co-glycolide) were studied for lymphatic distribution after intrapleural implantation in rats, and lymphatic uptake was observed within 3 h of intrapleural injection [ 101 ] . kobayashi et al. utilised dendrimer-based contrast agents for dynamic magnetic resonance lymphangiography [ 102 ] . gadolinium (gd)-containing dendrimers of different sizes and molecular structures (pamam-g8, pamam-g4 and dab-g5; pamam, polyamidoamine; dab, diaminobutyl) were used as contrast agents. 
size and molecular structure play a great role in the distribution and pharmacokinetics of dendrimers. for example, pamam-g8, when injected intravenously, had a comparatively long life in the circulatory system with minimal leakage out of the vessels, whereas pamam-g4 was cleared rapidly from the systemic circulation owing to rapid renal clearance but persisted in the lymphatic circulation. the smaller dab-g5 showed greater accumulation and retention in lymph nodes, useful for lymph node imaging using mr lymphangiography. gadomer-17 and gd-dtpa-dimeglumine (magnevist) were evaluated as controls. imaging experiments revealed that all of the reagents were able to visualise the deep lymphatic system except gd-dtpa-dimeglumine. pamam-g8 and dab-g5 were used to visualise the lymphatic vessels and the lymph nodes, respectively. while pamam-g4 provided good contrast of both the nodes and connecting vessels, gadomer-17 was able to visualise lymph nodes, though not as clearly as the gd-based dendrimers. kobayashi also delivered various gd-pamam (pamam-g2, pamam-g4, pamam-g6, pamam-g8) and dab-g5 dendrimers to the sentinel lymph nodes and evaluated their visualisation against other nodes. the g6 dendrimer provided excellent opacification of sentinel lymph nodes and was absorbed and retained in the lymphatic system [ 103 ] . using a combination of mri and fluorescence with pamam-g6-gd-cy, the sentinel nodes were observed even more clearly, signifying the potential of dendrimers as a platform for dual imaging. kobayashi et al. further overcame the sensitivity and depth limitations of each individual method by the simultaneous use of two modalities (radionuclide and optical imaging). using pamam-g6 dendrimers conjugated with near-infrared (nir) dyes and an 111 in radionuclide probe, multimodal nanoprobes were developed for radionuclide and multicolour optical lymphatic imaging [ 104 , 105 ] . 
later, kobayashi also proposed the use of quantum dots for labelling cancer cells and dendrimer-based optical agents for visualising lymphatic drainage and identifying sentinel lymph nodes [ 106 ] . polylysine dendrimers have been used to best effect for targeting the lymphatic system and lymph nodes. carbon nanotubes (cnt) possess various mechanochemical properties, such as high surface area, mechanical strength and thermal and chemical stability, which make them versatile carriers for drugs, proteins, radiologicals and peptides to target tumour tissues. hydrophilic multiwalled carbon nanotubes (mwnts) coated with magnetic nanoparticles (mn-mwnts) have emerged as an effective delivery system for lymphatic targeting: following subcutaneous injection of these particles into the left footpad of sprague dawley rats, the left popliteal lymph nodes were dyed black. mn-mwnts were preferentially absorbed by the lymphatic vessels and transferred into the lymph nodes, and no uptake was seen in the chief internal organs such as the liver, spleen, kidney, heart and lungs. gemcitabine loaded into these particles was evaluated for its lymphatic delivery efficiency, and mn-mwnts-gemcitabine displayed the maximum concentration of gemcitabine in the lymph nodes [ 107 ] . mcdevitt et al. synthesised tumour-targeting water-soluble cnt constructs by covalent attachment of monoclonal antibodies such as rituximab and lintuzumab, using 1,4,7,10-tetraazacyclododecane-1,4,7,10-tetraacetic acid (dota) as a metal ion chelator and fluorescein as the fluorescent probe. in in vivo trials, cnt-([ 111 in]dota)(rituximab) explicitly targeted a disseminated human lymphoma compared to the controls cnt-([ 111 in]dota)(lintuzumab) and [ 111 in]rituximab [ 108 ] . tsuchida and coworkers evaluated the drug delivery efficiency of water-dispersed carbon nanohorns in a non-small cell lung cancer model. 
polyethylene glycol (peg)-doxorubicin conjugate bound to oxidised single-wall carbon nanohorns (oxswnhs), injected intratumourally into mice bearing human non-small cell lung cancer (nci-h460), caused a significant retardation of tumour growth. histological analyses showed that migration of oxswnhs to the axillary lymph node, a major site of breast cancer metastasis near the tumour, occurred, probably by means of interstitial lymphatic fluid transport [ 109 ] . shimada et al. described a silica particle-based lymphatic drug delivery system for bleomycin and compared its therapeutic efficacy with that of free bleomycin solution in a transplanted tumour model in animals. silica particle-adsorbed bleomycin showed a considerable inhibitory effect on tumour growth and lymph node metastasis compared to free bleomycin solution [ 110 ] . activated carbon particles of aclarubicin have been used for adsorption and sustained release into lymph nodes. upon subcutaneous administration into the fore footpads of rats, these particles showed significantly elevated distribution of aclarubicin to the axillary lymph nodes compared to an aqueous solution of the drug [ 111 ] . activated carbon particles of aclacinomycin a, adriamycin, mitomycin c and pepleomycin have also been used by another group for adsorption; a higher drug concentration was maintained with the new dosage form than with the solution form [ 112 ] . antibody-drug conjugates enhance the cytotoxic activity of anticancer drugs by conjugating them with antibodies. antibodies conjugated with cytostatic drugs such as calicheamicin have been used for the treatment of various lymphomas, including non-hodgkin b-cell lymphoma (nhl), follicular lymphoma (fl) and diffuse large b-cell lymphoma (dlbcl) [ 113 -116 ] . the cd20 b-cell marker is expressed on the surface membrane of pre-b-lymphocytes and mature b-lymphocytes. 
the anti-cd20 mab rituximab (rituxan) is currently the most promising antibody for the treatment of non-hodgkin b-cell lymphomas (b-nhl) [ 117 ] . rituximab-conjugated calicheamicin elevated the antitumour activity of rituximab against human b-cell lymphoma (bcl) xenografts in preclinical models [ 118 ] . cd22 is a b-lymphoid lineage-specific differentiation antigen expressed on the surface of both normal and malignant b-cells. hence, a cd22-specific antibody could be effective in delivering chemotherapeutic drugs to malignant b-cells. moreover, antibodies targeting cd22 (siglec-2) are suited for a trojan horse strategy: antibody-conjugated therapeutic agents bind to the siglec and are carried efficiently into the cell [ 119 ] . much interest has been seen in the clinical progress of conjugated anti-cd22 antibodies, especially inotuzumab ozogamicin (cmc-544) [ 120 ] . cd30 is expressed in the malignant hodgkin and reed-sternberg cells of classical hodgkin lymphoma (hl) and in anaplastic large-cell lymphoma. younes and bartlett reported an ongoing phase i dose-escalation trial in relapsed and refractory hl patients with sgn-35 (seattle genetics), a novel anti-cd30 antibody-monomethyl auristatin e conjugate. sgn-35 was stable in the blood and released the drug only upon internalisation into cd30-expressing tumour cells [ 121 ] . huang et al. constructed another antibody-drug conjugate (anti-her2/neu-igg3-ifnα) and examined its effect on a murine b-cell lymphoma, 38c13, expressing human her2/neu; it significantly inhibited 38c13/her2 tumour growth in vivo [ 122 ] . hybrid systems use a combination of two or more delivery forms for effective targeting. khatri et al. prepared and investigated the in vivo efficacy of plasmid dna-loaded chitosan nanoparticles for nasal mucosal immunisation against hepatitis b. 
chitosan-dna nanoparticles prepared by the coacervation process adhere to the nasal or gastrointestinal epithelia and are readily transported to the nasal-associated lymphoid tissue (nalt) and the peyer's patches of the gut-associated lymphoid tissue (galt), both iga inductive sites [ 123 ] ; there, the chitosan-dna may be taken up by m cells, transported across the mucosal boundary and thereby transfect immune cells within nalt or galt [ 124 ] . one work demonstrated targeting with three peptides containing sequences that bind to cell markers expressed in the tumour vasculature (p24-nrp-1 and p39-flt-1) [ 125 , 126 ] and tumour lymphatics (p47-lyp-1) [ 127 ] , which were tested for their ability to target liposomes containing 3(nitrilotriacetic acid)-ditetradecylamine (nta3-dtda) to subcutaneous b16-f1 tumours. significantly, a potent antitumour effect was seen after administration of doxorubicin-loaded peg750 liposomes engrafted with p24-nrp-1. hybrid liposomes (hl-25) composed of l-α-dimyristoylphosphatidylcholine and polyoxyethylene(25) dodecyl ether, prepared by sonication, were verified in vivo to produce a remarkable reduction of tumour volume in model mice of acute lymphatic leukaemia (all) treated intravenously with hl-25, without drugs, after subcutaneous inoculation of human all (molt-4) cells. prolonged survival (>400 %) was noted in the all model mice after treatment with hl-25 without drugs [ 128 ] . in one report, lyp-1 peptide-conjugated pegylated liposomes loaded with fluorescein or doxorubicin were prepared for targeting and treating lymphatic metastatic tumours. the in vitro cellular uptake and in vivo near-infrared fluorescence imaging results confirmed that the lyp-1-modified liposomes increased uptake by tumour cells and metastatic lymph nodes. in another study, the in vitro cellular uptake of lyp-1-conjugated peg-plga nanoparticles (lyp-1-nps) was about four times that of peg-plga nanoparticles without lyp-1 (nps). 
in the in vivo study, lymph node uptake of lyp-1-nps in metastases was about eight times that of nps, indicating lyp-1-nps as a promising carrier for target-specific drug delivery to lymphatic metastatic tumours [ 129 ] . currently, surgery, radiation therapy and chemotherapy are the principal methods of cancer treatment. gene therapies may act synergistically or additively with them. for example, one study demonstrated that replacement of the p53 (protein 53) gene in p53-deficient cancer cell lines enhanced the sensitivity of these cells to ad-p53 (adenovirus-expressed protein 53) and cisplatin (cddp) and resulted in greater tumour cell death [ 130 ] . later, son and huang [ 131 ] reported that treatment of cddp-resistant tumour cells with cddp increased the sensitivity of these cells to transduction by dna-carrying liposomes. also, chen et al. [ 132 ] described that herpes simplex virus thymidine kinase (hsv-tk) and interleukin (il) expression can be combined to improve tumour killing. on the whole, greater therapeutic effect can be achieved by effectively combining conventional cancer treatments with gene therapy. colloidal carriers in particular have emerged as potential agents for targeting the lymphatic system. physicochemical properties affect the efficiency of colloid uptake into the lymphatic system [ 28 ]; these properties include size, number of particles, surface charge, molecular weight and colloid lipophilicity. physicochemical properties can be altered by adsorption of hydrophilic polymers such as poloxamers and poloxamines onto the particle surface. such modifications change the biodistribution of particles in vivo, in particular enabling avoidance of the reticuloendothelial system (res) upon intravenous administration [ 133 , 134 ] . in one study, it was suggested that opsonisation may alter the particle surface in vivo [ 135 ] . size can be an important factor in defining the behaviour of particulates after subcutaneous injection. 
small particles with diameters less than a few nanometres are generally exchanged through the blood capillaries, whereas larger particles with diameters up to a few tens of nanometres are absorbed into the lymph capillaries; particles above a few hundred nanometres remain trapped in the interstitial space for a long time [ 136 ] . christy et al. demonstrated a relationship between colloid size and ease of drainage from the injection site using model polystyrene nanospheres after subcutaneous administration to the rat [ 137 ] . the results showed the distribution of polystyrene nanospheres in the size range 30-260 nm at 24 h after administration: 74-99 % of the recovered dose was retained at the administration site, and drainage became slower as particle diameter increased. it has been proposed that the optimum colloid size range for lymphoscintigraphic agents is 10-50 nm [ 138 ] . size is less important when colloids in the nanometre size range are administered intraperitoneally (i.p.), as drainage is only from a cavity into the initial lymphatics and no diffusion through the interstitial space is required [ 28 ]. the size limit of the open junctions of the initial lymphatic wall is the only barrier to uptake from the peritoneal cavity into the lymphatics [ 139 ] . a larger number of particles at the injection site decreases their rate of drainage, owing to increased obstruction of their diffusion through the interstitial space [ 139 , 140 ] . scientists at nottingham university investigated this effect using 60-nm polystyrene nanospheres. following administration to the rat, the concentration range of nanospheres was approximately 0.05-3.0 mg/ml; lower lymphatic uptake was seen on increasing the concentration of nanospheres in the injection volume, owing to slower drainage from the injection site. the effect of injection volume has been studied by injecting oily vehicles intramuscularly into the rat. 
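the size dependence described above can be summarised as a simple decision rule. the following is a minimal python sketch; the function name and the exact numeric cut-offs are illustrative assumptions paraphrasing the qualitative ranges in the text ("a few nanometres", "a few tens of nanometres", "a few hundred nanometres"), not values from the cited studies:

```python
def predominant_route(diameter_nm: float) -> str:
    """classify the likely fate of a subcutaneously injected particle by size.

    thresholds are illustrative: particles below a few nanometres exchange
    through blood capillaries, tens of nanometres are absorbed into lymph
    capillaries, and particles above a few hundred nanometres remain trapped
    in the interstitial space.
    """
    if diameter_nm < 10:        # "less than a few nanometres"
        return "blood capillaries"
    elif diameter_nm <= 100:    # "up to a few tens of nanometres"
        return "lymph capillaries"
    else:                       # "over a few hundred nanometres"
        return "interstitial space (trapped)"

print(predominant_route(5))    # blood capillaries
print(predominant_route(40))   # lymph capillaries
print(predominant_route(500))  # interstitial space (trapped)
```

note that the proposed 10-50 nm optimum for lymphoscintigraphic agents [ 138 ] falls inside the middle band of this sketch.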
increasing the volume of sesame oil accelerated oil transport into the lymphatic system. upon s.c. administration, volumes of aqueous polystyrene particle suspensions have been investigated in the range 50-150 μl [ 39 ]. surface charge studies have been carried out using liposomes as the colloidal carrier. the surface charge of liposomes affected their lymphatic uptake from s.c. and i.p. injection sites: negatively charged liposomes showed faster drainage than positive liposomes after i.p. administration [ 141 ] . patel et al. also indicated that liposome localisation in the lymph nodes followed the order negative > positive > neutral [ 142 ] . a macromolecule of high molecular weight has a decreased ability to exchange across blood capillaries, so lymphatic drainage becomes the route of clearance from the injection site; there is a linear relationship between the molecular weight of a macromolecule and the proportion of the dose absorbed by the lymphatics. for a compound to be absorbed by the lymphatics, the molecular weight should range between 1,000 and 16,000 [ 141 , 143 ] . the effect of molecular weight becomes negligible when targeting carriers to the lymphatic system, as the molecular weight of a colloidal carrier is generally far greater than 1,000 da. the most important determinant of the phagocytic response, and hence of lymphatic uptake, is the lipophilicity of the colloid [ 144 ] . opsonins generally bind to lipophilic rather than hydrophilic surfaces; hence, hydrophilic particles show reduced phagocytosis [ 145 ] . hydrophobic polystyrene nanospheres coated by adsorption with hydrophilic block copolymers prior to i.v. administration showed a drastic reduction in phagocytosis [ 146 ] . for polystyrene nanospheres of 60-nm diameter, the peo chains of poloxamers and poloxamines adsorbed onto the particle surface determined the relationship between interstitial injection site drainage and lymph node uptake in the rat [ 144 ] . 
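the charge ranking and the molecular-weight window described above lend themselves to a compact encoding. this is an illustrative sketch only; the function and variable names are hypothetical, not from the cited studies:

```python
# lymph node localisation of liposomes by surface charge, highest first,
# per the ordering reported by patel et al. [142]
charge_localisation_order = ["negative", "positive", "neutral"]

def absorbed_via_lymphatics(molecular_weight_da: float) -> bool:
    """true if a water-soluble compound falls in the molecular-weight window
    (1,000-16,000 da [141, 143]) associated with absorption from the
    injection site predominantly via the lymphatics; below ~1,000 da,
    exchange across blood capillaries dominates instead."""
    return 1_000 <= molecular_weight_da <= 16_000

print(absorbed_via_lymphatics(5_000))  # True
print(absorbed_via_lymphatics(500))    # False
```

as the text notes, this window matters for soluble macromolecules; for colloidal carriers the molecular-weight effect is negligible.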
uncoated nanospheres of this diameter showed reduced drainage from the injection site, with 70 % of the administered dose remaining after 24 h. the adsorption of block copolymers can enhance drainage from the injection site, such that with very hydrophilic polymers such as poloxamine 908 the level remaining at the injection site may be as little as 16 % after 24 h. uptake of nanospheres into the regional lymph nodes may also be improved by the adsorption of block copolymers with intermediate lengths of polyoxyethylene, such as poloxamine 904; this polymer may cause up to 40 % of the given dose to be sequestered by the lymph nodes after 24 h [ 147 ] . surface modification may thus prove an effective strategy for targeting the lymphatic system; its influence can be summarised as follows. coating a carrier with a hydrophilic, sterically stabilising peg layer can successfully enhance lymphatic absorption by reducing specific interactions of the particle with the interstitial surroundings and inhibiting the formation of overly large particle structures [ 49 ] . surface modification of liposomes with peg, however, does not have a significant effect on lymph node uptake. small 86-nm peg-coated liposomes showed the greatest clearance from the s.c. injection site, with <40 % remaining at the injection site at 24 h, whereas larger neutral and negatively charged liposomes had >60 % remaining at the initial s.c. injection site. the smaller amount of large liposomes cleared from the injection site was, however, compensated by better retention in the lymph node [ 148 ] . oussoren et al. reported that the amount of liposomes cleared from the injection site was somewhat greater with peg-coated liposomes [ 149 ] . this improved clearance did not result in improved lymph node retention, because the fraction of peg liposomes retained by the lymph node was decreased. phillips et al. 
also observed the slightly improved clearance of peg-coated liposomes from the s.c. injection site [ 148 ] . porter and coworkers demonstrated that pegylation of poly-l-lysine dendrimers resulted in better absorption from s.c. injection sites and stated that the extent of lymphatic transport may be improved by increasing the size of the pegylated dendrimer complex. they estimated the lymphatic uptake and lymph node retention properties of several generation-four dendrimers coated with peg or 4-benzene sulphonate after subcutaneous administration in rats. for this surface modification study, three pegs with molecular weights of 200, 570 or 2,000 da were used. peg200-derived dendrimers showed rapid and complete absorption into the blood when injected subcutaneously, with only 3 % of the total given dose found in the pooled thoracic lymph over 30 h, whereas peg570- and peg2000-derived dendrimers showed slower absorption, with a higher amount (29 %) recovered in the lymphatics over 30 h. the benzene sulphonate-capped dendrimer, however, was not well absorbed into either blood or lymph following subcutaneous injection [ 150 ] . carriers capped with nonspecific human antibodies as ligands showed greater lymphatic uptake and lymph node retention than uncoated ones at the s.c. site. liposomes coated with the antibody igg have been shown to increase lymph node localisation of liposomes to 4.5 % of the injected dose at 1 h, although this level decreased to 3 % by 24 h [ 151 ] . in one study, liposomes containing positively charged lipids had approximately 2-3 times the lymph node localisation (up to 3.6 % of the injected dose) of liposomes containing neutral or negatively charged lipids (1.2 % of the injected dose) [ 149 ] . attachment of mannose to the surface of a liposome increased lymph node uptake threefold compared to control liposomes [ 152 ] . 
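the pegylated-dendrimer results above can be tabulated as a small lookup. the numbers are those quoted in the text; the variable name and the grouping of the peg570 and peg2000 caps under one key (the text reports a single 29 % figure for both) are assumptions for illustration:

```python
# fraction of the administered dose recovered in pooled thoracic lymph over
# 30 h for generation-four poly-l-lysine dendrimers with different surface
# caps, per the values quoted in the text; None marks the capping group
# reported as poorly absorbed into either blood or lymph.
lymph_recovery_30h = {
    "peg200": 0.03,              # rapid, complete absorption into blood
    "peg570_or_peg2000": 0.29,   # slower absorption, more lymphatic transport
    "benzene_sulphonate": None,  # not well absorbed into blood or lymph
}

# the trend: heavier peg caps shift the absorbed dose from blood towards lymph
print(lymph_recovery_30h["peg570_or_peg2000"] > lymph_recovery_30h["peg200"])
```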
another study examined dried liposomes entrapping hbsag with their surfaces modified with galactose. a pharmacokinetic study in rats showed that the galactosylated liposomes delivered higher amounts of hbsag to the regional lymph nodes than the ungalactosylated formulations [ 153 ] . lectins are further ligands that can be attached to carriers for improved targeting to the intestinal lymphatics. bovine serum albumin particles containing acid phosphatase as a model protein, and polystyrene microspheres, were conjugated with the mouse m-cell-specific ulex europaeus lectin. ex vivo results showed preferential binding of the lectin-conjugated microspheres to the follicle-associated epithelium. the final results indicated that coupling ligands such as lectins specific to cells of the follicle-associated epithelium can improve the targeting of encapsulated candidate antigens to the peyer's patches of the intestine for better oral delivery [ 154 ] . to improve carrier retention in lymph nodes, a new method of increasing the lymphatic uptake of subcutaneously injected liposomes utilises the high-affinity ligands biotin and avidin. biotin is a naturally occurring cofactor and avidin is a protein derived from eggs; the two have an extremely high affinity for each other. upon injection, the avidin and the biotin liposomes move into the lymphatic vessels; biotin liposomes migrating through the lymphatic vessels meet the avidin, resulting in an aggregate that becomes trapped in the lymph nodes [ 155 , 156 ] . the biotin liposome/avidin system has promising potential for therapeutic delivery to lymph nodes; it can be applied not only to s.c. targeting of lymph nodes but also to intracavitary lymph node targeting [ 50 ] . different ligands with their applications in lymphatic targeting are represented in table 20.1 . 
the lymphatics have the potential to play a major role in anticancer treatment, as lymphatic spread is recognised to precede haematological spread in many cancers, including melanoma, breast, colon, lung and prostate cancers. the current focus is on the development of drug carriers that can localise chemotherapy to the lymphatic system, improving the treatment of localised disease while minimising the exposure of healthy organs to cytotoxic drugs. the delivery of novel carriers to lymph nodes for therapeutic purposes holds much promise. given the importance of the lymphatic route in metastasis, such delivery systems may have great potential for targeted delivery of various therapeutic agents to tumours and their metastatic lymph nodes. various delivery systems have been discussed here, but colloidal carriers, especially liposomes, have been the carriers of choice to date. the purpose of this review is to point towards an improved and effective lymphotropic system of satisfactory quality for clinical use and a preparation method applicable to industrial production. surface-engineered lymphotropic systems may prove effective carriers for anti-hiv, anticancer and oral vaccine delivery in the near future. table 20.1 lists different ligands with their applications in lymphatic targeting; among its entries:
3. delivery of antigens to gut-associated lymphoid tissue (galt): intestinal delivery [ 154 ]
4. microparticles: active targeting of peripheral lymph nodes; doppler ultrasonography contrast agent [ 159 ]
5. lymph vaccine delivery [ 148 -150 ]
13. block copolymer of poloxamine and poloxamer nanospheres: regional lymph nodes [ 144 ]
14. lyp-1 nanoparticles, liposomes: targeted to lymphatic vessels and to tumour cells within hypoxic areas; antitumour [ 129 ]
15. liposomes targeting lymph nodes: mediastinal lymph node targeting [ 155 ]
16.
liposome targeting to lymph node: increased lymph node retention [ 145 , 151 ]
references
- the physiology of the lymphatic system
- the anatomy and development of the jugular lymph sacs in the domestic cat (felis domestica)
- on the origin of the lymphatic system from the veins and the development of the lymph hearts and thoracic duct in the pig
- dual origin of avian lymphatics
- lineage tracing demonstrates the venous origin of the mammalian lymphatic vasculature
- an essential role for prox1 in the induction of the lymphatic endothelial cell phenotype
- prox1 function is required for the development of the murine lymphatic system
- live imaging of lymphatic development in the zebrafish
- developmental and pathological lymphangiogenesis: from models to human disease
- tumor lymphangiogenesis and melanoma metastasis
- cardiovascular physiology
- new insights into the molecular control of the lymphatic vascular system and its role in disease
- advanced colloid-based systems for efficient delivery of drugs and diagnostic agents to the lymphatic tissues
- the structure of lymphatic capillaries in lymph formation
- specific adhesion molecules bind anchoring filaments and endothelial cells in human skin initial lymphatics
- focal adhesion molecules expression and fibrillin deposition by lymphatic and blood vessel endothelial cells in culture
- the second valve system in lymphatics
- evidence for a second valve system in lymphatics: endothelial microvalves
- ultrastructural studies on the lymphatic anchoring filaments
- new horizons for imaging lymphatic function
- lymphatic smooth muscle: the motor unit of lymph drainage
- the fine structure and functioning of tissue channels and lymphatics
- clinically oriented anatomy
- lymphangiogenesis in development and human disease
- acyclic nucleoside phosphonate analogs delivered in ph-sensitive liposomes
- liposomes for drug targeting in the lymphatic system
- liposomes to target the lymphatics by subcutaneous administration
- novel method of greatly enhanced delivery of liposomes to lymph nodes
- current concepts in lymph node imaging
- old friends, new ways: revisiting extended lymphadenectomy and neoadjuvant chemotherapy to improve outcomes
- targeted delivery of indinavir to hiv-1 primary reservoirs with immunoliposomes
- studies on lymphoid tissue from hiv-infected individuals: implications for the design of therapeutic strategies
- lymphoid tissue targeting of anti-hiv drugs using liposomes
- a randomized clinical trial comparing single- and multi-dose combination therapy with diethylcarbamazine and albendazole for treatment of bancroftian filariasis
- infección bacteriana por ántrax [bacterial infection by anthrax]
- bioterrorism-related inhalational anthrax: the first 10 cases reported in the united states
- fatal inhalational anthrax in a 94-year-old connecticut woman
- extrapulmonary tuberculosis: clinical and epidemiologic spectrum of 636 cases
- nanoparticles for drug delivery in cancer treatment
- polymeric drugs for efficient tumor-targeted drug delivery based on epr-effect
- exploiting the enhanced permeability and retention effect for tumor targeting
- drug targeting and tumor heterogeneity
- does a targeting ligand influence nanoparticle tumor localization or uptake?
- high affinity restricts the localization and tumor penetration of single-chain fv antibody molecules
- vcam-1 directed immunoliposomes selectively target tumor vasculature in vivo
- lymphatic targeting with nanoparticulate system
- a lymphotropic colloidal carrier system for diethylcarbamazine: preparation and performance evaluation
- evaluation of endoscopic pirarubicin-lipiodol emulsion injection therapy for gastric cancer
- targeted lymphatic transport and modified systemic distribution of ci-976, a lipophilic lipid-regulator drug, via a formulation approach
- self-emulsifying drug delivery systems (sedds) of coenzyme q10: formulation development and bioavailability assessment
- liposomes as vehicles for the local release of drugs
- enhanced oral bioavailability and intestinal lymphatic transport of a hydrophilic drug using liposomes
- delivery of liposomal doxorubicin (doxil) in a breast cancer tumor model: investigation of potential enhancement by pulsed-high intensity focused ultrasound exposure
- reduced cardiotoxicity and comparable efficacy in a phase iii trial of pegylated liposomal doxorubicin hcl (caelyx™/doxil®) versus conventional doxorubicin for first-line treatment of metastatic breast cancer
- doxil offers hope to ks sufferers
- liposomal doxorubicin (doxil): in vitro stability, pharmacokinetics, imaging and biodistribution in a head and neck squamous cell carcinoma xenograft model
- caelyx/doxil for the treatment of metastatic ovarian and breast cancer
- patent blue v encapsulation in liposomes: potential applicability to endolymphatic therapy and preoperative chromolymphography
- aerosol delivery of liposomal formulated paclitaxel and vitamin e analog reduces murine mammary tumor burden and metastases
- novel vitamin e analogue and 9-nitro-camptothecin administered as liposome aerosols decrease syngeneic mouse mammary tumor burden and inhibit metastasis
- use of liposome preparation to treat mycobacterial infections
- nanoradioliposomes molecularly modulated to study the lung deep lymphatic drainage
- solid lipid nanoparticles in lymph and plasma after duodenal administration to rats
- duodenal administration of solid lipid nanoparticles loaded with different percentages of tobramycin
- transmucosal transport of tobramycin incorporated in sln after duodenal administration to rats. part i: a pharmacokinetic study
- pharmacokinetics and tissue distribution of idarubicin-loaded solid lipid nanoparticles after duodenal administration to rats
- influence of administration route on tumor uptake and biodistribution of etoposide loaded solid lipid nanoparticles in dalton's lymphoma tumor bearing mice
- metastatic patterns in small-cell lung cancer: correlation of autopsy findings with clinical parameters in 537 patients
- metastatic pattern in non-resectable non-small cell lung cancer
- solid lipid nanoparticles (sln) for controlled drug delivery: a review of the state of the art
- lymphatic uptake of pulmonary delivered radiolabelled solid lipid nanoparticles
- inflammation imaging using tc-99m dextran
- intralymphatic chemotherapy using a hyaluronan-cisplatin conjugate
- lymph node localisation of biodegradable nanospheres surface modified with poloxamer and poloxamine block co-polymers
- block copolymer carrier systems for translymphatic chemotherapy of lymph node metastases
- translymphatic chemotherapy by intrapleural placement of gelatin sponge containing biodegradable paclitaxel colloids controls lymphatic metastasis in lung cancer
- enhancement of therapeutic efficacy of bleomycin by incorporation into biodegradable poly-d,l-lactic acid
- targeting colloidal particulates to thoracic lymph nodes
- comparison of dendrimer-based macromolecular contrast agents for dynamic micro-magnetic resonance lymphangiography
- delivery of gadolinium-labeled nanoparticles to the sentinel lymph node: comparison of the sentinel node visualization and estimations of intra-nodal gadolinium concentration by the magnetic resonance imaging
- multimodal nanoprobes for radionuclide and five-color near-infrared optical lymphatic imaging
- a dendrimer-based nanosized contrast agent dual-labeled for magnetic resonance and optical fluorescence imaging to localize the sentinel lymph node in mice
- multicolor imaging of lymphatic function with two nanomaterials: quantum dot-labeled cancer cells and dendrimer-based optical agents
- hydrophilic multi-walled carbon nanotubes decorated with magnetite nanoparticles as lymphatic targeted drug delivery vehicles
- tumor targeting with antibody-functionalized, radiolabeled carbon nanotubes
- water-dispersed single-wall carbon nanohorns as drug carriers for local cancer chemotherapy
- enhanced efficacy of bleomycin adsorbed on silica particles against lymph node metastasis derived from a transplanted tumor
- selective distribution of aclarubicin to regional lymph nodes with a new dosage form: aclarubicin adsorbed on activated carbon particles
- carbon dye as an adjunct to isosulfan blue dye for sentinel lymph node dissection
- safety, pharmacokinetics, and preliminary clinical activity of inotuzumab ozogamicin, a novel immunoconjugate for the treatment of b-cell non-hodgkin's lymphoma: results of a phase i study
- therapeutic potential of cd22-specific antibody-targeted chemotherapy using inotuzumab ozogamicin (cmc-544) for the treatment of acute lymphoblastic leukemia
- antibody-targeted chemotherapy with cmc-544: a cd22-targeted immunoconjugate of calicheamicin for the treatment of b-lymphoid malignancies
- preclinical anti-tumor activity of antibody-targeted chemotherapy with cmc-544 (inotuzumab ozogamicin), a cd22-specific immunoconjugate of calicheamicin, compared with non-targeted combination chemotherapy with cvp or chop
- rituximab (rituxan®/mabthera®): the first decade
- cd20-specific antibody-targeted chemotherapy of non-hodgkin's b-cell lymphoma using calicheamicin-conjugated rituximab
- siglecs as targets for therapy in immune-cell-mediated disease
- clinical activity of the immunoconjugate cmc-544 in b-cell malignancies: preliminary report of the expanded maximum tolerated dose (mtd) cohort of a phase 1 study
- objective responses in a phase i dose-escalation study of sgn-35, a novel antibody-drug conjugate (adc) targeting cd30, in patients with relapsed or refractory hodgkin lymphoma
- targeted delivery of interferon-alpha via fusion to anti-cd20 results in potent antitumor activity against b-cell lymphoma
- chitosan for mucosal vaccination
- polysaccharide colloidal particles as delivery systems for macromolecules
- suppression of tumor growth and metastasis by a vegfr-1 antagonizing peptide identified from a phage display library
- antiangiogenic and antitumor activities of peptide inhibiting the vascular endothelial growth factor binding to neuropilin-1
- a tumor-homing peptide with a targeting specificity related to lymphatic vessels
- chemotherapy with hybrid liposomes for acute lymphatic leukemia leading to apoptosis in vivo
- lyp-1-conjugated nanoparticles for targeting drug delivery to lymphatic metastatic tumors
- successful treatment of primary and disseminated human lung cancers by systemic delivery of tumor suppressor genes using an improved liposome vector
- exposure of human ovarian carcinoma to cisplatin transiently sensitizes the tumor cells for liposome-mediated gene transfer
- combination gene therapy for liver metastasis of colon carcinoma in vivo
- polymeric microspheres as drug carriers
- the organ distribution and circulation time of intravenously injected colloidal carriers sterically stabilized with a block copolymer poloxamine 908
- fate of liposomes in vivo: a brief introductory review
- the characterisation of radio colloids used for administration to the lymphatic system
- effect of size on the lymphatic uptake of a model colloid system
- radiolabeled colloids and macromolecules in the lymphatic system
- electron microscopic studies on the peritoneal resorption of intraperitoneally injected latex particles via the diaphragmatic lymphatics
- lymphatic transport of liposome-encapsulated drugs following intraperitoneal administration: effect of lipid composition
- assessment of the potential uses of liposomes for lymphoscintigraphy and lymphatic drug delivery
- failure of 99-technetium marker to represent intact liposomes in lymph nodes
- effect of molecular weight on the lymphatic absorption of water-soluble compounds following subcutaneous administration
- surface engineered nanospheres with enhanced drainage into lymphatics and uptake by macrophages of the regional lymph nodes
- serum opsonins and liposomes: their interaction and opsonophagocytosis
- physicochemical principles of pharmacy
- targeting of colloids to lymph nodes: influence of lymphatic physiology and colloidal characteristics
- evaluation of [(99m)tc] liposomes as lymphoscintigraphic agents: comparison with [(99m)tc] sulfur colloid and [(99m)tc] human serum albumin
- lymphatic uptake and biodistribution of liposomes after subcutaneous injection: iii. influence of surface modification with poly(ethyleneglycol)
- pegylation of polylysine dendrimers improves absorption and lymphatic targeting following sc administration in rats
- lymph node localization of non-specific antibody-coated liposomes
- modified in vivo behavior of liposomes containing synthetic glycolipids
- enhanced lymph node delivery and immunogenicity of hepatitis b surface antigen entrapped in galactosylated liposomes
- targeted delivery of antigens to the gut-associated lymphoid tissues: 2. 
ex vivo evaluation of lectinlabelled albumin microspheres for targeted delivery of antigens to the m-cells of the peyer's patches avidin/biotin-liposome system injected in the pleural space for drug delivery to mediastinal lymph nodes pharmacokinetics and biodistribution of 111 in avidin and 99 tc biotin-liposomes injected in the pleural space for the targeting of mediastinal nodes folate-peg-ckk 2-dtpa, a potential carrier for lymph-metastasized tumor targeting nanotechnology in cancer therapeutics: bioconjugated nanoparticles for drug delivery molecular targeting of lymph nodes with l-selectin ligand-specifi c us contrast agent: a feasibility study in mice and dogs hyaluronan in drug delivery lymphatic targeting of zidovudine using surfaceengineered liposomes alginate/chitosan microparticles for tamoxifen delivery to the lymphatic system homing of negatively charged albumins to the lymphatic system: general implications for drug targeting to peripheral tissues and viral reservoirs key: cord-285350-64mzmiv3 authors: bhagatkar, nikita; dolas, kapil; ghosh, r. k.; das, sajal k. title: an integrated p2p framework for e-learning date: 2020-06-29 journal: peer peer netw appl doi: 10.1007/s12083-020-00919-0 sha: doc_id: 285350 cord_uid: 64mzmiv3 the focus of this paper is to design and develop a peer-to-peer presentation system (p2p-ps) that supports e-learning through live media streaming coupled with a p2p shared whiteboard. the participants use the “ask doubt” feature to raise and resolve doubts during a session of ongoing presentation. the proposed p2p-ps system preserves causality between ask doubt and its resolution while disseminating them to all the participants. a buffered approach is employed to enhance the performance of p2p shared whiteboard, which may be used either in tandem with live media streaming or in standalone mode. 
the proposed system further extends p2p interactions on stored contents (files) built on top of a p2p file sharing and searching module with additional features. the added features allow the creation of mash-up presentations with annotations, posts, comments on audio, video, and pdf files as well as a discussion forum. we have implemented the p2p file sharing and searching system on the de bruijn graph-based overlay for low latency. extensive experiments were carried out on emulab to validate the p2p-ps system using 200 physical nodes. the effectiveness of an e-learning system can be judged by the form and opportunities for interactions it offers to the peers for the assimilation of both live and stored contents. the goal of this paper is to build a platform that facilitates meaningful peer-to-peer (p2p) collaborations between the presenter and the audience. existing collaborative e-learning systems fall into three categories (fig. 1): 1. collaborative creation of forms, reports or documents (e.g., overleaf [7] and google drive [6]). 2. collaborative development of large computer programs (e.g., github [5]). 3. peer-to-peer tutoring/reciprocal learning by teaching (e.g., duolingo [53], coursera [2], kahoot [3], brainly [1], and zoom [13, 15]). social networking inspired the first two categories of e-learning systems. these systems combine online access through web-based interfaces that can mimic a live presentation or classroom lectures with a facebook-like secure interaction environment between the presenter and the audience. the third category of systems focuses on the delivery of recorded video contents to learners, with features like tracking the progress of the learners. they include online tests, quizzes, and assignments that set up a two-way interaction between the learners and the teachers, and also connect with a learning management system (lms) like canvas [18] or moodle [25]. however, brainly [1] follows a slightly different approach.
it is predominantly a peer-to-peer learning tool that harnesses the crowdsourcing of like-minded students (peers) to combine problem-solving skills. it uses deep learning techniques based on historical data to predict particular requirements of a learner. thus, a majority of social network based e-learning platforms provide only passive discussion forums with variations like incorporating peer mentoring and progress tracking. many-to-many interaction is integral to active learning as it happens in a live classroom or a physical meeting of a group of participants. a video conferencing system such as skype [11] or hangout [26] allows one-to-many screen sharing but not many-to-many sharing. however, the primary purpose of such a system is communication, so issues like scaling and many-to-many interaction are not addressed in a video conferencing system. for example, a group call in skype is restricted to only 25 participants. zoom [13] and webex [8], on the other hand, are developed as cloud-supported video meeting platforms with a multiple screen sharing feature. webex is a product of cisco inc. zoom can now be considered the leader of modern enterprise video communication. these video conferencing systems scale up well in simultaneous many-to-many sharing. the participants can also use a whiteboard for co-annotations during video meetings. however, the video sharing feature deteriorates with the increase in the number of users. both zoom and webex, being proprietary systems, require subscriptions. zoom's basic (free) version does not offer many of the sharing features and is limited to 40-minute meeting slots. webex does not have a free version. recording of proceedings is allowed but with certain applicable limits on size. furthermore, with an increase in demand the basic users are forced to switch to audio conferencing mode. neither zoom nor webex allows media annotation. with touch screens, document annotations are possible as inline images.
as is the case with all proprietary systems, the architecture and internal details of implementations of both zoom and webex are either too sketchy or not available. therefore, we are unable to make a meaningful comparison. the p2p-ps proposed in this paper is a lightweight system that provides all the features zoom or webex can offer. it does not require any server or cloud support for operation. additionally, it allows one to create presentations and p2p discussions on the stored material. co-annotations are possible during live video sessions. annotations on stored material are linked as metadata. one can annotate either media or documents. for annotations on a media clip in a file, a user marks out both the start and end time and puts in the text of the annotation. such annotations are not possible on either zoom or webex. any user may comment on annotated material, or set up a live video chat with other peers sharing a whiteboard alongside. the system can thus provide breakout sessions of groups for brainstorming on a variety of stored content which aid reflective learning. it will, therefore, cater to the needs of a small group of learners and educators typical of a university campus. in the current situation of the pandemic outbreak of covid-19, our p2p-ps may, in fact, be an ideal complementary system for offloading pressure from zoom or webex. to organize the stored contents, we used an efficient implementation of a distributed hash table (dht) based on de bruijn graphs. a summary of features of the proposed p2p-ps system is as follows: -it deploys a modified mesh-based architecture to leverage the spare capacities at the peers for flow control during the streaming of live media. the first peer who starts a streaming session is referred to as the 'speaker' while other peers joining later are 'listeners'. -it allows a listener to initiate a query (e.g., seek clarifications) during live streaming using the "ask doubt" feature.
the speaker alone may enable the dissemination of queries for maintaining the causality relation between a query and its corresponding explanation. -all interacting peers (the speaker and listeners) may use p2p shared whiteboard using shapes, colors, and free-pen to illustrate points or seek clarifications. the whiteboard may be used in conjunction with live streaming. -it incorporates an efficient dht-based sharing and searching of media and document files, which is implemented using de bruijn graph-based overlays. -it allows tagging of stored material (both media and documents) for annotations, posts, comments, and announcements, which are stored separately to preserve the file mappings in dht. additionally, a gossip protocol is used for fast synchronization of posts, comments, and announcements. -it facilitates the creation of a mash-up presentation on the stored contents, which may optionally be accompanied by a p2p shared whiteboard. we evaluated the proposed p2p-ps with the help of emulab [49] , a free testbed for emulation of network and distributed applications. we used 200 physical nodes on emulab to carry out experiments. these physical nodes are container nodes on which we installed the p2p-ps software for emulation. we carried out separate experiments for streaming and stored contents. experimental results not only establish the proof of concept but also indicate that interactive p2p-ps can become a useful e-learning tool for remote classroom teaching as well as reflective learning through p2p interactions on the stored contents. the rest of the paper is organized as follows. section 2 gives an overview of the system architecture. section 3 deals with live streaming using a modified mesh organization, while section 4 focuses on the p2p shared whiteboard. section 5 describes p2p-ps for stored contents. the design and implementation using de bruijn graphbased dht is discussed in section 6. section 8 presents annotations and discussion forum. 
it gives a comprehensive description of annotations of media and document files, posting of annotations, comments, and announcements. emulation-based experimental results are discussed in section 9. existing literature on p2p research that motivated our work is summarized in section 10. finally, section 11 concludes the paper with directions for future research. figure 2 depicts a component-level view of the p2p-ps system architecture. live streaming uses a modified mesh architecture that exploits spare capacity at peers and adjusts both inflow and outflow rates for dynamic fanout at peers. the streaming may optionally be accompanied by an online p2p shared whiteboard, or a pdf file, or both. a shared whiteboard may also be used in standalone mode as a scratchpad to brainstorm ideas using shapes, colors, and free-pen through p2p interactions. a listener may interrupt the speaker during a presentation session and send a query (seek clarifications) as in a live physical lecture. the proposed system maintains the causality of a listener's query and the speaker's explanation by ensuring that the dissemination of queries occurs only after the speaker enables them. stored streaming media may also be replayed as a presentation, optionally accompanied by stored pdf (document) files, or the p2p shared whiteboard, or both. the following features facilitate the interactions on stored media contents: (i) tagging parts of the media and pdf files, (ii) annotations of tagged portions, (iii) posting of annotated parts, (iv) comments on annotated parts, (v) announcements of the tagged parts, and (vi) a discussion forum. we use a de bruijn graph-based overlay for implementing the distributed hash table (dht) organization for stored materials.
there are two approaches for live media streaming on a p2p system [32]: -tree-based: it maintains a logical tree overlay where the root generates the stream, and other peers receive it from their respective parents. the leaves are free-riders. -mesh-based: a mesh architecture allows each peer to connect to other peers. it also lets a peer combine sub-streams received from more than one peer. a tree-based media streaming approach is inadequate for our purpose. first, when an internal node leaves, the entire subtree below it is orphaned and gets partitioned from the network. second, a tree cannot handle the dynamicity of peers joining and leaving the network. therefore, we used a variation of mesh-based media streaming. in the mesh-based p2p system, the nodes randomly connect, thus forming a mesh. these connections could be used to deliver data either unidirectionally or bidirectionally. for example, coolstreaming [56] maintains bidirectional connections, whereas prime [34] maintains unidirectional connections. the mesh-based approach uses a swarm-type content delivery strategy [50] as in bittorrent [35]. after receiving data from a server, a peer may act as a server for other peers. a peer collects data from other peers in parallel and combines it in a single file, thereby efficiently utilizing the bandwidth of its neighboring peers. it also reduces the load on the primary server because many peers now share the content distribution load. the modified mesh-based approach exploits the availability of spare capacity at a peer. the upload bandwidth and streaming rate together determine the fanout value. the feeder node to a fanout cannot support a faster outflow rate than its inflow rate. it essentially forms a logical gradient relationship between the in-degree and the out-degree of a node for receiving the data packets continuously.
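The relation between upload bandwidth, streaming rate, and fanout described above can be sketched as follows. This is a minimal illustration, not code from the paper: the class name, the integer-division fanout rule, and the `on_adopt_me` method are assumptions made for the sketch.

```python
class PeerCapacity:
    """Sketch: a peer's fanout (maximum out-degree) is bounded by how many
    full-rate sub-streams its upload bandwidth can sustain."""

    def __init__(self, upload_bps, stream_bps):
        # number of children the peer can feed at full streaming rate
        self.fanout = upload_bps // stream_bps
        self.children = set()

    def on_adopt_me(self, child_id):
        """Accept an 'adopt-me' request only while spare capacity remains."""
        if len(self.children) < self.fanout:
            self.children.add(child_id)
            return True
        return False
```

With a 3 Mbps uplink and a 1 Mbps stream, such a peer would accept at most three children and discard further "adopt-me" requests.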
in the following description, the terms "parent" and "children" will respectively refer to a node's neighbors on an inflow and an outflow path. during a media streaming session, all the nodes except for the ones connected directly to the source must maintain the relationship 'in-degree ≤ out-degree.' the gradient structure alone does not preclude the relationship 'in-degree > out-degree,' i.e., a node having a higher number of inflows than outflows, implying more parents than children. however, we discard such a case because if a node receives more packets than it could send out, then eventually it will lead to a buffer overflow problem, leading to the loss of multiple packets and re-transmissions. the media contents are divided into packet-sized chunks for streaming. the packets are propagated to the peers, who assemble the received packets into a media file. in a live streaming session, the speaker is, typically, the source node which starts streaming the media contents to its neighboring peers. the nodes other than the source are referred to as listeners. being a multi-parent, multi-child architecture, a peer could ask for packets from its parent peers and deliver the received packets to its children. after authentication, the root (or the source) may start streaming data, but it does not send to any of its children (or listeners) unless the latter explicitly asked for the same. figure 3a shows a partial view of a random mesh structure created during execution of the streaming process. when a new peer p_new joins the system, it gets a list of nodes from a known bootstrap server as indicated in fig. 3b. after attaching itself to the first parent, a node asks its parents for the latest (current) streaming packet id. p_new generates the initial requests for packets according to algorithm 1. it essentially creates a gradient overlay network on top of a mesh network. suppose p_new received its latest packet information from the parent p_0.
p_new will then send requests for id_0 from p_0, id_1 from p_1, id_2 from p_2, and in general, id_k from p_k, where id_i = id_{i-1} + 1 for 1 ≤ i ≤ k. for example, let p_new get the initial reply from p_3, the next from p_1, and the next from p_k; then p_new requests id_{k+1} from p_3, id_{k+2} from p_1, id_{k+3} from p_k, and so on. figure 3b explains the strategy for subsequent pull requests for packets, while the procedure is described by algorithm 2. so, there is no need for a separate scheduler process. each peer issues a request for the latest packet in a way akin to a reservation system. the peer joining process has been explained in algorithm 1, except for the authentication process. initially, the root is responsible for generating the streaming content. it starts an active session on the mesh network. an ordinary peer or a listener joins by using the session key. the bootstrap server works also as the authentication server. a new peer, after authenticating itself, submits the session key to join the streaming session. the bootstrap server adds the requesting peer to the active peer list after successful authentication (fig. 4). the bootstrap server maintains a list of n active peers having spare capacity to facilitate the peer joining process. as indicated by algorithm 1, only a list of log_2 n out of the n active peers is provided to a new peer p_new to extend the gradient overlay graph. the reason for this is two-fold. firstly, it ensures that the new peer can connect to the source by a low hop distance, which improves the start-up time for the new peer. secondly, if any peer leaves the network, alternative peers will be available to serve the new peer. the peer p_new first determines which of the peers in the received parent-list could respond to the pull requests. for that, p_new sends an "adopt-me" request to all the peers in the parent-list.
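The reservation-style request assignment of algorithms 1 and 2 described above can be sketched as follows. This is an illustrative reading of the text, not the paper's code: the class and method names are hypothetical, and the sketch only models which packet id is reserved for which parent.

```python
class PullScheduler:
    """Sketch of the reservation-style pull scheduling: packet ids are
    handed out in order, one per request, first round-robin over the
    parents and then to whichever parent replied most recently."""

    def __init__(self, parents, latest_id):
        # initial round: id_i is requested from parent p_i
        self.initial = [(p, latest_id + i) for i, p in enumerate(parents)]
        self.next_id = latest_id + len(parents)

    def initial_requests(self):
        return list(self.initial)

    def on_reply(self, parent):
        """A parent delivered its packet; reserve the next id for it."""
        req = (parent, self.next_id)
        self.next_id += 1
        return req
```

For parents p_0..p_3 and latest id 100, the initial requests are (p_0, 100), (p_1, 101), (p_2, 102), (p_3, 103); replies from p_3 and then p_1 reserve ids 104 and 105 for them, so no separate scheduler process is needed.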
a peer could accept or discard the request depending on its fanout value (out-degree), determined by the upload bandwidth and the streaming rate. initially, in-degree = out-degree for all nodes except for the one directly connected to the source node. hence, if the number of parents (in-degree) of any node becomes less than its out-degree, then it requests the bootstrap server to send (in-degree − current parent count) number of active peers. the bootstrap server randomly selects a set of the required number of active peers from the entire active peer list and returns the same to the requesting node. when a node quits gracefully, it proactively informs the other children and the bootstrap server about its exit. involuntary exit is a bit more difficult to handle. there are two possible cases: 1. the exiting node is not a source of inflow to any other peer in the network. 2. the exiting peer is the only source of inflow to certain peers, in which case the streaming at the orphaned nodes stops. in case 1, there can be no disruption. though disruption may occur in case 2, an orphaned peer has to remember the packet-id which it requested in the recent past. the peer just issues a new request for the lost packet-id after realizing that the parent has left. assume that it takes t units of time for transferring a packet between a sender and a receiver. if there is no response for 2t units of time, then the peer compares the "req-id" with the id of the last packet it has received. if the requested id is less than the last packet id it has received, then the requester assumes that the parent node is either dead or does not have the packet. so, it creates another "req-id" for the same packet to a different peer. it means that the non-availability of one packet may lead to a delay of at most 2t + t = 3t time units. if a peer p experiences a delay of timeout > 2t units for a response from any of its parents p(p), then p reports p(p) to the bootstrap server bs. bs then initiates an "are you alive?"
message to p(p). if there is no response for a time exceeding 2t, then p(p) is assumed to be dead. the bootstrap server removes the non-responding node from the list of active peers and informs the child node p about the loss of its parent so that it could update its parent list. note, however, that no global parent list exists; a parent only stores the list of its children. hence, if a node exits involuntarily, then none of its children gets any information about the parent's exit. every child node has to figure out on its own about the failure of its parent. suppose a node is left with only one parent, who also fails abruptly. the orphaned peer must then request the bootstrap server to assign a new set of parents, which may incur a delay. we used a scheme of proactive allocation of parents to reduce the delay for allocating parents to the orphaned nodes. once a child notices that it has fewer than k/2 parents, it requests the bootstrap server to allocate at least one extra parent. seeking one additional parent is to restrict the effect of a "flash exit", which occurs when a talk (presentation) is about to end. many peers start leaving the system at once. the number of parents for each existing peer falls below k/2 at a faster rate. therefore, at that point, it might not be possible to satisfy the requirement of k parents. in the worst case, only a source node and a single peer may be available, and all other nodes may have departed. in this case, the source could be the only parent of the remaining peer. every peer is aware of the source node's (the speaker's device) address. whenever a listener wants to initiate a query, he/she clicks on the "ask doubt" button provided in the user interface of the application. it establishes a direct connection with the speaker's device. the speaker gets a notification of the query and may send an acknowledgment. the peer device is allowed to send the query after receiving the acknowledgment.
the doubt or the query should be in the form of an audio message, as in a physical presentation. the message is unicast on the link between the source and the peer which initiated the query. the child peers of the requester would then pull the data while the source pushes the data to its other children. thus, the query is propagated in a push-based manner as it happens in the tree-based approach. it guarantees that no query can be resolved by the speaker before it is asked, thereby preserving the causality relationship. the shared whiteboard follows a push-based approach for data propagation, as opposed to the pull-based approach followed by live streaming. there is no parent-child relationship between the nodes. all adjacent nodes are neighbors. the packet structure of the live board appears in fig. 5. a shared whiteboard's packet consists of a packet-id, a label-id, the packet length, and the data. the packet-id is composed of the client-id and the sequence number. the packet-id field uniquely identifies a packet. the client-id is the first 8 bytes of the sha-1 hash of the ip address of a user. the client-id ensures that a packet delivery does not occur to the same peer from whom it is received. the shared whiteboard supports multiple pages. each page gets a unique label-id. the operations performed on every page are stored separately. the label-id makes the canvas repainting task easier on the receiver side. it also allows maintaining a consistent view for both the sender and the receiver. due to the push-based approach, there may be data redundancy at the receiver side. so, the old packets are marked and then discarded. propagating text data is not an intensive operation. therefore, even with the push-based approach, it does not contribute much to the overhead. the user interface of the whiteboard application provides many operations like basic shapes, colors, and free-pen. among these, free-pen is the most time-consuming operation.
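The whiteboard packet layout above (client-id from the first 8 bytes of the SHA-1 of the IP, a sequence number, a label-id, and a length-prefixed payload) can be encoded as in the sketch below. The exact field widths of the sequence number, label-id, and length are not given in the text, so the 4-byte big-endian fields here are assumptions for illustration.

```python
import hashlib
import struct

def client_id(ip: str) -> bytes:
    """First 8 bytes of the SHA-1 hash of the peer's IP address."""
    return hashlib.sha1(ip.encode()).digest()[:8]

def pack_board_packet(ip: str, seq: int, label_id: int, data: bytes) -> bytes:
    """Assumed layout: 8B client-id | 4B seq | 4B label-id | 4B length | data."""
    return client_id(ip) + struct.pack("!III", seq, label_id, len(data)) + data

def unpack_board_packet(buf: bytes):
    """Inverse of pack_board_packet: split the header fields and payload."""
    seq, label_id, length = struct.unpack("!III", buf[8:20])
    return buf[:8], seq, label_id, buf[20:20 + length]
```

A receiver can compare the unpacked client-id with its own to avoid delivering a packet back to the peer it came from, and use (client-id, seq) as the unique packet-id.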
a single free-pen operation generates many events. sending each event individually over the network consumes much bandwidth and degrades the system performance in the presence of moderate to heavy network traffic. we used a buffered approach to improve the performance of free-pen. the content of the entire buffer is sent if either the buffer is full or a mouse release event occurs. at the receiver side, the operations are replayed on receipt. the events get repainted in a sequential manner only. the size of the buffer depends on the packet size or maximum transmission unit (mtu). maintaining a consistent view across all peers is most important for the shared whiteboard application. the system maintains a separate list for every label-id. whenever a new packet arrives, after extracting its label-id, the packet is placed in that label-id's list at the appropriate position according to its sequence number. the algorithms for packet generation, sender, and receiver processes are provided in algorithms 3, 4, and 5, respectively. a push-based approach causes packet losses and network delays. it may lead to an inconsistent view of the board among the peers if the replay at a receiver happens without considering the delayed packets. for handling delayed packets, at first, each packet is placed at its correct relative position among the other packets. then the repainting is done using the label-id and the sequence number in the packet. it is not a compute-intensive operation. the repainting is also very quick, as most computers can perform more than 10^8 operations per second. furthermore, the repainting is performed only for a single page of the board at a time. a peer joining late may get an inconsistent view of the shared whiteboard. we used a combination of push and pull to solve this problem. if the current label-id does not match the peer's label-id, then a comparison is performed with the sequence numbers.
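The buffered free-pen approach above can be sketched as follows. This is an illustrative sketch rather than the paper's implementation: the class name, the callback-style `send` parameter, and the event-count capacity (the text sizes the buffer by packet size/MTU instead) are assumptions.

```python
class FreePenBuffer:
    """Sketch of buffered free-pen: drag events accumulate and are sent
    as one packet when the buffer fills or the mouse is released."""

    def __init__(self, send, capacity=64):
        self.send = send          # callback that transmits one batch
        self.capacity = capacity  # events per packet (illustrative; text uses MTU)
        self.events = []

    def on_drag(self, x, y):
        self.events.append((x, y))
        if len(self.events) >= self.capacity:
            self._flush()

    def on_release(self):
        # a mouse-release always flushes whatever is buffered
        self._flush()

    def _flush(self):
        if self.events:
            self.send(list(self.events))
            self.events.clear()
```

The receiver replays each batch in order, so one network send replaces dozens of per-event sends during a single stroke.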
if the current label-id or the sequence number is a positive number, then it is safe to assume that data propagation has already started. hence, a peer joining late sends requests (pulls) for the previous data from one of the neighbors. each operation belonging to a page is tagged with the corresponding page number. a page may as well be left blank intentionally. in other words, the number of operations performed on a page may be zero or a positive number. in a pull request, a peer first requests the data of the page in the current view, which can be extracted with the help of the label-id. therefore, a pull request must include the label-id. the responder replies with the number of operations performed on the page. the receiver then verifies the reply. if the number of operations performed on that page is zero, no further request is needed. otherwise, a request for the remaining operations would be made. these operations are requested one at a time in a pull-based manner up to the latest sequence number. only after the late joiner receives all the packets does his/her session get activated. an explicit deferred joining process is necessary to avoid the inconsistent intervention of a late joiner. to understand why, consider a user who joins half an hour after the start of a session. suppose the user immediately starts some operations on the first page of the canvas while the user's device is still in the process of receiving data from other peers. such an intervention by the user leads to an overwriting of the previous data. therefore, we defer the activation of the late joining peer until the data synchronization is complete. another solution could be to enable the group-undo feature in the system, i.e., every undo command gets propagated to the entire group, and the required changes will be done on every peer's canvas.
even if the user joins the system before the sync is complete and starts scribbling, we could use a group-undo command to undo those operations. however, the group-undo feature is not included in the current work. a screenshot of the video streaming along with an illustration on the shared whiteboard is given in fig. 6. the presenter's video appears at the bottom-right corner while she explains the pythagoras theorem. the screenshot in fig. 7 depicts the user's control panel for video streaming combined with the shared whiteboard. before starting a new session the user generates and verifies a new session key to indicate a new presentation session. the presenter can either use stored content (video or audio) or start a new streaming session. by sharing the screen, a user (the presenter) can initiate a multiway presentation with a group of listeners using video and the shared whiteboard. p2p interaction on stored material is another important aspect of our system. it relies on an efficient, robust file sharing and searching component that incorporates many additional features. the most important among these features is tagging selected parts of audio, video, and pdf files. a participating peer can raise queries by creating annotations and postings of the tagged portions of media and document files. a peer may also give comments on posts for the resolution of queries. a stored medium may be streamed tightly coupled with a shared whiteboard for live discussion, much like a live meeting or a brainstorming session. therefore, the p2p-ps on stored contents is a concomitant part of the overall p2p-ps platform. it helps in reflective learning and even in creating a mash-up presentation using stored repositories. the file sharing system is implemented using de bruijn graph overlays. the stored contents are organized into a dht using de bruijn graph-based overlays. a de bruijn graph is a labeled directed multigraph with fixed out-degree k.
every node of the graph has an id, or a label, of fixed length. let the length of each id be d and the size of the alphabet σ be k. the de bruijn graph b(k, d) of n = k^d nodes can be constructed as follows. each node is connected by an outgoing edge to k other nodes. there is an outgoing edge from a node a to a node b if id_b can be created by applying a left shift to the label id_a and appending one symbol from the alphabet σ to the rightmost position, i.e., id_b = id_a[1:] + a, where a ∈ σ. if labels are considered as base-k numbers, then the outgoing edges of a node with label id_a go to all the nodes with labels equal to (k · id_a + a) mod k^d, for a ∈ σ. figure 8 illustrates an example of a de bruijn graph b(2, 3). routing in a de bruijn graph is specified as a string. let s and d denote the source and destination nodes with labels id_s and id_d respectively. then the string for the routing path from s to d is obtained by appending to id_s the part of id_d left after removing the longest suffix of id_s that is also a prefix of id_d (1). for example, assume that a look-up for destination node 1011 is initiated by a source node with label 1110. the maximum overlap of a suffix of 1110 with a prefix of 1011 is 10, as shown in fig. 9. hence, the string for the routing path is 111011, i.e., 1110 appended by the non-overlapped part 11 of the destination node's id. the next hop v is derived from the current hop u as follows: 1. v is a neighbor of u in the input de bruijn graph. 2. the longest suffix of v that equals a prefix of id_d is one symbol longer than the longest suffix of u that equals a prefix of id_d. this routing scheme is known as substring routing. the structure and routing method indicate that the diameter of the de bruijn graph is d = log_k n. it has low clustering and exhibits (k − 1)-node-connectivity. k-node-connectivity is not possible due to the self-loops on the k nodes with ids of the form "αα . . . α", for α = 0 to k − 1. these nodes can be linked together to form a ring, which makes the graph k-regular and also achieves k-node-connectivity. 
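the substring routing described above can be sketched in a few lines; this is a minimal illustration of the suffix-prefix overlap, not the paper's implementation.

```python
def routing_string(src, dst):
    """Route string: src extended by the non-overlapping part of dst."""
    d = len(src)
    for k in range(d, 0, -1):          # try the longest overlap first
        if src.endswith(dst[:k]):
            return src + dst[k:]
    return src + dst                   # no overlap at all

def hops(route, d):
    """Each hop is a window of width d sliding along the route string."""
    return [route[i:i + d] for i in range(len(route) - d + 1)]
```

for the example in the text, `routing_string("1110", "1011")` yields `"111011"`, and the hop sequence is 1110 → 1101 → 1011, each step a left shift with one appended symbol.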
k-node-connectivity makes the graph more fault-tolerant: even the failure of any (k − 1) nodes cannot disconnect the graph, and the diameter remains at most d + 1. loguinov et al. [33] showed that the expected congestion in a de bruijn graph is much less than in its counterparts under a similar load, due to the larger bisection width of the graph. the de bruijn graph possesses better asymptotic degree-diameter properties compared to some of the widely used dhts, such as chord [48], trie [19], can [45], pastry [46] and butterfly [37]. tables 1 and 2 provide a summary of comparisons in terms of degree and diameter from the analysis made by loguinov et al. [33]. the blank cells in table 2 indicate that the corresponding node degrees are not supported. table 3 compares the average distance between the nodes in a de bruijn graph to that of the optimal moore graph [33] with the same degree k; in a de bruijn graph it remains very close to the optimal values even for small values of k. loguinov et al. [33] proposed some guidelines for the incremental construction of a de bruijn graph. however, the paper falls short of an actual implementation, as it does not address the problem of maintaining the de bruijn structure in the presence of churning, where churn refers to the dynamicity of nodes leaving and joining. for our implementation, we choose the parameters k = 8 and d = 8 for the de bruijn graph. it allows around 16 million nodes inside the network, labeled "00000000-77777777" in octal strings, while the diameter of the de bruijn graph remains constant (8, to be precise). the dht overlay in our system supports efficient look-ups, and an epidemic dissemination-based protocol can be designed to infect all the nodes in just a few rounds. we refer to the nodes in the underlying de bruijn graph as "virtual" nodes. for maintaining the underlying graph, a physical node in our system is responsible for a range of virtual nodes with consecutive ids. 
the range of a physical node in our system is referred to as its zone. initially, when a single physical node joins the dht, it becomes responsible for all virtual nodes from 00000000 to 77777777. the virtual id-space may be visualized as a ring, where each physical node is responsible for only an arc segment of the ring. figure 10 illustrates an example of three physical nodes responsible for three different zones. the structure is similar to chord [48]; however, unlike the chord overlay, a physical node responsible for an id-space arc may be located anywhere within it. a node a has an outgoing edge to another node b if there is at least one virtual node in a's zone having an outgoing edge to a virtual node in b's zone. each node keeps a list of its outgoing and incoming edges; for each node in these lists, the address and zone id are known. the details of the structure maintained at each node are given in table 4. if a node a tries to join, it selects a random id and forwards a join request to identify the owner of the zone in which the chosen random id falls. for convenience of description, we use the following convention: any intermediate node receiving the join request initiated by a is referred to as b, while the node whose zone contains the random id chosen by a is node c. the problem of joining is split into three parts, according to the actions of a, b and c, as explained below. node a requests the rendezvous server (whose public ip address is known to all) to provide the external (ip) address of a and a list of random peers. a picks one peer randomly from the list supplied by the rendezvous server and sends a join request to that peer. the request contains a's external address and a random virtual id from the entire id-space, i.e., 00000000 to 77777777. the join request initiated by a tries to identify c, the owner of the random id sent in the join request. 
when node a sends such a request for the first time, its id is chosen using the sha-1 hash value of its external address; upon retries, a random id from the entire id-space is picked. on receiving the join request, b forwards it to the next node on the routing path. if a routing path is not provided in the request, b creates one using b's id and the destination id given in the join request; the correct path is then sent along with the join request. on receiving the join request from a (possibly through intermediate nodes), c sends the information regarding its zone id and its incoming and outgoing links. c does not accept further join requests until the joining of a is complete or a timeout of 10 seconds occurs. the reply from c contains a structure comprising the virtual node label of c, the external address of c, and the outgoing and incoming edges of c. c then waits for a to complete the joining process and send information about a's new zone, after which c updates its own zone by sending the keys and values to be managed by a, notifying the neighbors about the change in zone, and dropping the edges destroyed by the shrinking of its zone. next, a picks the half of the zone not containing c's id and chooses a random id (label) from the picked half as its own id. a sends its id and zone information to c, and the attachment of node a to the dht overlay is complete. a then performs the following two actions to complete the joining process: (i) disseminate its id and zone information to all the links shared by c; (ii) identify the links to be dropped, check whether there is any new link to c, and make the corresponding changes to the lists of incoming and outgoing edges. finally, a accepts the load handed over by c. due to space limitations, the algorithms are not included here; interested readers can review the send join request and receive join request algorithms in appendix a and appendix b, respectively. 
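the zone split performed by the joining node can be sketched as below; the representation of a zone as an inclusive integer range and the function name split_zone are assumptions for illustration.

```python
import random

def split_zone(zone, c_id, rng=random):
    """Joining node a takes the half of c's zone not containing c's id,
    then picks a random virtual id from that half as its own label."""
    lo, hi = zone
    mid = (lo + hi) // 2
    if c_id <= mid:
        c_zone, a_zone = (lo, mid), (mid + 1, hi)   # c keeps the lower half
    else:
        c_zone, a_zone = (mid + 1, hi), (lo, mid)   # c keeps the upper half
    a_id = rng.randint(a_zone[0], a_zone[1])        # a's new label
    return c_zone, a_zone, a_id
```

after the split, the keys falling in a_zone would be handed over from c to a, as described above.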
to keep the example small, we show the node join process in a de bruijn graph with k = 2 and d = 4. figure 11a and b illustrate the joining process of the first two nodes in the system. the process of joining of the third node is shown in fig. 12a; notice that the joining of the third node requires the removal of the (dashed) link a −→ b. for the joining of the fourth node, two new links are to be inserted, as shown by the dotted lines in fig. 12b. now consider the leaving of a node a from the system. a node c is identified that can merge the zone of a into its own zone. there are two parts to the leave process, as the leaving node is aware of both its predecessor and its successor in the dht overlay. a identifies a successor zone and a predecessor zone, picks one of the two ids, finds the respective owner of that zone, and sends a leave request. the leave request contains the information regarding the id, zone, and incoming and outgoing links to the chosen neighbor, say c. if c agrees within a timeout period of 5 seconds, a sends its load to c and the leave process is complete. otherwise, a tries the same with the other neighboring node. if the timeout occurs again, the entire procedure is repeated, assuming the chosen nodes were busy with another leave procedure. algorithm 8 in appendix c specifies the precise steps executed by a. node c, on receiving the request, checks whether it is itself about to leave the system. if not, it accepts the request and agrees to take over the load by merging the two zones. the merging process includes merging the incoming and outgoing links as well. c then notifies all the linked nodes regarding the update. algorithm 9 in appendix d gives a step-wise description of c's operations. during join and leave, nodes whose zones change notify the nodes linked to them by outgoing or incoming edges. when such a notification is received, a node adds or drops the links affected by the change. 
every two minutes, each node sends a keep-alive message to all linked nodes. these nodes update the timestamp along with the zone information corresponding to that node. if no update is received in the last five minutes for a linked node, it is considered dead, and the owner of the successor zone takes responsibility for the orphaned zone. if there is an outgoing edge from a virtual node in one zone to a virtual node in another zone, then there exists an outgoing edge from one physical node to the other. on many occasions, the zones of neighboring nodes change; a brute-force search for an edge in the underlying graph would slow down the join and leave operations. when node a receives a zone update from another node b, a checks whether the size of its zone is larger than or equal to n/k. our implementation supports two types of searches: (i) by id/key, and (ii) by keywords. a user can share one or more directories which contain the file(s). by default, at least one directory is shared, which may be empty. this directory is located in the "downloads" folder of the user's system; anything downloaded from the p2p system is stored in this directory and is automatically shared. we calculate the 160-bit sha-1 value of the contents of a file and use it as a key. since each node has a 24-bit label, the first 24 bits of a key are used for routing purposes. for example, the first 24 bits of the sha-1 hash value "2fd4e1c67a2d28fced849ee1bb76e7391b93eb12" are "2fd4e1", which is equal to "13752341" in octal. at the start of the application, and every 30 minutes thereafter, the directories are scanned for any new or modified files; both new and modified files are queued for processing. the sha-1 of a file is computed and stored in the local database. any deleted or renamed file is also identified; for deleted files, the related entries are removed from the local database. a set of important keywords is associated with each file to improve the chances of finding a document or file in the dht. 
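the key-to-label mapping described above (k = 8, d = 8, so labels are 8 octal digits, i.e., 24 bits) can be reproduced in a few lines; the function names are illustrative.

```python
import hashlib

def file_key(data: bytes) -> str:
    """160-bit SHA-1 of the file contents, as a 40-char hex string."""
    return hashlib.sha1(data).hexdigest()

def routing_label(sha1_hex: str) -> str:
    """First 24 bits of the key, written as an 8-digit octal label."""
    first_24_bits = int(sha1_hex[:6], 16)   # 6 hex digits = 24 bits
    return format(first_24_bits, "08o")     # 8 octal digits = 24 bits
```

applied to the example in the text, `routing_label("2fd4e1c6...")` gives `"13752341"`, which falls in the label space "00000000-77777777" of the overlay.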
we provide support for extracting text from pdf files and from video files that come with subtitles. if a pdf was created from scanned copies, the text is extracted using an ocr tool. once the text has been extracted, a tf-idf scheme is used to identify the top 100 keywords for a file from its content. in this scheme, the file is considered to be a set of documents of 1000 words each. a term is assigned a higher score if it has a larger frequency than other terms but appears in fewer documents; the scoring scheme is, therefore, based on inverse document frequency. these keywords are stored in the local database along with the file's modification time, which can be used to avoid processing the file again. for all keywords, sha-1 values are also computed to obtain the keys. users can also manually add up to 10 keywords. a key is a globally unique id of an object (e.g., a sha-1 value), and the value contains the name, size, and address of the object and the timestamp of the last refresh. the keywords are stored as metadata and maintained in the local database at the external nodes of the overlay; the purpose is to enable keyword-based search for related document and media files. we need two basic operations, namely put and get. for each file, multikeyput is used to insert the key-value pairs of all keywords into the system. the keywords also include the words in the file name. the associated value stores information such as the sha-1 hash, size, name, and address of the file. every 30 minutes, multikeyput is called for the keywords of each file, thus refreshing the timestamp of the keys at the corresponding nodes. a node checks a key's timestamp every 10 minutes; if a particular node has not updated a key in the last 60 minutes, the corresponding key-value pair is considered unavailable and is dropped from the local store. 
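a minimal sketch of the tf-idf keyword extraction along the lines described above; the 1000-word pseudo-document split matches the text, while the +1 smoothing inside the logarithm is an assumed (common) choice to keep scores nonzero for short files.

```python
import math
from collections import Counter

def top_keywords(text, top_n=100, doc_len=1000):
    """Score terms by frequency weighted with inverse document frequency,
    treating the file as a set of doc_len-word documents."""
    words = text.lower().split()
    docs = [words[i:i + doc_len] for i in range(0, len(words), doc_len)]
    tf = Counter(words)                  # term frequency over the whole file
    df = Counter()                       # document frequency per term
    for doc in docs:
        df.update(set(doc))
    n_docs = len(docs)
    score = {w: tf[w] * math.log(n_docs / df[w] + 1) for w in tf}
    return [w for w, _ in sorted(score.items(), key=lambda kv: -kv[1])[:top_n]]
```

a term that is frequent overall but concentrated in few of the pseudo-documents thus ranks highest, matching the scheme above.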
the delay of 60 minutes accounts for a possible lost update in the network. search operations are performed one at a time, and a user can cancel a search at any time. since there can be delayed replies, a unique search id is associated with each get request to distinguish the results; any node returning results must provide the search id along with the result. packets received for the current search id are kept, and all others are discarded. a query is broken down into keywords, and the search is conducted with the multikeyget procedure. for any reply that arrives while the search is active, the ranking of the results may change. a file containing more of the query keywords gets a higher rank; files with the same number of query keywords are distinguished by their replication/popularity in the system, with the more popular result ranked higher. a user might receive the id or key of a file from another user via some communication channel. if the user wishes to download the file with a given id, he/she can enter it in the get file dialog, which calls the get method; on receiving the results, a download request is initiated. when results are fetched using the above search methods, a user can choose to download a file. if the chosen file is available from multiple peers, a multi-threaded approach to downloading is used: different chunks are requested from multiple peers and written to the file as they are downloaded. most pdf viewers have tools to highlight a portion of the text, but any editing of the content changes the hash value of the file. modification of the hash value is undesirable, as files in a p2p learning environment are archival in nature; therefore, such features should be provided without editing the file itself. furthermore, users may also want to see portions highlighted by others, and some users may wish to tag some text or provide a hyperlink to an external resource. 
all these operations are grouped as peer learning activities that make no change to the contents. we therefore defined a general format for annotations, which serves as a template for the other formats. the following information is stored in an annotation: -annotation id: a unique id to identify an annotation. the annotations are stored separately in the local database and synchronized among peers. the "properties" field stores the properties of specific types of annotations. the text field may be in the form of hypertext, as support for links and images is provided. a user can create annotations for personal use or share them with others to spread useful information. a shared annotation is a post that can be discussed among the participating peers, so we integrated a discussion forum along with the dissemination of annotations. highlighting in pdf documents can be done by selecting either text or a rectangle, and we define the properties of each type separately. text selection provides the flexibility of selecting a specific word, a set of words, or a set of lines on a particular page. if a selection spans multiple pages, the user should select words or lines on each page separately. the user may also provide a comment or add some text related to a highlighted area; the additional text/comments are stored with the annotation. since each annotation of this type covers only one continuous region of text, we define the following properties for an annotation. video annotations being similar to audio annotations, we do not distinguish between the two. a user can select a duration of the video and tag it. the two end-points of a duration are rounded to the nearest integers, and a duration of at least 5 seconds is selected. the properties for such a selection are: starttime (the time at the start of the duration) and endtime (the time at the end of the duration). 
a post inherits the format of the corresponding annotation and adds an additional field for a title. a post also has the constraint of a minimum (resp. maximum) of 100 (resp. 1600) characters in the text field. announcements are similar to posts in the discussion forum; they are useful for posting information about the locations of newly added documents. the sharing of announcements is carried out in a similar fashion: they are forwarded to other peers as soon as they are received. on the front-end, announcements appear separately from other posts. posts and comments are synchronized using the following two protocols: (i) an epidemic dissemination-based protocol to spread recently published posts and comments; and (ii) a reconciliation (or anti-entropy) session performed at regular intervals with the neighbors to handle missed updates, since a user will not be active at all times. the consistency requirements of the discussion platform are weaker because, once a post is made, it cannot be updated; only one person is responsible for a particular post or comment; and posts and comments need not be delivered immediately. the happened-before relation between comments and posts is maintained by their hierarchical structure; the causal ordering of other messages can only be provided through the timestamps associated with them. we assume that the clocks of the machines are synchronized with ntp servers. every node receives posts and comments through its neighbors. our application periodically checks whether any new post or comment has arrived since the last check; if so, they are sent to all the neighbors except the ones from which they were received. this ensures the delivery of messages to all. the period of this check is set to one minute. the diameter of our system is d = 8; therefore, any new message gets delivered to the entire system in less than ten minutes. 
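the epidemic dissemination step above amounts to the following; the names (disseminate, send) are illustrative, and the periodic one-minute trigger is omitted.

```python
def disseminate(new_items, neighbors, send):
    """Forward each newly arrived item to every neighbor except the
    one(s) it was received from, so messages flood the whole overlay."""
    for item, received_from in new_items:
        for nb in neighbors:
            if nb not in received_from:
                send(nb, item)
```

with a one-minute forwarding period and diameter 8, a message crosses at most 8 such hops, which is why delivery to the entire system takes less than ten minutes.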
the interval can be set to zero, in which case messages are delivered to others instantly. the reconciliation procedure executes at regular intervals of thirty minutes. it begins by randomly picking a neighbor at the start of an interval. a predefined window of seven days is used: the posts from the last seven days are sorted by time, and the first post's timestamp is picked and shared with the neighbor. on receiving the reconciliation request, the neighbor chooses all the posts that started on or after the given timestamp and sorts them by timestamp. a list of the ids of these posts and comments is created. the list is divided into chunks of 256 entries, and a hash value is calculated for each chunk based on the post/comment ids. these hash values are shared with the initiator. the initiator calculates its hash values in the same fashion, and the hash values are compared. if a hash value differs, the initiator asks the neighbor to share the ids in that chunk and receives the chunk's list of ids from the neighbor. both of them update their lists by adding the missing ids and recalculate the hash values for that chunk; the new list of hash values from the mismatched chunk onward is sent to the initiator. when the last chunk's hash matches, reconciliation is over. most of the time, only the last few chunks differ, and the number of message exchanges is low. emulab provides an environment for experiments over testbeds consisting of a large-scale distributed network. we deployed scripts for the experiments using the emulab portal to acquire both physically distributed and purely simulated nodes [49]. it suited the kind of experiments we wanted to run to establish the efficacy of our approach to building a low-latency file sharing, searching, and inline annotation framework ideally suitable for an e-learning system using a p2p organization. 
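the chunked-hash comparison at the heart of the reconciliation can be sketched as follows; the helper names and the use of sha-1 over comma-joined ids are illustrative assumptions.

```python
import hashlib

CHUNK = 256  # ids per chunk, as in the protocol above

def chunk_hashes(ids):
    """Hash each chunk of 256 sorted post/comment ids."""
    chunks = [ids[i:i + CHUNK] for i in range(0, len(ids), CHUNK)]
    return [hashlib.sha1(",".join(c).encode()).hexdigest() for c in chunks]

def mismatched_chunks(mine, theirs):
    """Indices of chunks whose hashes differ; only these need exchanging."""
    h_mine, h_theirs = chunk_hashes(mine), chunk_hashes(theirs)
    n = max(len(h_mine), len(h_theirs))
    return [i for i in range(n)
            if i >= len(h_mine) or i >= len(h_theirs) or h_mine[i] != h_theirs[i]]
```

since ids are sorted by time and new activity clusters at the end, usually only the last few chunks mismatch, which keeps the number of message exchanges low.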
as far as whiteboard sharing is concerned, our primary motivation was to find out how a p2p learning system may work in a lan environment and support up to a maximum of 250 nodes. we carried out simulation experiments with up to 1000 nodes. the first experiment examines the stability of our system. during the process of joining, as explained in section 4, a peer gets a list of log_2 n active peers. once the bootstrap server returns the list, the peer sends an "adopt me" request to all the active peers in the list. a peer receiving a join request can either accept or discard the request based on its fanout value. running simulations on up to 1000 nodes, we found that the choice of log_2 n leads to a stabilization of the network as the size of the overlay increases. the maximum path length increases with the number of nodes but later stabilizes: as shown in fig. 13, from 700 to 1000 nodes the maximum path length has stabilized at 6 units. the mesh overlay might get disconnected in the presence of churn. a node with in-degree 0 has no parent and hence cannot receive data until it finds at least one parent, so the stabilization of the overlay is essential in the presence of churning; only one node, the source node, should have in-degree 0. we performed simulations on 1000 nodes for churn rates of 10%, 20%, and 30%. the results appear in fig. 13b, where the y-axis denotes the number of nodes having in-degree equal to zero and time is measured in seconds. we observed that even with a 30% churn rate, the overlay stabilizes within 5 seconds. table 5 depicts the minimum throughput values obtained for different numbers of nodes. 
here, throughput is defined as the number of packets received in one second. we obtained these results by performing experiments on emulab, in which each node receives 5000 packets from its parent nodes. for a packet size of 1400 bytes and a streaming rate of 2 mbps, the number of packets generated per second is 179; hence, 5000 packets are generated in 5000/179 = 27.93 seconds. table 5 shows that our results are close to the theoretical values; the maximum and average packet delivery times are shown in fig. 14a and b, respectively. the maximum time to deliver packets reflects the delay between the source node and the farthest node. in terms of the practical utility of the system, we need to ensure that the absolute latency experienced by a listener in live streaming is low. we therefore calculated the observable latency at a listener device for a varying number of nodes. the plot in fig. 15 shows that the average latency varied between 10.5 ms and 12.5 ms, while the maximum latency varied between 31 ms and 37 ms. since we used a modified mesh-based architecture for bittorrent-swarm-type dissemination of video packets, it allowed us to keep the latency within the desired range. the de bruijn graph has diameter and out-degree equal to eight. at the application layer, the diameter does not exceed eight; however, the out-degree can vary according to the size of a zone. in theory, the maximum out-degree is less than k × o(log_k n) with high probability, where n is the number of nodes in the system. for n = 800,000 nodes, the maximum out-degree among all runs was 41 (see fig. 16). we experimented with different values of n and found the average out-degree to be 7.99; the gap from 8 is due to the absence of self-loops. the graphs in fig. 16a and b show the median values of the maximum out-degree of a node, and the simulation results match the theoretical results: the median value of the maximum out-degree is 31 when n = 100000. 
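the packet-rate figures quoted above follow from a short back-of-the-envelope calculation:

```python
# 2 Mbps stream split into 1400-byte packets.
STREAM_RATE_BPS = 2_000_000        # 2 Mbps
PACKET_BITS = 1400 * 8             # bits per packet

packets_per_second = STREAM_RATE_BPS / PACKET_BITS   # ≈ 178.6, i.e. 179
seconds_for_5000 = 5000 / 179                        # ≈ 27.93 seconds
```

this is the baseline against which the measured throughput in table 5 is compared.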
figure 17 shows the distribution of out-degrees. very few nodes have degree > 2 × k: only 380 out of 100000 nodes have a degree of more than 16, which is only 0.38% of the total, and no node has out-degree > k × log_k 100000. in an equivalent chord implementation, the average out-degree is o(log n), which is > 20. our experiments also determined the minimum in-degree to be 7 and the maximum in-degree to be 8. another experiment, with n varying from 100 to 100000, was performed in which each node queried for 10 random keys. the results, plotted in fig. 18, show that the average number of hops per query stayed well below log_k(n). the distribution of the access count of a node for n = 100000 and a million queries is shown in fig. 19. for n = 100000, the average hop count is 5.51; that is, on average, 5.51 nodes were accessed per query. the distribution shows that only 1741 nodes (or 1.74% of the nodes) were accessed more than 2 times the average, and only 89 nodes (or 0.089%) were accessed more than 3 times the average access count. the maximum number of times any node was accessed was 250. any node in this system supports more than 1000000 routing queries per minute, or more than 17000 routing queries per second. this means the system can support at least 68 times more query workload (or 680 queries per second per node) without any degradation in performance. we also carried out experiments on emulab to find the maximum latency and the success rate of lookups. in the experiment, the nodes were connected in a lan environment. every node randomly picked twenty-five words from a list of three thousand words available to everyone. after a delay of five minutes, the nodes sent out queries for the same set of words. the boot-up times of machines in emulab differ by up to ±5 minutes. 
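the reported average hop count can be checked against the routing bound log_k(n):

```python
import math

n, k = 100_000, 8
bound = math.log(n, k)     # log_8(100000) ≈ 5.54

# The measured average of 5.51 hops per query sits just below this bound,
# and well below the fixed diameter d = 8.
assert 5.51 < bound < 8
```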
hence, the join procedure for the de bruijn overlay network would have involved a transfer of load for most of the nodes. the results, calculated for varying numbers of nodes, are shown in table 6; they indicate that none of the lookups failed, even during the dynamic joins in the setup. in another experiment, the nodes were allowed to leave the system with 10% probability after every three minutes. our approach achieved a success rate of 99.39% and a maximum latency of 52 ms for 200 nodes. assuming that the human tolerance limit is about 200 ms, the response is quite good. the scope for building innovative p2p applications is thus enormous; however, this paper focuses on just two key application spaces, namely communication and collaboration. instant messaging [27], voip services [10], p2p-sip [9], social networking [41] and media streaming [54] are among the popular p2p applications in the communication space. most deployments of p2p live streaming systems [24, 31, 39, 56] use mesh-pull architectures. some recent work has focused on analytical modeling and the efficiency of live streaming on p2p networks. three different variations of the multi-request mechanism have been presented in [55]; it concludes that there is a performance gap between analytical modeling and simulation which can be narrowed by evolving a realistic strategy for pull requests, and it may provide comparative performance for a push-based mechanism. in a recent work, terelius and johansson [52] investigated p2p networks for efficient live streaming. they provided a theoretical analysis of the topology convergence of a p2p network in the presence of churn: the network graph converges to a complete gradient overlay network. interestingly, our proposed modified mesh network employs a modified packet reservation strategy akin to the concept of a gradient network. bittorrent [42] also uses mesh-pull, where each peer advertises the chunks of the stream it has in its cache. 
the advertisement is in the form of buffer maps. the peers use buffer maps to enhance viewing quality, start-up latency, and bandwidth utilization, among others. in the context of p2p learning, we use a modified mesh architecture, which is presented in section 3. in the collaboration space, content distribution is a crucial application. in this paper, we are interested in the most basic form of content distribution, i.e., file sharing and searching [36, 40, 42, 47]. two fundamental problems in file sharing are (i) locating peers and (ii) finding a route between a pair of peers using proximity neighbor selection. distributed hash tables (dhts) [48] were introduced to handle the problems mentioned above. efficient proximity neighbor selection is found to be highly effective when used in dhts with prefix-based routing like pastry [46] or tapestry [57]. in this paper, we use de bruijn graph-based overlays, which also employ a prefix-based routing scheme. de bruijn graph-based dhts possess better degree-diameter properties compared to other dhts such as chord [48], pastry [46], trie [19], can [45] and butterfly [37]. furthermore, as indicated by emulation, both the average and maximum out-degrees are low even for moderately large de bruijn graphs. some guidelines are available in the literature for de bruijn graph-based implementations of dhts [17, 20, 33]; however, they do not address the maintenance of the dht in the presence of churn (membership change over time due to peers leaving and joining unpredictably). the main focus of this paper is p2p technology in e-learning. a majority of e-learning platforms are aided by a centralized core or cloud server [16, 28], because querying learning objects is far more complicated than searching for or fetching a document or a music file. a critical aspect of the research on p2p e-learning is peer-to-peer data management and providing an infrastructure for complex queries on educational objects [38, 51]. 
the edutella [38] project used the w3c metadata standard rdf to provide an infrastructure on which complex queries are possible over p2p networks based on the jxta platform [22]. piazza [51] addressed the problem of sharing semantically heterogeneous data in a distributed manner. some p2p architectures in the e-learning space are meant for facilitating the download of large files in the background through a lightweight p2p core [43], which integrates a small bittorrent-based self-producing component. other applications of p2p technology in e-learning have focused on load balancing by creating an integrated model consisting of p2p and client-server components [30]. the idea is to separate the server functions from the learning content: the functions of the server are distributed in a p2p manner, while the peers manage the learning contents and provide them to the learners, so the computation and storage costs are distributed. since learning objects are not stored on the user's machine, copyright issues do not arise; however, the system cannot be considered a p2p model for e-learning. the ieee/acm 2013 computer science curricula listed the challenges in creating and maintaining the heterogeneous test-beds needed for research, teaching, and learning concerning computer network technologies on a global scale. the global environment for network infrastructure [12] and future internet research and experiments [21] are two important learning projects which focused on the near-ubiquitous availability of heterogeneous test-beds for carrying out experiments on future internet technologies for learning. the forge [29] toolkit leveraged both fire and geni to develop learning material, providing an eco-system for teaching and self-learning using tools and experiments available under open policies. for teaching and experiments related to computer science courses in universities, the seattle project [4] offered access to planetlab for deploying and testing student assignments. 
No e-learning platform exists, to the best of our knowledge, that provides an environment anywhere close to the P2P-based presentation system that mimics peer interactions in a live presentation and allows learners to tag and annotate learning material, raise queries, etc. The availability of a P2P interactive whiteboard, in conjunction with live streaming, gives users a feel of physical attendance at a remote presentation. The only proposal that comes somewhere close to our system is by Priyankara et al. [44]. It essentially proposes a P2P content sharing mechanism, or a natural extension of file sharing and searching capabilities, for learning material. It is evident from the related literature that most e-learning platforms using P2P technology are intended for the monitoring or individual mentoring requirements of learners in a framework of the remote learning paradigm. We have studied two different architectures to support live streaming. Based on the analysis, we modified the mesh architecture to support dynamic fanout leveraging spare capacity whenever available. Hence, our system can also support heterogeneous nodes satisfying the minimal bandwidth requirement, as explained in Section 3. In our experience with actual sessions on the live board, we found a consistent display of results. The repainting was quite efficient at all peer nodes. The buffered approach has helped to optimize the P2P shared whiteboard's performance. Our simulations on Emulab showed that even in the presence of churn, the overlay structure for live streaming stabilizes moderately quickly. Since the maximum path length also stabilizes, the latency is bearable in live sessions of P2P interactions with live streaming. The Emulab experiments are important to establish that experimental throughput is close to theoretical throughput values. Both the current and earlier versions of the system are available from the Bitbucket links: 1. https://bitbucket.org/p2pelearning/icls/src/master/ [14]. 2.
https://bitbucket.org/p2pelearning/distro1/src/master/ [23]. As part of future work, we plan to include more tools in our shared whiteboard and enable a group-undo operation. Various video-coding techniques are available to neutralize the effect of packet loss; in the future, we would like to incorporate those techniques for live streaming. The back-end support for file sharing, searching, and inline annotation also needs to be fully integrated with a learning management system.

References:
- For students, by students. Seattle: open peer-to-peer computing
- Overleaf training resources
- SIP: network architecture and resource location strategy
- A study of ten popular Android mobile VoIP applications: are the communications encrypted?
- An analysis of the Skype peer-to-peer Internet telephony protocol
- GENI: a federated test-bed for innovative network experiments
- Everything you need to know about using Zoom
- Integrated collaborative learning software
- Skype vs. Zoom: which video chat app is best for working from home?
- Using Moodle: teaching with the popular open source course management system
- On de Bruijn routing in distributed hash tables: there and back again
- A case study of faculty experience and preference of using Blackboard and Canvas LMS
- Efficient peer-to-peer lookup based on a distributed trie
- Broose: a practical distributed hashtable based on the de Bruijn topology
- Future Internet research and experimentation: the FIRE initiative
- Project JXTA: a technology overview
- Insights into PPLive: a measurement study of a large-scale P2P IPTV system
- Student perception of Moodle learning management system: a satisfaction and significance analysis
- Educational technology best practices
- A study of Internet instant messaging and chat protocols
- Canvas LMS course design
- FORGE toolkit: leveraging distributed systems in e-learning platforms
- Proposal of e-learning system integrated P2P model with client-server model
- Stochastic fluid theory for P2P streaming systems
- A survey on peer-to-peer video streaming systems
- Graph-theoretic analysis of structured peer-to-peer systems: routing distances and fault resilience
- PRIME: peer-to-peer receiver-driven mesh-based streaming
- Mesh or multiple-tree: a comparative study of live P2P streaming approaches
- Peer-to-peer (P2P) file sharing system: a tool for distance education
- Viceroy: a scalable and dynamic emulation of the butterfly
- Edutella: a P2P networking infrastructure based on RDF
- MioStream: a peer-to-peer distributed live media streaming on the edge
- P2P networking: an information sharing alternative
- A survey on decentralized online social networks
- The BitTorrent P2P file-sharing system: measurements and analysis
- Integrating EduLearn learning content management system (LCMS) with cooperating learning object repositories (LORs) in a peer-to-peer (P2P) architectural framework
- Vyapthi: a leveraged P2P content sharing platform for distributed e-learning systems
- A scalable content-addressable network
- Pastry: scalable, decentralized object location and routing for large-scale peer-to-peer systems
- Measurement study of peer-to-peer file sharing systems
- Chord: a scalable peer-to-peer lookup protocol for Internet applications
- The scalability of swarming peer-to-peer content delivery
- The Piazza peer data management project
- Peer-to-peer gradient topologies in networks with churn
- Duolingo effectiveness
- Load balancing in P2P video streaming systems with service differentiation
- Towards the multi-request mechanism in pull-based peer-to-peer live streaming systems
- CoolStreaming/DONet: a data-driven overlay network for peer-to-peer live media streaming
- Tapestry: a resilient global-scale overlay for service deployment

key: cord-022561-rv5j1201 authors: boes, katie m.; durham, amy c. title: bone marrow, blood cells, and the lymphoid/lymphatic system date: 2017-02-17 journal: pathologic basis of veterinary disease doi: 10.1016/b978-0-323-35775-3.00013-8 sha: doc_id: 22561 cord_uid: rv5j1201

Within the marrow spaces, a network of stromal cells and extracellular matrix provides metabolic and structural support to hematopoietic cells. These stromal cells consist of adipocytes and specialized fibroblasts, called reticular cells. The latter provide structural support by producing a fine network of a type of collagen, called reticulin, and by extending long cytoplasmic processes around other cells and structures. Neither reticulin nor the cytoplasmic processes are normally visible with light microscopy, but they are visible with silver reticulin stains (e.g., Gordon and Sweet's) and sometimes with periodic acid-Schiff. Bone marrow is highly vascularized but does not have lymphatic drainage. Marrow of long bones receives part of its blood supply from the nutrient artery, which enters the bone via the nutrient canal at midshaft. The remaining arterial supply enters the marrow through an anastomosing array of vessels that arise from the periosteal arteries and penetrate the cortical bone.
Vessels from the nutrient and periosteal arteries converge and form an interweaving network of venous sinusoids that permeates the marrow. These sinusoids not only deliver nutrients and remove cellular waste but also act as the entry point for hematopoietic cells into the blood circulation. Sinusoidal endothelial cells function as a barrier and regulate the traffic of chemicals and particles between the intravascular and extravascular spaces. Venous drainage parallels that of the nutrient artery and its extensions.

• Bleeding time (template bleeding time, buccal mucosal bleeding time). This assay assesses primary hemostasis (platelet plug formation) by measuring the time interval between the infliction of a standardized wound and the cessation of bleeding. Sedation may be required. In small animals the test is usually performed on the buccal mucosa; in large animals it may be performed on the distal limb. A prolonged bleeding time may be due to a platelet function defect, von Willebrand disease, or a vascular defect. The sensitivity of this test is low; reference intervals are species and site dependent (the test can be performed on a normal animal as a control). This test is contraindicated in cases of thrombocytopenia because significant thrombocytopenia can cause a prolonged bleeding time (invalidating interpretation of test results).
• Clot retraction test. This assay assesses retraction of a clot, in which platelets play an essential role. This is a crude test that is rarely performed. Different protocols are described. Significant thrombocytopenia invalidates interpretation of test results.
• Tests to characterize platelet function abnormalities more specifically are available through specialized laboratories:
• Aggregometry, to assess platelet aggregation in response to different physiologic agonists.
• Adhesion assays, to assess the ability of platelets to adhere to a substrate (e.g., collagen).
• Flow cytometry, to assay for expression of surface molecules.
• PFA-100, an instrument that simulates a damaged blood vessel by measuring the time for a platelet plug to occlude an aperture; to date, this instrument has mainly been used in research applications.
• Thromboelastography (TEG), a global assessment of hemostasis (platelets, coagulation, and fibrinolysis) based on viscoelastic analysis of whole blood.
• Tests for immune-mediated thrombocytopenia (IMT):
• Flow cytometry, to detect immunoglobulin bound to the platelet surface using a fluorescent-labeled antibody.
• Bone marrow immunofluorescent antibody (IFA) test, to detect bound immunoglobulin. Sometimes referred to as the "antimegakaryocyte antibody test," this assay actually detects the presence of immunoglobulin nonspecifically: a smear of a bone marrow aspirate is incubated with a fluorescent-labeled antibody to species-specific immunoglobulin.

Other components of the marrow include myelinated and nonmyelinated nerves, as well as low numbers of resident macrophages, lymphocytes, and plasma cells. Of note, the macrophages play an important role in iron storage and erythrocyte maturation. The following basic concepts provide a framework for understanding the mechanisms of injury and diseases presented later in the chapter.

• Hematopoietic tissue is highly proliferative. Billions of cells per kilogram of body weight are produced each day.
• Pluripotent hematopoietic stem cells are a self-renewing population, giving rise to cells with committed differentiation programs, and are common ancestors of all blood cells. The process of hematopoietic differentiation is shown in Fig. 13-2.
• Hematopoietic cells undergo sequential divisions as they develop, so there are progressively higher numbers of cells as they mature. Cells also continue to mature after they have stopped dividing. Conceptually, it is helpful to consider cells in the bone marrow as belonging to mitotic and postmitotic compartments. Examples of developing hematopoietic cells are shown in Fig. 13-3.
• Mature cells released into the blood circulation have different normal life spans, varying from hours (neutrophils), to days (platelets), to months (erythrocytes), and to years (some lymphocytes).
• The hematopoietic system is under exquisite local and systemic control and responds rapidly and predictably to various stimuli.
• Production and turnover of blood cells are balanced so that numbers are maintained within normal ranges (steady-state kinetics) in healthy individuals.
• Normally the bone marrow releases mostly mature cell types (and very low numbers of cells that are almost fully mature) into the circulation. In response to certain physiologic or pathologic stimuli, however, the bone marrow releases immature cells that are further back in the supply "pipeline."

The composition of the marrow changes with age. The general pattern is that hematopoietic tissue (red marrow) regresses and is replaced with nonhematopoietic tissue, mainly fat (yellow marrow). Thus in newborns and very young animals the bone marrow consists largely of hematopoietically active tissue, with relatively little fat, whereas in geriatric individuals the marrow consists largely of fat. In adults, hematopoiesis occurs primarily in the pelvis, sternum, ribs, vertebrae, and the proximal ends of the humeri and femora. Even within these areas of active hematopoiesis, fat may constitute a significant proportion of the marrow volume. Immature hematopoietic cells can be divided into three stages: stem cells, progenitor cells, and precursor cells. Hematopoietic stem cells (HSCs) have the capacity to self-renew, differentiate into mature cells, and repopulate the bone marrow after it is obliterated. Progenitor cells and precursor cells cannot self-renew; with each cell division, they evolve into more differentiated cells. Later-stage precursors cannot divide.
Stem cells and progenitor cells require immunochemical stains for identification, but precursor cells can be identified by their characteristic morphologic features (see Fig. 13-3). Control of hematopoiesis is complex, with many redundancies, feedback mechanisms, and pathways that overlap with other physiologic and pathologic processes. Many cytokines influence cells of different lineages and stages of differentiation. Primary growth factors for primitive cells are interleukin (IL) 3, produced by T lymphocytes, and stem cell factor, produced by monocytes, macrophages, fibroblasts, endothelial cells, and lymphocytes. Interleukin 7 is an early lymphoid growth factor. Lineage-specific growth factors are discussed in their corresponding sections. Hematopoiesis occurs in the interstitium between the venous sinusoids, in the so-called hematopoietic spaces. There is a complex functional interplay among hematopoietic cells, the supporting connective tissue cells, extracellular matrix, and soluble factors, which form the hematopoietic microenvironment. The behavior of hematopoietic cells is influenced by direct cell-to-cell and cell-matrix interactions and by soluble mediators, such as cytokines and hormones, that interact with cells and with matrix proteins. Cells localize to specific niches within the hematopoietic microenvironment via adhesion molecules, such as integrins, immunoglobulins, lectins, and other receptors, which recognize ligands on other cells or matrix components. Cells also express receptors for soluble molecules such as chemokines (chemoattractant cytokines) and hormones that influence cell trafficking and metabolism. Iron is essential to hemoglobin synthesis and function. It is acquired through the diet and is transported to the bone marrow via the iron transport protein, transferrin. Central macrophages either store iron as ferritin or hemosiderin, or transfer the iron to erythroid precursors for hemoglobin synthesis.
Hemosiderin is identifiable in routinely stained marrow preparations as an intracellular brown pigment. However, Perls's Prussian blue stain is more sensitive and specific for iron detection. The earliest erythroid precursor identifiable by routine light microscopy is the rubriblast, which undergoes maturational division to produce 8 to 32 progeny cells. Late-stage erythroid precursors, known as metarubricytes, extrude their nuclei and become reticulocytes. Erythropoietin (EPO) supports erythropoiesis in part by inhibiting apoptosis of developing erythroid cells; the stimulus for increased EPO production is hypoxia. Within the bone marrow, erythroid precursors surround a central macrophage in specialized niches, termed erythroblastic islands. The central macrophage, also known as a nurse cell, anchors the precursors within the island niche, regulates erythroid proliferation and differentiation, transfers iron to the erythroid progenitors for hemoglobin synthesis, and phagocytizes extruded metarubricyte nuclei. Although erythroblastic islands occur throughout the marrow, those with more differentiated erythroid cells neighbor sinusoids, whereas nonadjacent islands contain mostly undifferentiated precursors. As erythroid cells mature from a rubriblast to a mature erythrocyte, their nuclei become smaller and more condensed. The nucleus is eventually extruded to form a polychromatophil. Erythroid cells also become less basophilic and more eosinophilic as more hemoglobin is produced and as RNA-rich organelles are lost during maturation. (Hemoglobin stains eosinophilic, and RNA stains basophilic with routine Romanowsky stains.) As granulocytes (e.g., neutrophils, eosinophils, and basophils) mature from a myeloblast to their mature forms, their nuclei become dense and segmented. Granulocytes acquire their secondary or specific granules during the myelocyte stage and can be morphologically differentiated starting at this stage.
Neutrophils have neutral-staining secondary granules, eosinophil secondary granules have an affinity for acidic or eosin dyes, and basophil secondary granules have an affinity for basic dyes. Monoblasts differentiate into promonocytes, with ruffled nuclear borders, and then into monocytes. In most mammals, mature erythrocytes have a biconcave disk shape, called a discocyte. Microscopically, these cells are round and eosinophilic with a central area of pallor. However, the central concavity may not be microscopically apparent in species other than the dog. Camelids normally have oval erythrocytes, termed ovalocytes or elliptocytes, which facilitate better gas exchange at high altitudes. The erythrocytes of some animals are prone to in vitro shape change, including those of cervids, pigs, and some goat breeds (e.g., Angora). Erythrocyte size during health depends on the species, breed, and age of the animal. In dogs, some breeds have relatively smaller (e.g., Akitas and Shibas) or larger (e.g., some Poodles) erythrocytes. Akita and Shiba erythrocytes also have a high concentration of potassium, unlike the erythrocytes of other dogs. Juvenile animals may have larger erythrocytes because of the persistence of fetal erythrocytes, which is followed by a period of relatively smaller cells before adult reference intervals are reached. Mature mammalian erythrocytes lack nuclei and organelles and are thus incapable of transcription, translation, and oxidative metabolism. However, they do require energy for various functions, including maintenance of shape and deformability, active transport, and prevention of oxidative damage. Red blood cells generate this energy entirely through glycolysis (also known as the Embden-Meyerhof pathway). Except in pigs, glucose enters erythrocytes from the plasma through an insulin-independent, integral membrane glucose transporter.
Within circulation, the erythrocyte mean life span varies between species and is related to body weight and metabolic rate: approximately 150 days in horses and cattle, 100 days in dogs, and 70 days in cats. When erythrocytes reach the end of their life span, they are destroyed in a process termed hemolysis. Hemolysis may occur within blood vessels (intravascular hemolysis) or by sinusoidal macrophages (extravascular hemolysis). During intravascular hemolysis, erythrocytes release their contents, mostly hemoglobin, directly into blood. However, during extravascular hemolysis, macrophages phagocytize entire erythrocytes, leaving little or no hemoglobin in the blood. Normal turnover of erythrocytes occurs mainly by extravascular hemolysis within the spleen, and to a lesser extent in other organs such as the liver and bone marrow. The exact controls are not clear, but factors that likely play a role in physiologic hemolysis include the following:

• Exposure of membrane components normally sequestered on the inner leaflet of the erythrocyte membrane, particularly phosphatidylserine.

Reticulocytes subsequently become mature erythrocytes. The normal transit time from rubriblast to mature erythrocyte is approximately 1 week. Reticulocytes start maturing in the bone marrow but finish their maturation in the blood circulation and spleen. Horses are an exception in that they do not release reticulocytes into circulation, even in situations of increased demand. Unlike mature erythrocytes, which lack organelles, reticulocytes still contain ribosomes and mitochondria, mainly to support completion of hemoglobin synthesis. These remaining organelles impart a bluish-purple cast (polychromasia) to reticulocytes on routine blood smear examination. The resultant cells are termed polychromatophils. Because older reticulocytes do not exhibit polychromasia, more sensitive laboratory techniques must be used for accurate reticulocyte quantification.
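The species mean life spans quoted above imply, at steady state, a daily erythrocyte replacement fraction of roughly 1/(life span). The figures below are a back-of-envelope illustration (ours, not from the text):

```python
# Approximate steady-state daily erythrocyte turnover implied by the
# mean life spans quoted above (fraction replaced per day ~ 1 / life span).
life_span_days = {"horse/cattle": 150, "dog": 100, "cat": 70}

for species, days in life_span_days.items():
    daily_turnover_pct = 100.0 / days
    print(f"{species}: ~{daily_turnover_pct:.2f}% of erythrocytes replaced per day")
```

So a cat, with the shortest life span listed, replaces roughly twice the fraction of its red cell mass per day that a horse does.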
When a blood sample is incubated with new methylene blue stain, the reticulocytes' ribosomal RNA precipitates to form irregular, dark aggregates. Cats also have a more mature form of reticulocyte, termed the punctate reticulocyte, which is stippled when stained with new methylene blue. Punctate reticulocytes indicate prior, not active, regeneration and do not appear polychromatophilic on routine blood smear evaluation. The neutrophil storage pool consists of a reserve of fully mature neutrophils. The size of the storage pool varies by species; it is large in the dog but small in ruminants. In homeostasis, mostly mature segmented granulocytes are released from the marrow into the blood. The first monocytic precursor identifiable by morphologic features is the monoblast, which develops into promonocytes and subsequently monocytes (see Fig. 13-3). Unlike granulocytes, monocytes do not have a marrow storage pool; they immediately enter venous sinusoids upon maturation. After migrating into the tissues, monocytes undergo morphologic and immunophenotypic maturation into macrophages. Within blood vessels there are two pools of leukocytes: the circulating pool and the marginating pool. Circulating cells are free flowing in blood, whereas marginating cells are temporarily adhered to endothelial cells by selectins. In most healthy mammals there are typically equal numbers of neutrophils in the circulating and marginal pools. However, there are threefold more marginal neutrophils relative to circulating neutrophils in cats. Only the circulating leukocyte pool is sampled during phlebotomy. The concentration of myeloid cells in blood depends on the rate of production and release from the bone marrow, the proportions of cells in the circulating and marginating pools, and the rate of migration from the vasculature into tissues. The fate of neutrophils after they leave the bloodstream under normal conditions (i.e., not in the context of inflammation) is poorly understood.
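Because only the circulating pool is sampled at phlebotomy, a measured neutrophil concentration understates the total intravascular pool. A small illustrative calculation using the ratios given above (the function name and example count are ours):

```python
# Sketch: estimate the total intravascular neutrophil pool from a measured
# (circulating) concentration, given the marginal:circulating ratio noted
# above (~1:1 in most mammals, ~3:1 in cats). Values are illustrative.
def total_neutrophil_pool(circulating_conc: float, marginal_ratio: float) -> float:
    """circulating_conc: cells/µL measured by phlebotomy (circulating pool only).
    marginal_ratio: marginal-to-circulating pool ratio for the species."""
    return circulating_conc * (1.0 + marginal_ratio)

# A cat measuring 5,000 circulating neutrophils/µL holds ~20,000/µL
# in total once the threefold-larger marginal pool is included:
print(total_neutrophil_pool(5000, 3.0))  # 20000.0
```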
They migrate into the gastrointestinal and respiratory tracts, liver, and spleen and may be lost through mucosal surfaces or undergo apoptosis and be phagocytized by macrophages. Lymphopoiesis. Lymphopoiesis, from lympha (Latin, water), refers to the production of new lymphocytes, including B lymphocytes, T lymphocytes, and natural killer (NK) cells. B lymphocytes primarily produce immunoglobulins, also known as antibodies, and are key effectors of humoral immunity. They are distinguished by the presence of an immunoglobulin receptor complex, termed the B lymphocyte receptor. Plasma cells are terminally differentiated B lymphocytes that produce abundant immunoglobulin. T lymphocytes, effectors of cell-mediated immunity, possess T lymphocyte receptors that bind antigens prepared by antigen-presenting cells. A component of innate immunity, NK cells kill a variety of infected and tumor cells in the absence of prior exposure or priming. The main growth factors for B lymphocytes, T lymphocytes, and NK cells are IL-4, IL-2, and IL-15, respectively. Lymphocytes are derived from HSCs within the bone marrow. B lymphocyte development occurs in two phases: first an antigen-independent phase in the bone marrow and ileal Peyer's patches (the site of B lymphocyte development in ruminants), then an antigen-dependent phase in peripheral lymphoid tissues (such as the spleen, lymph nodes, and mucosa-associated lymphoid tissue [MALT]). T lymphocyte progenitors migrate from the bone marrow to the thymus, where they undergo differentiation, selection, and maturation processes before migrating to the peripheral lymphoid tissue as effector cells. Unlike granulocytes, which circulate only in blood vessels and migrate unidirectionally into target tissues, lymphocytes travel in both blood and lymphatic vessels and continually circulate between blood, tissues, and lymphatic vessels.
Also in contrast to nonlymphoid hematopoietic cells, blood lymphocyte concentrations in adult animals depend primarily on extramedullary lymphocyte production and kinetics, not on lymphopoiesis by the marrow. In healthy nonruminant mammals, lymphocytes are the second most numerous blood leukocyte. Additional factors in physiologic hemolysis include the following:

• Decreased erythrocyte deformability.
• Binding of immunoglobulin G (IgG) and/or complement to erythrocyte membranes. Complement binding may be secondary to clustering of the membrane anion exchange protein, band 3.
• Oxidative damage to erythrocytes.

Macrophages degrade erythrocytes into reusable components, such as iron and amino acids, and the waste product bilirubin. Bilirubin is then exported into circulation, where it is transported to the liver by albumin. The liver conjugates and subsequently excretes bilirubin into bile for elimination from the body. Intravascular hemolysis normally occurs at only extremely low levels. Hemoglobin is a tetramer that, when released from the erythrocyte into the blood, splits into dimers that bind to a plasma protein called haptoglobin. The hemoglobin-haptoglobin complex is taken up by hepatocytes and macrophages. This is the major pathway for handling free hemoglobin. However, free hemoglobin may also oxidize to form methemoglobin, which dissociates to form metheme and globin. Metheme binds to a plasma protein called hemopexin, which is taken up by hepatocytes and macrophages in a similar manner to hemoglobin-haptoglobin complexes. Free heme in the reduced form binds to albumin, from which it is taken up by the liver and converted into bilirubin. The concentration of circulating erythrocytes typically decreases postnatally and remains below normal adult levels during the period of rapid body growth. The age at which erythrocyte numbers begin to increase and the age at which adult levels are reached vary among species.
In dogs, adult values are usually reached between 4 and 6 months of age; in horses, this occurs at approximately 1 year of age. Granulopoiesis is the production of neutrophils, eosinophils, and basophils, whereas monocyte production is termed monocytopoiesis. Granulocytic and monocytic cells are sometimes collectively referred to as myeloid cells. However, the term myeloid and the prefix myelo- can be confusing because they have other meanings; they may reference the bone marrow, all nonlymphoid hemic cells (erythrocytes, leukocytes, and megakaryocytes), only granulocytes, or the spinal cord. The main purpose of granulocytes and monocytes is to migrate to sites of tissue inflammation and function in host defense (see Chapters 3 and 5). Briefly, these cells have key immunologic functions, including phagocytosis and microbicidal activity (neutrophils and monocyte-derived macrophages), parasiticidal activity and participation in allergic reactions (eosinophils and basophils), and antigen processing and presentation and cytokine production (macrophages). Neutrophils are the predominant leukocyte type in the blood of most domestic species. The primary stimulators of granulopoiesis and monocytopoiesis are granulocyte-macrophage colony-stimulating factor and IL-1, IL-3, and IL-6 (granulocytes and monocytes), granulocyte colony-stimulating factor (granulocytes), and macrophage colony-stimulating factor (monocytes). In general, these cytokines are produced by various inflammatory cells, with or without contribution from stromal cells. The earliest granulocytic precursor identifiable by routine light microscopy is the myeloblast, which undergoes maturational division over 5 days to produce 16 to 32 progeny cells (see Fig. 13-3). These granulocytic precursors are conceptually divided into the stages that can divide, including myeloblasts, promyelocytes, and myelocytes (the proliferation pool), and those that cannot, including metamyelocytes and band and segmented forms (the maturation pool).
Within the neutrophil maturation pool is a subpool, termed the storage pool. In platelets, expansion of surface area and release of granule contents is aided by a network of membrane invaginations known as the open canalicular system; this system is not present in horses, cattle, and camelids. Information on this topic is available at www.expertconsult.com. Mechanisms of bone marrow disease are summarized in Box 13-1. Hematopoietic cells' response to injury depends on whether the insult is on the marrow or within extramarrow tissues. In general, marrow-directed injury or disturbance results in the production of abnormal hematopoietic cells (dysplasia), fewer hematopoietic cells (hypoplasia), or a failure of hematopoietic cell development (aplasia). Dysplasia, hypoplasia, and aplasia may be specific for one cell line, such as pure red cell aplasia, or affect multiple lineages, as seen with aplastic anemia. Accordingly, decreased blood concentrations of the involved cell types are expected with hypoplasia or aplasia. Erythroid, myeloid, and megakaryocytic hypoplasia or aplasia causes nonregenerative anemia, neutropenia, and thrombocytopenia, respectively. Bicytopenia describes decreased blood concentrations of two cell lines, whereas pancytopenia indicates decreased blood concentrations of all three cell types. Bicytopenia or pancytopenia may indicate generalized marrow disease, such as occurs with aplastic anemia or marrow malignancies (leukemia), necrosis, fibrosis (myelofibrosis), or inflammation (myelitis). Replacement of hematopoietic tissue within the bone marrow by abnormal tissue, including neoplastic cells, fibrosis, or inflammatory cells, is termed myelophthisis.
According to conventional wisdom, cattle normally have higher numbers of lymphocytes than neutrophils in circulation; however, recent studies suggest that this is no longer the case, most likely due to changes in genetics and husbandry. In most species the majority of lymphocytes in blood circulation are T lymphocytes. The concentration of blood lymphocytes decreases with age. Thrombopoiesis. Thrombopoiesis, from thrombos (Greek, clot), refers to the production of platelets, which are small (2 to 4 µm), round to ovoid, anucleate cells within blood vessels. Platelets have a central role in primary hemostasis but also participate in secondary hemostasis (coagulation) and inflammatory pathways (see Chapters 2 and 3). Thrombopoietin (TPO) is the primary regulator of thrombopoiesis. The liver and renal tubular epithelial cells constantly produce TPO, which is then cleared and destroyed by platelets and their precursors. Therefore plasma TPO concentration is inversely proportional to platelet and platelet precursor mass. If the platelet mass is decreased, less TPO is cleared, and there is subsequently more free plasma TPO to stimulate thrombopoiesis. The earliest morphologically identifiable platelet precursor is the megakaryoblast, which undergoes nuclear reduplications without cell division, termed endomitosis, to form a megakaryocyte with 8 to 64 nuclei. As the name suggests, megakaryocytes are very large cells, much larger than any other hematopoietic cell (see Fig. 13-1). Megakaryocytes neighbor venous sinusoids, extend their cytoplasmic processes into vascular lumens, and shed membrane-bound cytoplasmic fragments (platelets) into the blood circulation. Orderly platelet shedding is partially facilitated by β1-tubulin microtubules within megakaryocytes. Platelets circulate in a quiescent form and become activated by binding platelet agonists, including thrombin, adenosine diphosphate (ADP), and thromboxane.
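The TPO mass feedback described above can be caricatured with a toy steady-state model (arbitrary units; the rate constants and function name are invented for illustration, not physiologic values): production is constant, clearance is proportional to platelet and precursor mass, so free plasma TPO varies inversely with that mass.

```python
# Toy model of thrombopoietin (TPO) mass feedback, in arbitrary units:
# constant hepatic/renal production, clearance proportional to the
# platelet + precursor mass, so free plasma TPO settles at
# production / (clearance_per_unit_mass * platelet_mass),
# i.e., inversely proportional to platelet mass as described above.
def steady_state_tpo(production: float, clearance_per_unit_mass: float,
                     platelet_mass: float) -> float:
    return production / (clearance_per_unit_mass * platelet_mass)

normal = steady_state_tpo(production=100.0, clearance_per_unit_mass=0.5,
                          platelet_mass=10.0)
thrombocytopenic = steady_state_tpo(100.0, 0.5, platelet_mass=2.5)
print(normal, thrombocytopenic)  # 20.0 80.0
```

A fourfold drop in platelet mass leaves fourfold more free TPO to drive thrombopoiesis, which is the self-correcting behavior the text describes.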
platelet activation causes shape change, granule release, and relocation of procoagulant phospholipids and glycoproteins (gps) to the outer cell membrane. specific procoagulant actions include release of calcium, von willebrand factor (vwf), factor v, and fibrinogen, as well as provision of phosphatidylserine-rich binding sites for the extrinsic tenase (factors iii, vii, and x), intrinsic tenase (factors ix, viii, and x), and prothrombinase (factors x, v, and ii) coagulation complexes. platelet gp surface receptors include those for binding vwf (gpib-ix-v), collagen (gpvi), and fibrinogen (gpiib-iiia), which facilitate platelet aggregation and adherence to subendothelial collagen. bone marrow is not routinely sampled during postmortem examinations. however, indications for bone marrow evaluation include suspected leukemia, metastatic neoplasia within bone marrow, or infectious myelitis, as well as cytopenia(s) or hematopoietic dysplasia of unknown cause. multimodal evaluation is ideal, including a recent (<24 hours) complete blood count with bone marrow cytologic and histopathologic examination. however, antemortem blood analyses are not always available, and interpretation of hematopoietic cytomorphologic examination results becomes difficult to impossible shortly after death. postmortem bone marrow should be collected as soon as possible after death or euthanasia, preferably within 30 minutes. samples may be collected from the proximal femur, rib, sternum, or vertebrae. when collecting from the femur, the femoral neck is removed with a bone saw, or a fragment of the shaft is removed with bone-cutting shears. cytologic samples are first collected using the paintbrush technique: gently sample the red marrow with a clean, dry, natural-bristle brush, and then carefully brush the material onto a clean glass microscope slide in two to four parallel wavy lines.
the brush should be cleaned and dried before its use on a different animal. the slide is then air dried, stored away from formalin fumes, and stained with a routine (romanowsky) stain. for histologic evaluation, the entire femoral head, a femoral shaft fragment, or a rib fragment with exposed red marrow is immersed in 10% neutral buffered formalin. for cosmetic necropsies, samples may be obtained by antemortem techniques, such as needle biopsies for cytologic examination and core biopsies for histopathologic examination. the complete blood count (cbc) is the cornerstone for diagnosis of hematologic disturbances and is often part of a minimum database in sick patients. the cbc includes numeric data indicating the concentration of different cell types, as well as other estimations of red blood cell mass (hemoglobin concentration, packed-cell volume, and hematocrit), red blood cell volume (mean cell volume), and red blood cell hemoglobin content (mean cell hemoglobin and mean cell hemoglobin concentration). cell morphologic features and the presence or absence of hemic parasites are assessed upon microscopic review of a blood smear and are also included in a cbc report. (note: some parasites may infect blood cells, such as hepatozoon organisms within circulating neutrophils or monocytes or bartonella organisms within erythrocytes, but mainly cause disease in other body systems and are therefore not discussed in this chapter.) learning to evaluate blood smears is a valuable skill for any practicing veterinarian. the cbc may also include the plasma protein concentration, as measured with a refractometer. it is important to remember that changes in hydration status and in the distribution of body fluids between the vascular and extravascular compartments affect the concentration of both cells and proteins in the blood.
other tests that may help with evaluation of the hematopoietic system include cell or tissue biopsies, the direct antiglobulin test, flow cytometry, immunophenotyping, and polymerase chain reaction (pcr). aspiration cytology and/or histopathology of organs other than the bone marrow can be pursued to assess for the presence of emh, increased destruction of erythrocytes, neoplasia, or infection. the coombs test, or direct antiglobulin test, detects excessive antibody or complement bound to the surface of red blood cells and is the standard assay for immune-mediated hemolytic anemia. flow cytometry and immunofluorescent antibody tests may also be used to detect autoantibody bound to erythrocytes or other hematopoietic cells. immunophenotyping and pcr are further discussed in the section on hematopoietic neoplasia. structural or functional abnormalities of blood vessels, platelets, or coagulation factors may result in a tendency toward hypocoagulability (bleeding), hypercoagulability (inappropriate thrombosis), or both. in veterinary medicine there has been a great deal of work on specific mechanisms of hypocoagulability, whereas mechanisms of hypercoagulability are less fully characterized. disorders of primary hemostasis typically result in "small bleeds" (e.g., petechiation, mild ecchymosis, bleeding from mucous membranes, bleeding immediately after venipuncture), whereas disorders of secondary hemostasis typically result in "big bleeds" (e.g., hemorrhage into body cavities/joints, marked ecchymosis, large hematomas, delayed bleeding after venipuncture). this chapter concentrates on primary disorders of hemostasis and also covers disseminated intravascular coagulation, which is a secondary condition. however, it is important to note that coagulation disorders can also result from other underlying disease processes.
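the "small bleeds" versus "big bleeds" distinction above amounts to a lookup from clinical sign to the arm of hemostasis most likely affected. the sign lists come from the text; the function name and the sets themselves are my own illustrative encoding.

```python
# illustrative lookup (names are my own, sign lists paraphrased from the text):
# small bleeds suggest a primary hemostatic defect, big bleeds a secondary one.
SMALL_BLEEDS = {"petechiation", "mild ecchymosis", "mucosal bleeding",
                "immediate bleeding after venipuncture"}
BIG_BLEEDS = {"cavitary hemorrhage", "hemarthrosis", "marked ecchymosis",
              "large hematoma", "delayed bleeding after venipuncture"}

def suspected_hemostatic_defect(sign):
    """map a bleeding pattern to the hemostatic arm it most often implicates."""
    if sign in SMALL_BLEEDS:
        return "primary hemostasis (vessels/platelets/vwf)"
    if sign in BIG_BLEEDS:
        return "secondary hemostasis (coagulation factors)"
    return "indeterminate"
```

as the text notes, this is a typical pattern rather than a rule; mixed defects (e.g., disseminated intravascular coagulation) can produce both patterns.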
for example, advanced liver disease can lead to abnormal hemostasis through decreased or defective synthesis of coagulation factors or impaired clearance of fibrinolytic products that inhibit coagulation reactions and platelet function. vascular disorders may also result in a bleeding tendency because of abnormalities of endothelial function or collagen-platelet interactions. specific diseases involving abnormal structure or function of hematopoietic or hemostatic elements are discussed later in this chapter. the cbc provides basic information about platelets, including numeric values for platelet concentration and mean platelet volume (mpv), subjective assessment of platelet morphologic features (size, shape, and granularity), and a rough estimation of platelet numbers based on examination of a blood smear. some laboratories measure reticulated platelets (platelets recently released from the bone marrow), although this test is mostly used in the research setting at present. increased mpv and increased numbers of reticulated platelets tend to indicate increased thrombopoiesis. bone marrow examination is indicated with any unexplained cytopenia, including thrombocytopenia, to evaluate production. tests to evaluate the components of the hemostatic process are described and listed in e-appendix 13-1. secondary myelofibrosis is the enhanced deposition of collagen within the marrow by nonneoplastic fibroblasts and reticular cells. disease pathogenesis is unclear, but there are two leading theories. first, it may represent scar formation after marrow necrosis. second, high concentrations of growth factors present during times of marrow injury or activation may stimulate fibroblast proliferation. in particular, stimulated megakaryocytes and macrophages produce fibrogenic cytokines, including platelet-derived growth factor, transforming growth factor-β, and epidermal growth factor.
early in disease there is reticulin deposition without reduction of hematopoietic elements. however, fibrous collagen replaces hematopoietic cells with disease progression. histologic identification of reticulin and collagen fibers can be aided with reticulin silver and masson's trichrome stains, respectively. in animals, secondary myelofibrosis occurs most commonly with leukemias, extramarrow malignancies, and chronic hemolytic anemias, but many cases are idiopathic. experimental whole-body gamma irradiation, dietary strontium-90 exposure, and certain drugs and toxins can also induce myelofibrosis. the responses of marrow adipocytes to systemic and localized disease are under current investigation, especially in relation to energy metabolism, inflammation, and bone trauma. during times of severe energy imbalance, such as cachexia, the marrow may undergo serous atrophy of fat, also known as gelatinous marrow transformation (e-fig. 13-1). the pathogenesis of this phenomenon is unknown, but it is characterized by adipocyte atrophy, hematopoietic cell hypoplasia with subsequent cytopenias, and replacement of the marrow with extracellular hyaluronic acid-rich mucopolysaccharides. positive alcian blue staining identifies the extracellular material as mucin. marrow adipocytes secrete adipose-derived hormones, termed adipokines, including leptin and adiponectin. in general, leptin is proinflammatory, prothrombotic, and mitogenic for various cell types, including lymphocytes, hematopoietic progenitors, and leukemic cells. conversely, adiponectin has antiinflammatory and growth inhibitory properties. during times of inflammation and infection, leptin production is increased. in response to marrow trauma, such as orthopedic surgery, fat may enter the vasculature, embolize to various tissues, and cause tissue ischemia.
the severity of tissue injury caused by fat embolism is dependent upon the quantity of fat entering circulation and the tissue's susceptibility to ischemia (see chapter 2). responses of circulating blood cells to injury include decreased survival (destruction, consumption, or loss), altered distribution, and altered structure or function (see box 13-1). these responses are not mutually exclusive; for example, altered erythrocyte structure may lead to decreased survival. often, but not always, these responses result in decreased concentrations of blood cells in circulation. abnormal concentrations of blood cells. the concentration of blood cells may be decreased, termed cytopenia (from kytos [gr., hollow vessel] and penia [gr., poverty]), or increased, designated cytosis (from -osis [gr., condition]). a specific blood cell type is denoted as being decreased by using the suffix -penia (table 13-1). a decreased concentration of erythrocytes is the exception and is termed anemia (from a [gr., without] and haima [gr., blood]). decreased concentrations of blood basophils are not recognized in domestic animals because the lower reference interval is typically zero. an increased blood cell type is denoted with the suffix -osis or -philia (see table 13-1). insults to extramarrow tissues and cells tend to cause increased production of the involved cell types (hyperplasia) with or without dysplasia. loss of erythrocytes from blood vessels (hemorrhage) or premature destruction of erythrocytes (hemolysis) causes erythroid hyperplasia. tissue inflammation may cause neutrophilic, eosinophilic, basophilic, and/or monocytic hyperplasia, depending on the type of inflammation. megakaryocytic hyperplasia may occur with increased platelet use during hemorrhage or disseminated intravascular coagulation (dic) or with immune-mediated platelet destruction.
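the naming convention above (suffix -penia for decreased concentrations, -osis/-philia for increased, with the exceptions noted in the text) can be encoded as a small lookup. the function and its table are my own illustrative encoding of table 13-1, not the table itself.

```python
# illustrative encoding of the suffix convention described in the text:
# decreased cell concentrations take -penia, with two stated exceptions
# (erythrocytes -> "anemia"; basopenia is not recognized because the lower
# reference interval is typically zero). helper name and dict are mine.

def cytopenia_term(cell):
    """return the term for a decreased concentration of the given cell type."""
    exceptions = {"erythrocyte": "anemia",   # a- (without) + haima (blood)
                  "basophil": None}          # basopenia not recognized
    if cell in exceptions:
        return exceptions[cell]
    return {"neutrophil": "neutropenia", "lymphocyte": "lymphopenia",
            "platelet": "thrombocytopenia", "monocyte": "monocytopenia",
            "eosinophil": "eosinopenia"}.get(cell)
```

an analogous mapping with -osis/-philia would cover the increased-concentration terms.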
exceptions to these generalizations, such as anemia of chronic disease, iron deficiency anemia, and anemia of renal failure, are discussed in more detail later. endothelial cell response to injury specifically within the marrow is poorly characterized, but it is likely similar to that of endothelial cells elsewhere, playing active roles in coagulation and inflammation (see chapters 2 and 3). however, one potential sign of marrow sinusoidal injury is the presence of circulating nucleated erythrocytes in the absence of erythrocyte regeneration, termed inappropriate metarubricytosis. it is proposed that injured marrow endothelial cells allow premature passage of metarubricytes into blood circulation during times of stress. however, a conflicting theory proposes that marrow stress causes decreased metarubricyte attachment to central macrophages, and subsequent release into circulation. specific causes of marrow injury-induced metarubricytosis include sepsis, hyperthermia, malignancies, hypoxia, and certain drugs and toxins. inappropriate metarubricytosis may also occur with erythroid dysplasia and splenic disorders. in addition to a suspected role in inappropriate metarubricytosis, marrow macrophages are integral to altered iron metabolism, including anemia of chronic disease and hemosiderosis. anemia of chronic disease is a mild to moderate nonregenerative anemia observed in animals with a variety of inflammatory and metabolic disorders. this anemia is discussed in more detail later, but briefly, it is primarily a result of iron sequestration within macrophages. hemosiderosis is the excessive accumulation of iron in tissues, typically macrophages. accumulation of iron in parenchymal organs, leading to organ toxicity, is termed hemochromatosis. in animals, iron overload due to blood transfusions or chronic hemolytic anemias may cause marrow hemosiderosis and hemochromatosis. myelitis can take different forms. 
granulomatous myelitis occurs with systemic fungal infections (e.g., histoplasmosis) or mycobacteriosis. acute or neutrophilic myelitis may occur with bacterial infections or those with an immune-mediated component. dogs and cats with nonregenerative immune-mediated hemolytic anemia (imha) often have myelitis, in addition to myelofibrosis and necrosis. the inflammation is evident as fibrin deposition, edema, and multifocal neutrophilic infiltrates; immune-mediated cytopenias may also occur concurrently with bone marrow lymphocytic and/or plasma cell hyperplasia. bone marrow necrosis is the necrosis of medullary hematopoietic cells, stromal cells, and stroma in large areas of bone marrow. potential causes include leukemias, extramarrow malignancies, infection (bovine viral diarrhea virus [bvdv], ehrlichia canis, and feline leukemia virus [felv]), sepsis, drugs or toxins (carprofen, chemotherapeutic agents, estrogen, metronidazole, mitotane, and phenobarbital), and irradiation. direct hematopoietic or stromal cytotoxicity and altered marrow microvasculature (disseminated intravascular coagulation) are proposed pathogeneses. extensive marrow necrosis results in decreased hematopoiesis and subsequent blood cytopenias, including anemia, neutropenia, and thrombocytopenia. if the animal survives the initial insult, the marrow may recover and resume normal hematopoiesis, or it may undergo scar formation, termed myelofibrosis. postmortem quantification of blood cell concentrations is not possible due to perimortem coagulation. however, a complete blood count (cbc) with microscopic blood smear evaluation is the foundation for antemortem assessment of blood cells. anemia. anemia causes clinical signs referable to decreased red hemoglobin pigment (e.g., pale mucous membranes), decreased oxygen-carrying capacity (e.g., depression, lethargy, weakness, and decreased exercise tolerance), and decreased blood viscosity (e.g., heart murmur). recumbency, seizures, syncope, or coma may occur with severe anemia.
anemia is confirmed by identifying a decreased hemoglobin concentration or reduced erythrocyte mass, as measured by the packed-cell volume, hematocrit, or red blood cell concentration. the three general causes of anemia are blood loss (hemorrhage), red blood cell destruction or lysis (hemolysis), and decreased red blood cell production (erythroid hypoplasia). classifying anemia as regenerative or nonregenerative is clinically useful because it provides information about the mechanism of disease; regenerative anemia indicates hemorrhage or hemolysis, whereas erythroid hypoplasia or aplasia causes nonregenerative anemia (table 13-2). the hallmark of regenerative anemias, except in horses, is reticulocytosis (i.e., increased numbers of circulating reticulocytes [immature erythrocytes]), which is evident as polychromasia on a routinely stained blood smear (see fig. 13-5). reticulocytosis indicates increased bone marrow erythropoiesis (fig. 13-7) and release of erythrocytes before they are fully mature. reticulocytosis is an appropriate marrow response to anemia and is often seen with hemorrhage or hemolysis. on a cbc a strong regenerative response may produce an increased mean cell volume (mcv) and decreased mean cell hemoglobin concentration (mchc) because reticulocytes are larger and have a lower hemoglobin concentration than mature erythrocytes. horses are an exception to this classification scheme because they do not release reticulocytes into circulation, even with erythroid hyperplasia. horses with a regenerative response may have an increased mcv and red cell distribution width (an index of variation in cell size), but definitive determination of regeneration in a horse requires demonstration of erythroid hyperplasia via bone marrow examination or an increasing red cell mass over sequential cbcs.
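the classification logic above, including the equine exception, can be sketched as a small decision function. the function name, argument names, and return strings are my own illustrative choices; the branching follows the text.

```python
# decision sketch of the regenerative/nonregenerative classification described
# in the text (names and strings are illustrative assumptions, not a clinical tool).

def classify_regeneration(species, reticulocytosis,
                          erythroid_hyperplasia_on_marrow=None,
                          rising_red_cell_mass_on_serial_cbcs=None):
    if species == "horse":
        # horses do not release reticulocytes even with erythroid hyperplasia,
        # so regeneration must be shown by marrow exam or serial cbcs
        if erythroid_hyperplasia_on_marrow or rising_red_cell_mass_on_serial_cbcs:
            return "regenerative (hemorrhage or hemolysis)"
        return "indeterminate without marrow exam or serial cbcs"
    if reticulocytosis:
        return "regenerative (hemorrhage or hemolysis)"
    return "nonregenerative (consider decreased production; may be preregenerative)"
```

note the last branch: as the text explains, an acutely anemic patient may be "preregenerative" for the first 3 to 4 days before reticulocytosis appears on the cbc.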
in addition to reticulocytosis there may be increased numbers of nucleated red blood cells (nrbcs) in circulation with erythrocyte regeneration, termed appropriate metarubricytosis. when nrbcs are present as part of a regenerative response, they should be in low numbers relative to the numbers of reticulocytes. however, the presence of circulating nrbcs is not in itself definitive evidence of regeneration and may signify dyserythropoiesis (e.g., lead poisoning or bone marrow disease) or splenic dysfunction. these processes should be suspected when nrbcs are increased without reticulocytosis, or when their numbers are high relative to the degree of reticulocytosis, termed inappropriate metarubricytosis. in ruminants, reticulocytosis is often accompanied by basophilic stippling (fig. 13-8). however, like metarubricytosis, basophilic stippling without reticulocytosis is concerning for lead poisoning or other causes of dyserythropoiesis.

recall that the stimulus for increased erythropoiesis is increased secretion of epo in response to tissue hypoxia. although the action of epo on erythropoiesis is rapid, evidence of a regenerative response is not immediately apparent in a blood sample. one of the main effects of epo is to expand the pool of early-stage erythroid precursors, and it takes time for these cells to differentiate to the point where they are released into circulation. in a case of acute hemorrhage or hemolysis, for example, it typically takes 3 to 4 days until reticulocytosis is evident on the cbc and several more days until the regenerative response peaks. the term preregenerative anemia is sometimes used to describe anemia with a regenerative response that is impending but not yet apparent on the cbc. confirming a regenerative response in such cases requires either evidence of erythroid hyperplasia in the bone marrow or emergence of a reticulocytosis on subsequent days.

hemorrhage results in escape of erythrocytes and other blood components, such as protein, from the vasculature. as a result, a decreased plasma or serum protein concentration, termed hypoproteinemia, may be evident on a cbc or chemistry panel. if the hemorrhage is into the gastrointestinal lumen, some of the protein may be resorbed and converted to urea, resulting in an increased urea nitrogen concentration relative to creatinine in plasma. hemorrhage within the urinary tract may cause red urine with erythrocytes observed in the urine sediment. causes of hemorrhage include trauma, abnormal hemostasis, certain parasitisms, ulceration, and neoplasia. hemorrhage may be acute or chronic, and internal or external. during acute hemorrhage, there are ample iron stores within the body for hemoglobin synthesis and erythrocyte regeneration. however, with chronic external hemorrhage, continued loss of iron may deplete the body's iron stores. as iron stores diminish, so does erythrocyte regeneration, eventually leading to iron deficiency anemia. iron deficiency anemia is either poorly regenerative or nonregenerative and is discussed in more detail later in the chapter. iron deficiency anemia does not occur with chronic internal hemorrhage, such as into the peritoneal cavity, because iron is not lost from the body and can be reused for erythropoiesis.

in hemolytic anemia, erythrocytes are destroyed at an increased rate. whether the mechanism is intravascular or extravascular, or a combination, depends on the specific disease process (specific diseases are discussed later in this chapter). some clinical indicators of hemolytic anemia and their pathogeneses are summarized in fig. 13-9 and are further described in the following discussion. a classic sequela of hemolytic anemias in general is hyperbilirubinemia, which is an increase in the plasma bilirubin concentration. bilirubin is a yellow pigment, which explains why hyperbilirubinemia, if severe enough, causes icterus, the grossly visible yellowing of fluid or tissues (fig. 13-10). icterus, also known as jaundice, is usually detectable when the plasma bilirubin concentration exceeds 2 mg/dl. however, it is important to note that hyperbilirubinemia and icterus are not pathognomonic for hemolysis and may also occur with conditions of impaired bile flow (cholestasis), such as hepatopathy or cholangiopathy. in addition to icterus, hemolytic anemia often results in splenomegaly, which is secondary to extravascular hemolysis and macrophagic hyperplasia within the spleen, as well as splenic emh. splenomegaly may also occur in other conditions, as discussed elsewhere in this chapter.

intravascular hemolysis is grossly evident as pink-tinged plasma or serum, termed hemolysis or hemoglobinemia. hemolysis is not apparent until the concentration of extracellular hemoglobin is greater than approximately 50 mg/dl. cell-free hemoglobin is scavenged by haptoglobin until haptoglobin becomes saturated with hemoglobin at a concentration of approximately 150 mg/dl. when haptoglobin is saturated, any remaining free hemoglobin has a low enough molecular weight to pass through the renal glomerular filter into the urine. this imparts a pink or red discoloration to the urine, called hemoglobinuria. thus extracellular hemoglobin causes gross discoloration of the plasma, where it is bound to haptoglobin, before becoming grossly visible in urine. the half-life of haptoglobin is markedly decreased when bound to hemoglobin, so when large amounts of haptoglobin-hemoglobin complex are formed, the concentration of haptoglobin in the blood decreases and hemoglobin can pass through the glomerulus at even lower concentrations. hemoglobinuria is a contributing factor in the renal tubular necrosis (hemoglobinuric nephrosis) that often occurs in cases of acute intravascular hemolysis (see chapter 11). a similar lesion occurs in the kidneys of individuals with marked muscle damage and resulting myoglobinuria (see chapters 11 and 15). hemoglobinuria cannot be distinguished grossly from hematuria (erythrocytes in the urine) or myoglobinuria (myoglobin in the urine), and all three processes cause a positive reaction for "blood protein" on urine test strips. comparing the colors of the plasma and the urine may be informative. in contrast to hemoglobin, myoglobin causes gross discoloration of the urine before the plasma is discolored. this is because myoglobin is a low-molecular-weight monomer, freely filtered by the glomerulus, that does not bind plasma proteins to a significant degree. hematuria can be distinguished from hemoglobinuria on the basis of microscopic examination of urine sediment (i.e., erythrocytes are present in cases of hematuria).

in addition to red plasma and urine, hemoglobinemia may also be identified by increased mch or mchc values on a cbc. this is because the hemoglobin concentration is measured by lysing all erythrocytes in the sample and then measuring the total hemoglobin via spectrophotometry. by this method, hemoglobin that originated within or outside of erythrocytes is measured together. however, calculations for mch and mchc, which include results for the hemoglobin and red blood cell concentrations, assume that all of the hemoglobin originated within erythrocytes. in the case of hemoglobinemia, the excess extracellular hemoglobin may cause an artifactual increase in the calculated mch and mchc. it is important to remember that similar artifactual increases may also occur with lipemia.

once hemolytic anemia has been identified, the specific cause for hemolysis should be investigated based on signalment, clinical history, and microscopic blood smear evaluation. the most common causes of hemolytic anemia in domestic animals are immune-mediated, infectious, oxidative, and mechanical fragmentation (i.e., microangiopathic) disorders (table 13-3). spherocytosis and autoagglutination are hallmarks of immune-mediated hemolytic anemia, either primary (also known as idiopathic) or secondary to infectious disease, drugs/toxins, or neoplasms.

spherocytes form when macrophages (mainly in the spleen) phagocytize part of an erythrocyte plasma membrane bound with autoantibody (fig. 13-12). the remaining portion of the erythrocyte assumes a spherical shape, thus preserving maximal volume. this change in shape results in decreased deformability of the cells. erythrocytes must be extremely pliable to traverse the splenic red pulp and sinusoidal walls; spherocytes therefore tend to be retained in the spleen in close association with macrophages, with risk for further injury and eventual destruction. in the dog, spherocytes appear smaller than normal and have uniform staining (fig. 13-13, a), in contrast to normal erythrocytes, which have a region of central pallor imparted by their biconcave shape. this difference in staining between spherocytes and normal erythrocytes is not consistently discernible in many other domestic animals (including horses, cattle, and cats), whose erythrocytes differ from those of the dog in that they are smaller and have less pronounced biconcavity and therefore less pronounced central pallor.

several initiating processes can cause intravascular hemolysis; formation of the complement membrane attack complex is pictured. with intravascular hemolysis, free hemoglobin is released directly into the plasma, where it is scavenged by haptoglobin and hemopexin. when haptoglobin and hemopexin are saturated, the cell-free hemoglobin causes red discoloration of the plasma (hemolysis) and is excreted in the urine (hemoglobinuria; dark red urine). the liver clears haptoglobin-hemoglobin and hemopexin-methemoglobin complexes from plasma and converts hemoglobin to unconjugated bilirubin and then conjugated bilirubin. conjugated bilirubin is normally excreted in the bile and then converted to urobilinogen (yellow) and subsequently stercobilinogen (dark brown). however, excessive bilirubin will spill over into the plasma, resulting in hyperbilirubinemia, icteric plasma (if severe enough), and urinary excretion of bilirubin (bilirubinuria; icteric urine). during extravascular hemolysis, erythrocytes are phagocytized by macrophages, which digest erythrocytes and convert hemoglobin to unconjugated bilirubin. excessive bilirubin in plasma causes hyperbilirubinemia with or without icteric plasma. unconjugated bilirubin is processed and excreted by the liver (as previously described) and, in dogs, the kidney.

autoagglutination occurs because of cross-linking of antibodies bound to erythrocytes. autoagglutination is evident macroscopically as blood with a grainy consistency (see fig. 13-13, b), and microscopically as clusters of erythrocytes (see fig. 13-13, c). autoagglutination may also result in a falsely increased mcv and decreased red blood cell concentration when clustered cells are mistakenly counted as single cells by automated hematology analyzers. when autoagglutination is present, the packed-cell volume is the most reliable measurement of red blood cell mass. ghost cells are ruptured red blood cell membranes devoid of cytoplasmic contents (see fig. 13-13, a). they indicate intravascular hemolysis and may be seen with a variety of hemolytic disorders, including those with immune-mediated, infectious, oxidative, or fragmentation causes. in the case of immune-mediated hemolytic anemia, antibody or complement binds to red blood cell membranes and activates the complement membrane attack complex (see fig. 13-12). this causes pore formation in the red blood cell membrane and release of cytoplasmic contents into the plasma.
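the mch and mchc artifact described above follows directly from how the values are calculated. the standard hematology formulas (mchc = hemoglobin ÷ hematocrit × 100; mch = hemoglobin × 10 ÷ rbc count) are general knowledge, not given in the text, and the numeric values below are illustrative assumptions.

```python
# the analyzer lyses the sample and measures total hemoglobin, so free
# (extracellular) hemoglobin from intravascular hemolysis, or light scatter
# from lipemia, is attributed to erythrocytes. standard formulas (assumed,
# not stated in the text):
#   mchc (g/dl) = hemoglobin (g/dl) / hematocrit (%) * 100

def mchc(hgb_g_dl, hct_pct):
    """mean cell hemoglobin concentration from measured hgb and hematocrit."""
    return hgb_g_dl / hct_pct * 100.0

# illustrative numbers: true intracellular hgb 10 g/dl at hct 30%
true_mchc = mchc(10.0, 30.0)            # ~33.3 g/dl
# hemoglobinemia adds, say, 1.5 g/dl of free hemoglobin to the measurement
artifact_mchc = mchc(10.0 + 1.5, 30.0)  # ~38.3 g/dl
assert artifact_mchc > true_mchc        # falsely increased mchc
```

this is why an mchc above the reference interval is usually interpreted as an artifact (hemolysis, lipemia) rather than as "over-hemoglobinized" cells.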
ghost cells are eventually cleared from circulation by phagocytic macrophages, mainly within the spleen. oxidative damage to erythrocytes occurs when normal antioxidative pathways that generate reducing agents (such as reduced nicotinamide adenine dinucleotide [nadh], reduced nicotinamide adenine dinucleotide phosphate [nadph], and reduced glutathione [gsh]) are compromised or overwhelmed, resulting in hemolytic anemia, abnormal hemoglobin function, or both. hemolysis caused by oxidative damage may be extravascular or intravascular, or a combination. evidence of oxidative damage to erythrocytes may be apparent on blood smear examination as heinz bodies or eccentrocytes or on gross examination as methemoglobinemia. heinz bodies are foci of denatured globin that interact with the erythrocyte membrane. they are usually subtly evident on routine wright-stained blood smears as pale circular inclusions or blunt, rounded protrusions of the cell margin but are readily discernible on smears stained with new methylene blue. cats are particularly susceptible to heinz body formation and may have low numbers of heinz bodies normally. there is no unanimity of opinion, but some clinical pathologists believe that the presence of heinz bodies in up to 10% of all erythrocytes in cats is within normal limits. this predisposition is believed to reflect unique features of the feline erythrocyte, whose hemoglobin has more sulfhydryl groups (preferential sites for oxidative damage) than does that of other species and which may also have lower intrinsic reducing capacity. it is also possible that the feline spleen does not have as efficient a "pitting" function (splenic structure and function are discussed in more detail later in this chapter). eccentrocytes, evident as erythrocytes in which one side of the cell has increased pallor (fig. 13-15, a), are another manifestation of oxidative damage.
they form because of cross-linking of total hemoglobin), methemoglobin imparts a grossly discernible chocolate color to the blood. by itself, mechanical fragmentation hemolysis tends to cause mild or no anemia. mechanical fragmentation results from trauma or shearing of erythrocytes within blood vessels. normal erythrocytes may be flowing through abnormal vasculature, such as with heart valve defects, intravascular fibrin deposition (e.g., disseminated intravascular coagulation), vasculitis, or hemangiosarcoma. alternatively, the red blood cells may be particularly fragile within normal blood vasculature, as occurs with iron deficiency. in either instance, microscopic evidence of mechanical fragmentation includes the presence of erythrocyte fragments (schistocytes [see fig. 13 -15, c]), erythrocytes with irregular cytoplasmic projections (acanthocytes), erythrocytes with blister-like projections (keratocytes), or ghost cells (see figs. 13-13, a, 13-15, d, and 13-15, e). membrane proteins, with adhesion of opposing areas of the cell's inner membrane leaflet, and displacement of most of the hemoglobin toward the other side. the fused membranes may fragment off of the eccentrocyte, leaving a slightly ruffled border; this cellular morphologic abnormality is called a pyknocyte (see fig. 13-15, b) . oxidative insult may also result in conversion of hemoglobin (iron in the fe 2+ state) to methemoglobin (iron in the fe 3+ state), which is incapable of binding oxygen. methemoglobin is produced normally in small amounts but reduced back to oxyhemoglobin by the enzyme cytochrome-b 5 reductase (also known as methemoglobin reductase). methemoglobinemia results when methemoglobin is produced in excessive amounts (because of oxidative insult) or when the normal pathways for maintaining hemoglobin in the fe 2+ state are impaired (as in cytochrome-b 5 reductase deficiency). when present in sufficiently high concentration (approximately 10% of degradation. 
1, antierythrocyte antibodies bind rbc surface antigens, resulting in rbc opsonization by immunoglobulins (mainly immunoglobulin g [igg]) and complement (primarily c3b). immunoglobulin- or c3b-bound rbcs are phagocytized and digested by sinusoidal macrophages. 2, spherocytes. spherocytes form when part of the membrane of an immunoglobulin- or c3b-bound rbc is phagocytized by macrophages without removal of the entire rbc from circulation. compared with normal erythrocytes, spherocytes appear smaller, more eosinophilic, and lack central pallor. 3, rbc aggregation (agglutination). rbc aggregation occurs when antierythrocyte immunoglobulins (immunoglobulin m [igm] or high concentrations of igg) bind multiple erythrocytes simultaneously. 4, ghost cells. antierythrocyte antibodies bind rbc surface antigens, resulting in complement activation and formation of the membrane attack complex (mac). macs form membrane pores, resulting in rupture of rbcs and the release of hemoglobin into the circulation. ghost cells are rbc membrane remnants that lack cytoplasm (hemoglobin).
schistocytes are the only red blood cell morphologic abnormality specific for mechanical fragmentation because all other morphologic abnormalities can be seen with other disease processes; for example, ghost cells may be observed with other types of hemolysis.
nonregenerative anemia is characterized by a lack of reticulocytosis on the cbc; however, reticulocytosis does not occur in horses even in the context of regeneration. most often this is a result of decreased production in the marrow (i.e., erythroid hypoplasia). erythrocytes circulate for a long time, so anemias caused by decreased production tend to develop slowly. the most common form of nonregenerative anemia is known as anemia of inflammation or anemia of chronic disease. in this form of anemia, erythrocytes are decreased in number but are typically normal in size and hemoglobin concentration (so-called normocytic, normochromic anemia). it has long been known that patients with inflammatory or other chronic disease often become anemic, and that this condition results in increased iron stores in the bone marrow. sequestration of iron may be a bacteriostatic evolutionary adaptation because many bacteria require iron as a cofactor for growth. in recent years, investigators have begun to elucidate the molecular mechanisms underlying anemia of inflammation. hepcidin, an acute phase protein and antimicrobial peptide synthesized in the liver, is a key mediator that limits iron availability. hepcidin expression increases with inflammation, infection, or iron overload and decreases with anemia or hypoxia. hepcidin exerts its effects by causing functional iron deficiency. it binds to and causes the degradation of the cell surface iron efflux molecule, ferroportin, thus inhibiting both absorption of dietary iron from the intestinal epithelium and export of iron from macrophages and hepatocytes into the plasma (fig. 13-16). inflammatory mediators, including interleukin-1 (il-1), interleukin-6 (il-6), interferon (inf), and tumor necrosis factor (tnf), cause anemia of inflammatory disease due to oxidative hemolysis, iron sequestration within enterocytes and macrophages, and impaired erythroid responsiveness to erythropoietin (epo). during homeostasis the membrane transport molecule, ferroportin, transports iron from the cytosol to the extracellular space. the iron is then used for various physiologic processes, including hemoglobin production within bone marrow erythroid precursors. during times of inflammation the liver increases production of hepcidin, which binds ferroportin and causes its internalization and lysosomal degradation. with fewer membrane ferroportin molecules, less iron is absorbed from the diet and mobilized from macrophages. rbc, red blood cell. anemia of inflammation involves factors besides decreased iron availability. inflammatory cytokines are likely to inhibit erythropoiesis by causing oxidative damage to and triggering apoptosis of developing erythroid cells, by decreasing expression of epo and stem cell factor, and by decreasing expression of epo receptors. in addition, experimentally induced sterile inflammation in cats resulted in shortened erythrocyte survival, indicating that anemia of inflammation is likely also a function of increased erythrocyte destruction. other causes of decreased erythropoiesis are listed in table 13-2. specific examples of diseases causing nonregenerative anemia by these mechanisms are discussed later in this chapter.
neutropenia. neutropenia refers to a decrease in the concentration of neutrophils in circulating blood. neutropenia may be caused by decreased production, increased destruction, altered distribution, or a demand for neutrophils in tissues that exceeds the rate of granulopoiesis. decreased production is evident on bone marrow examination as granulocytic hypoplasia. this usually results from an insult that affects multiple hematopoietic lineages, such as chemical insult, radiation, neoplasia, infection, or fibrosis, but may also be caused by a process that preferentially targets granulopoiesis. in marked contrast to erythrocytes, neutrophils have a very short life span in circulation. once released from the bone marrow, a neutrophil is in the bloodstream only for hours before migrating into the tissues. when neutrophil production ceases, a reserve of mature neutrophils in the bone marrow storage pool may be adequate to maintain normal numbers of circulating neutrophils for a few days; however, after the bone marrow storage pool is depleted, neutropenia rapidly ensues. immune-mediated neutropenia is a rare but recognized condition in domestic animals. bone marrow findings range from granulocytic hypoplasia to hyperplasia, depending on where the cells under immune attack are in their differentiation programs. neutropenia with no evidence of decreased production and in which other causes of neutropenia have been excluded may be a result of destruction of neutrophils before they leave the bone marrow, a condition known as ineffective granulopoiesis. like other forms of ineffective hematopoiesis, this condition is often presumed to be immune mediated; in cats this condition may occur as a result of infection of hematopoietic cells with felv. as presented in the earlier section on granulopoiesis and monocytopoiesis (myelopoiesis), neutrophils within the blood vasculature are in two compartments: a circulating pool, consisting of those cells flowing freely in the blood, and a marginating pool, consisting of those cells transiently interacting with the endothelial surface. (in reality, neutrophils are constantly shifting between these two pools, but the proportion of cells in either pool normally remains fairly constant in any given species.) circulating neutrophils are part of the blood sample collected during routine venipuncture and are thus counted in the cbc, whereas marginating neutrophils are not. pseudoneutropenia refers to the situation in which there is an increased proportion of neutrophils in the marginating pool. this may occur because of decreased blood flow or in response to stimuli, such as endotoxemia, that increase expression of molecules promoting interaction between neutrophils and endothelial cells. this mechanism of neutropenia is rarely observed in clinical practice. neutropenia may also result from increased demand for neutrophils in the tissue. how rapidly such a situation develops depends not only on the magnitude of the inflammatory stimulus but also on the reserve of postmitotic neutrophils in the bone marrow. the size of this reserve, or storage pool, is species dependent. in dogs this pool contains the equivalent of 5 days' normal production of neutrophils. cattle represent the other extreme in that they have a small storage pool and thus are predisposed to becoming neutropenic during times of acute inflammation. horses and cats are somewhere between the two extremes, closer to cattle and dogs, respectively. it stands to reason that the clinical significance of neutropenia because of a supply and demand imbalance is also species dependent. in dogs, neutropenia as a result of inflammation is an alarming finding because it is evidence of a massive tissue demand for neutrophils that has exhausted the patient's storage pool and is exceeding the rate of granulopoiesis in the bone marrow. however, in cows, neutropenia is commonly noted in a wide range of conditions involving acute inflammation and does not necessarily indicate an overwhelming demand.
eosinopenia/basopenia. eosinopenia and basopenia are decreased concentrations of blood eosinophils and basophils, respectively. in many laboratories, cbc reference values for eosinophils and basophils are as low as zero cells per microliter, precluding detection of eosinopenia or basopenia. when detectable, eosinopenia is often a result of stress (i.e., glucocorticoid mediated). monocytopenia. monocytopenia denotes a decreased concentration of monocytes in blood; it is of little to no pathologic significance by itself.
thrombocytopenia. thrombocytopenia refers to a decrease in the concentration of circulating platelets. mechanisms of thrombocytopenia include decreased production, increased destruction, increased consumption, and altered distribution. decreased production may occur because of a condition affecting cells of multiple hematopoietic lineages, including megakaryocytes, or because of one specifically depressing thrombopoiesis. in either case, decreased thrombopoiesis is evident as megakaryocytic hypoplasia upon bone marrow examination. general causes of decreased hematopoiesis outlined earlier in the sections on anemia and neutropenia also apply to thrombocytopenia. increased platelet destruction due to immune-mediated thrombocytopenia (imtp) is a fairly common disease in dogs and may also occur in other species. thrombocytopenia with immune-mediated thrombocytopenia is often severe (e.g., <10,000 platelets/µl), resulting in spontaneous multisystemic hemorrhage. increased use of platelets occurs with hemorrhage and disseminated intravascular coagulation. thrombocytopenia secondary to hemorrhage is often mild to moderate, whereas disseminated intravascular coagulation may cause mild to severe thrombocytopenia, often with evidence of mechanical fragmentation hemolysis (e.g., schistocytes). disseminated intravascular coagulation is a syndrome in which hypercoagulability leads to increased consumption of both platelets and coagulation factors in the plasma, with subsequent hypocoagulability and susceptibility to bleeding. risk factors for developing disseminated intravascular coagulation include severe inflammation, such as sepsis or pancreatitis, neoplasia, and organ failure. the spleen normally contains a significant proportion of total platelet mass (up to one-third in some species), and abnormalities involving the spleen may result in changes in the number of circulating platelets. for example, splenic congestion may result in platelet sequestration and thrombocytopenia, and splenic contraction may cause thrombocytosis.
lymphopenia. lymphopenia refers to a decreased concentration of lymphocytes in blood. it is a common hematologic finding in sick animals. usually the precise mechanism of lymphopenia is not clear but is often presumed secondary to endogenous glucocorticoid excess that occurs with stress. excess glucocorticoids, either endogenous or exogenous, cause an altered distribution of lymphocytes; there is increased trafficking of lymphocytes from blood to lymphoid tissue and decreased egress of lymphocytes from lymphoid tissue to blood. at higher concentrations of glucocorticoids, lymphocytes are destroyed. other causes of lymphotoxicity include chemotherapeutic agents, radiation therapy, and some infectious agents. lymphopenia may also occur by other mechanisms, including loss of lymphocyte-rich lymphatic fluid (e.g., gastrointestinal disease, repeated drainage of chylous effusions) and disruption of the normal lymphoid tissue architecture because of generalized lymphadenopathy (e.g., lymphoma, blastomycosis). some hereditary immunodeficiencies, such as severe combined immunodeficiency or thymic aplasia, can cause lymphopenia due to lymphoid aplasia.
erythrocytosis. an increase in the measured red cell mass above the normal range is known as erythrocytosis. the term polycythemia is often used interchangeably with erythrocytosis, but technically and for the purposes of this chapter, polycythemia refers to a specific type of leukemia called primary erythrocytosis or polycythemia vera. causes of erythrocytosis are either relative or absolute. relative erythrocytosis results from a fluid deficit or an altered distribution of erythrocytes within the body (i.e., the body's total erythrocyte mass is not increased). it occurs most frequently with dehydration, when the decreased proportion of water in the blood results in hemoconcentration. it is observed less frequently with epinephrine-mediated splenic contraction, wherein erythrocytes move from the spleen into peripheral circulation. erythrocytosis from splenic contraction occurs to the most pronounced degree in horses and cats, especially in young, healthy animals. absolute erythrocytosis is a true increase in red blood cell mass due to erythroid neoplasia or hyperplasia and includes causes of primary and secondary erythrocytosis. primary erythrocytosis, or polycythemia vera, is a neoplastic proliferation of erythroid cells with a predominance of mature erythrocytes. diagnosis is based on a marked increase in red cell mass (hematocrit in normally hydrated dogs ranges from 65% to >80%), an absence of hypoxemia, an absence of other tumors, and a normal or decreased plasma epo concentration. secondary erythrocytosis refers to epo-mediated erythroid hyperplasia causing an increased red blood cell mass. the erythroid hyperplasia may be an appropriate response to chronic hypoxia, such as occurs with right-to-left cardiac shunts or chronic pulmonary disease. rarely, an epo-secreting tumor may cause inappropriately elevated levels of epo in the absence of hypoxia. absolute erythrocytosis, whether primary or secondary, causes increased viscosity of the blood, resulting in impaired blood flow and microvasculature distention. affected individuals are at increased risk for tissue hypoxia, thrombosis, and hemorrhage. clinical signs of hyperviscosity syndrome may include erythematous mucous membranes (fig. 13-17), prolonged capillary refill time, prominent scleral vessels, evidence of thrombosis or hemorrhage, and secondary signs related to specific organ systems affected (e.g., neurologic and cardiovascular signs).
neutrophilia. neutrophilia, an increased blood concentration of neutrophils, occurs in response to a number of different stimuli, which are not mutually exclusive. major mechanisms of neutrophilia are shown in fig. 13-18. understanding the cbc findings characteristic of these responses is an important part of clinical veterinary medicine. inflammation can result in neutropenia, as discussed earlier, or neutrophilia, as discussed next. however, before moving on to a discussion of inflammatory neutrophilia and the so-called left shift, it is important to mention two other common causes of neutrophilia: glucocorticoid excess and epinephrine excess. less common causes of neutrophilia, such as leukocyte adhesion deficiency and neoplasia, are discussed later in the chapter. glucocorticoid excess, either because of endogenous production or exogenous administration, results in a cbc pattern known as the stress leukogram, characterized by mature neutrophilia (i.e., increased concentration of segmented neutrophils without immature neutrophils) and lymphopenia, with or without monocytosis and eosinopenia. mechanisms contributing to glucocorticoid-mediated neutrophilia include the following:
• increased release of mature neutrophils from the bone marrow storage pool
• decreased margination of neutrophils within the vasculature, with a resulting increase in the circulating pool
• decreased migration of neutrophils from the bloodstream into tissues
the magnitude of neutrophilia tends to be species dependent, with dogs having the most pronounced response (up to 35,000 cells/µl) and, in decreasing order of responsiveness, cats (30,000 cells/µl), horses (20,000 cells/µl), and cattle (15,000 cells/µl) having less marked responses. with long-term glucocorticoid excess, neutrophil numbers tend to normalize, whereas lymphopenia persists. epinephrine release results in a different pattern, known as physiologic leukocytosis or excitement leukocytosis, characterized by mature neutrophilia (like the glucocorticoid response) and lymphocytosis (unlike the glucocorticoid response). this phenomenon is short lived (i.e., <1 hour). neutrophilia occurs primarily because of a shift of cells from the marginating to the circulating pool. physiologic leukocytosis is common in cats (especially when they are highly stressed during blood collection) and horses, less common in cattle, and uncommon in dogs. of course, neutrophilia may also indicate inflammation, and inflammatory stimuli of varying magnitude and duration produce different patterns of neutrophilia. a classic hematologic finding in patients with increased demand for neutrophils is the presence of immature forms in the blood, known as a left shift. not all inflammatory responses have a left shift, but the presence of a left shift almost always signifies active demand for neutrophils in the tissue. the magnitude of a left shift is assessed by the number of immature cells and their degree of immaturity. the mildest form is characterized by increased numbers of band neutrophils, the immediate predecessor to the segmented neutrophil normally found in circulation. progressively immature predecessors are seen with increasingly severe inflammation. a left shift is considered orderly if the number of immature neutrophils in circulation decreases as they become progressively immature. the term degenerative left shift is sometimes used to describe cases in which the number of immature forms exceeds the number of segmented neutrophils. as with glucocorticoid-mediated neutrophilia, the typical magnitude of neutrophilia caused by inflammation varies by species, with dogs having the most pronounced response. it might be useful to think of neutrophil kinetics in terms of a producer-consumer model in which the bone marrow is the factory, and the tissues (where the neutrophils eventually go) are the customers. the bone marrow storage pool is the factory inventory, and the neutrophils in the bloodstream are in delivery to the customer. within the blood vessels, circulating neutrophils are on the highway, and marginating neutrophils are temporarily pulled off to the side of the road. during health, there is an even flow of neutrophils from the factory to the customer. thus the system is in steady state, and neutrophil numbers remain relatively constant and within the normal range. however, disease states may perturb this system at multiple levels. decreased granulopoiesis is analogous to a factory working below normal production level. ineffective granulopoiesis is analogous to goods that are produced at a normal to increased rate but are damaged during manufacturing and never leave the factory. a left shift is analogous to the factory meeting increased customer demand by shipping out unfinished goods. cases of persistent, established inflammation are characterized by bone marrow granulocytic hyperplasia and mature neutrophilia, analogous to a factory that has had time to adjust to increased demand and is meeting it more efficiently by increasing its output. as summarized in fig. 13-18: 1, neutrophils and their precursors are distributed in five pools: a bone marrow precursor pool, which includes mitotically active and inactive immature cells; a bone marrow storage pool, consisting of mitotically inactive mature neutrophils; a peripheral blood marginating pool; a peripheral blood circulating pool; and a tissue pool. the relative size of each pool is represented by the size of its corresponding wedge. the peripheral blood neutrophil count measures only neutrophils within the circulating peripheral blood pool, which can be enlarged by (2) increased demargination, (3) diminished extravasation into tissue, (4) increased release of cells from the marrow storage pool, and (5).
eosinophilia/basophilia. eosinophilia and basophilia are increased concentrations of blood eosinophils and basophils, respectively. they may occur with parasitism, hypersensitivity reactions, paraneoplastic responses (e.g., lymphoma, mast cell neoplasia, or leukemia), and nonparasitic infectious disease. eosinophilia has also been documented with hypoadrenocorticism and rare idiopathic conditions (e.g., hypereosinophilic syndrome). most cases of eosinophilia and basophilia are due to eosinophilic and basophilic hyperplasia within the bone marrow in response to inflammatory growth factors. however, cortisol deficiency is thought to cause eosinophilia in dogs with hypoadrenocorticism.
monocytosis. monocytosis is an increased concentration of monocytes in blood. it most commonly occurs with excessive glucocorticoids or inflammation and uncommonly to rarely with monocytic leukemia, immune-mediated neutropenia, and cyclic hematopoiesis. with excessive endogenous or exogenous glucocorticoids, monocytes shift from the marginating pool to the circulating pool. this stress monocytosis is most common in dogs, less frequent in cats, and rare in horses and cattle. inflammatory diseases cause monocytosis by cytokine-mediated monocytic hyperplasia in the bone marrow.
thrombocytosis. thrombocytosis, or an increased concentration of platelets in the blood, is a relatively common, nonspecific finding in veterinary patients. in the vast majority of cases, thrombocytosis is reactive, a response to another, often apparently unrelated, disease process. examples of conditions having reactive thrombocytosis include inflammatory and infectious diseases, iron deficiency, hemorrhage, endocrinopathies, and neoplasia. factors that may contribute to reactive thrombocytosis include increased plasma concentrations of thrombopoietin, inflammatory cytokines (e.g., il-6), or catecholamines. thrombocytosis may also occur as part of a regenerative response in patients recovering from thrombocytopenia, as a result of redistribution after splenic contraction, or within the several weeks after splenectomy. in these cases, thrombocytosis is transient. in the case of splenectomy, thrombocytosis may be marked but normalizes after several weeks. because the body's total platelet mass regulates thrombopoiesis, and a significant portion of the platelet mass is normally in the spleen, it makes sense that splenectomized animals develop thrombocytosis. however, the reason that the number of circulating platelets normalizes in these individuals in the weeks after splenectomy is not clear. there is also a rare form of megakaryocytic leukemia known as essential thrombocythemia, which is characterized by marked thrombocytosis.
the preceding section focused on abnormalities in the number of blood cells. there are also various acquired and congenital conditions involving abnormal structure or function of blood cells. this section briefly discusses abnormal blood cell structure or function occurring secondary to other underlying disease. primary disorders of blood cells are discussed later in the chapter in the section on specific diseases. morphologic abnormalities detected on routine microscopic examination of blood smears may provide important clues about underlying disease processes. poikilocytosis is a broad term referring to the presence of abnormally shaped erythrocytes in circulation. etable 13-1 lists conditions associated with, and mechanisms involved in, the formation of a number of specific types of erythrocyte morphologic abnormalities, and fig. 13-15 shows some examples. the acquired neutrophil morphologic abnormality known as toxic change (fig. 13-20) reflects accelerated production of neutrophils as part of the inflammatory response. features of toxic change include increased cytoplasmic basophilia, the presence of small blue-gray cytoplasmic inclusions known as döhle bodies (often noted incidentally in cats), and, in more severe cases, cytoplasmic vacuolation. although not causing impaired neutrophil function, toxic change occurs during granulopoiesis and thus is technically a form of dysplasia (e.g., döhle bodies are foci of aggregated endoplasmic reticulum). toxic change may accompany any inflammatory response, but in general the more marked the toxic change, the higher the index of suspicion for infection or endotoxemia. other secondary changes to neutrophils may not be evident morphologically. for example, studies in human beings and dogs have shown that individuals with cancer have abnormal neutrophil function (including phagocytic activity, killing capacity, and oxidative burst activity) before initiation of therapy. the clinical significance of this finding is not clear. platelet function disorders, also known as thrombopathies or thrombopathias, may be primary or secondary. many conditions are known or suspected to cause secondary platelet dysfunction (hypofunction or hyperfunction) by altering platelet adhesion or aggregation or by mechanisms that are not fully understood. box 13-2 shows underlying conditions having secondary platelet dysfunction.
lymphocytosis. lymphocytosis refers to an increase in the concentration of lymphocytes in blood circulation. there are several causes of lymphocytosis, including age, excessive epinephrine, chronic inflammation, hypoadrenocorticism, and lymphoid neoplasia; lymphoid neoplasms are presented later in the chapter. young animals normally have higher concentrations of lymphocytes than older animals, and normal healthy young animals may have counts that exceed adult reference values. because this is not pathologic lymphocytosis but normal physiologic variation, it is often termed pseudolymphocytosis of young animals. as discussed earlier in the section on neutrophilia, lymphocytosis is also a feature of epinephrine-mediated physiologic leukocytosis, resulting from redistribution of lymphocytes from the blood marginating pool into the blood circulating pool. epinephrine-mediated lymphocytosis may be more marked than neutrophilia, particularly in cats (lymphocyte counts of >20,000/µl are not uncommon).
antigenic stimulation may result in lymphocytosis, which may be marked in rare cases (up to approximately 30,000/µl in dogs and 40,000/µl in cats); however, this is not usually the case, even when there is clear evidence of increased immunologic activity in lymphoid tissues. in cases of antigenic stimulation, it is common for a minority of lymphocytes to have "reactive" morphologic features: larger lymphocytes with more abundant, deeply basophilic cytoplasm and more open chromatin (fig. 13-19). just as glucocorticoid excess can cause lymphopenia, glucocorticoid deficiency (hypoadrenocorticism) can cause lymphocytosis, or lack of lymphopenia during conditions of stress that typically result in glucocorticoid-mediated lymphopenia. a condition known as persistent lymphocytosis (pl) occurs in approximately 30% of cattle infected with the bovine leukemia virus (blv). the condition is defined as an increase in the blood concentration of lymphocytes above the reference interval for at least 3 months. this form of lymphocytosis is a nonneoplastic proliferation (i.e., hyperplasia) of b lymphocytes. in the absence of other disease, cattle with persistent lymphocytosis are asymptomatic. however, cattle infected with blv, especially those animals with persistent lymphocytosis, are at increased risk for developing b lymphocyte lymphoma. aplastic anemia (aplastic pancytopenia). aplastic anemia, or more accurately aplastic pancytopenia, is a rare condition characterized by aplasia or severe hypoplasia of all hematopoietic lineages in the bone marrow with resulting cytopenias. the term aplastic anemia is a misnomer because affected cells are not limited to the erythroid lineage. many of the conditions reported to cause aplastic anemia do so only rarely or idiosyncratically; more frequently, they cause other hematologic or nonhematologic abnormalities.
a partial list of reported causes of aplastic anemia in domestic animals includes the following: most of these causes, especially the chemical agents, are directly cytotoxic to hscs or progenitor cells, resulting in their destruction. however, another proposed mechanism is disruption of normal stem cell function because of mutation or perturbation of hematopoietic cells and/or their microenvironment. this pathogenesis is mostly recognized in retroviral infections. aplastic anemia occurs in both acute and chronic forms. most of the chemical causes result in acute disease. grossly, affected animals may show signs of multisystemic infection and hemorrhage due to severe neutropenia and thrombocytopenia, respectively. severe neutropenia typically develops within 1 week of an acute insult to the bone marrow, and severe thrombocytopenia occurs in the second week. this sequence is a result of the circulating life spans of each cell type; in health, neutrophils have a blood half-life of 5 to 10 hours, whereas platelets circulate for 5 to 10 days. the development of signs of anemia, such as pale mucous membranes, is more variable. the presence and severity of anemia depends on how rapidly the marrow recovers from the insult and the erythrocyte life span of the particular species. microscopically, bone marrow is hypocellular with markedly reduced hematopoietic cells. hematopoietic cells are replaced with adipose tissue, and there is a variable inflammatory infiltrate of lymphocytes, plasma cells, and macrophages. in addition, there may be necrosis, hematopoietic cell apoptosis, and an increase in phagocytic macrophages. fig. 13-21 shows bone marrow aspirates from a dog with pancytopenia from acute 5-fluorouracil toxicosis, before and during recovery. many inherited or presumably inherited disorders of blood cells have been recognized in domestic animals, including rare or sporadic cases and conditions that are of questionable clinical relevance.
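the timing described above (severe neutropenia within about a week of an acute marrow insult, severe thrombocytopenia in the second week) follows directly from the circulating life spans quoted in the text. a minimal numeric sketch, assuming a 5-day canine storage pool, a simple finite-life-span model for platelets, and hypothetical function names:

```python
# Back-of-envelope sketch of why neutropenia precedes thrombocytopenia after
# an acute marrow insult, using only figures quoted in the text: a marrow
# neutrophil storage pool lasting ~5 days in dogs, a neutrophil blood
# half-life of 5-10 hours, and a platelet circulating life span of 5-10 days.
# The modeling choices (linear platelet aging, default parameters) are
# illustrative assumptions, not part of the source.

def days_until_severe_neutropenia(storage_pool_days: float = 5.0) -> float:
    # Circulating neutrophils turn over within hours, so once production
    # stops, blood counts are sustained only as long as the marrow storage
    # pool lasts; neutropenia ensues rapidly after its depletion.
    return storage_pool_days

def platelet_fraction_remaining(day: float, life_span_days: float = 7.0) -> float:
    # With production halted, platelets released before the insult age out
    # roughly linearly over their circulating life span (finite-life model),
    # so counts bottom out in the second week.
    return max(0.0, 1.0 - day / life_span_days)

print(days_until_severe_neutropenia())    # neutropenia: within the first week
print(platelet_fraction_remaining(5.0))   # platelets: substantial fraction left at day 5
print(platelet_fraction_remaining(10.0))  # platelets: depleted in the second week
```

the point of the sketch is only the ordering of events; real kinetics also depend on tissue demand, residual production, and species-specific life spans.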
this section and the later sections covering species-specific disorders are invading cells or microorganism gain access to the bone marrow or blood circulation either hematogenously or by trauma. trauma may be as obvious as a gaping wound or as subtle as the bite of an insect. portals of entry for the bone marrow are summarized in box 13-3. diseases that arise from the bone marrow, such as leukemia, typically spread to other tissues hematogenously. the bone marrow is encased by a protective shell of cortical bone, and blood supply to the marrow provides access to systemic humoral and cellular defenses. of course, leukocytes themselves function as an essential part of inflammation and immune function, as discussed briefly in the section on granulopoiesis and monocytopoiesis (myelopoiesis) and in greater detail in chapters 3 and 5. biochemical steps in the glycolytic pathway or linked to it generate antioxidant molecules that enable erythrocytes to withstand oxidative insults throughout their many days in circulation. in addition to producing energy in the form of adenosine triphosphate (atp), glycolysis generates nadh, which helps convert the oxidized, nonfunctional form of hemoglobin, known as methemoglobin, back to its active, reduced state. another antioxidant erythrocyte metabolic pathway, the pentose shunt or hexose uremia antiplatelet antibodies (also cause immune-mediated thrombocytopenia) infection (bvdv, felv) hyperglobulinemia increased fibrinolytic products hypoammonemia snake envenomation platelet inhibitors nsaids-irreversible (aspirin) or reversible inhibition of cyclooxygenase colloidal plasma expanders (e.g., hydroxyethyl starch) other drugs and exogenous agents (many) hematogenously direct penetration (trauma) including hypercellular glomeruli, thickened glomerular and tubular basement membranes, and tubular epithelial lipidosis, degeneration, and necrosis. 
other porphyrias have been diagnosed in cattle, pigs, and cats but are not known to cause hemolytic anemia.

pyruvate kinase deficiency. pyruvate kinase (pk) deficiency is an inherited autosomal recessive condition due to a defective r-type pk isoenzyme that is normally present in high concentrations in mature erythrocytes. to compensate for this deficiency, there is persistence of the m2-type pk isoenzyme, which is less stable than the r-type isoenzyme. the disease is reported in many dog breeds and fewer cat breeds (e.g., abyssinian, somali, and domestic shorthair). erythrocyte pk deficiency results in decreased atp production and shortened erythrocyte life spans. in dogs the hemolytic anemia is typically chronic, moderate to severe, extravascular, and strongly regenerative. with chronicity, hemolytic anemia causes enhanced intestinal absorption of iron and subsequent hemosiderosis, especially of the liver and bone marrow. dogs typically die at 1 to 5 years of age of hemochromatosis-induced liver and bone marrow failure. however, cats with pk deficiency typically show no clinical signs, have milder anemia, and do not develop organ failure. grossly, affected animals have lesions attributed to hemolytic anemia, including splenomegaly, pale mucous membranes, and rarely icterus. dogs with end-stage disease have cirrhosis, myelofibrosis, and osteosclerosis. dogs with pk deficiency do not necessarily have the same genetic defect, so mutation-specific dna-based assays are required. in contrast, a single dna-based test is available to detect the common mutation affecting abyssinian, somali, and domestic shorthair cats.

cytochrome-b5 reductase deficiency. deficiency of cytochrome-b5 reductase (cb5r, also known as methemoglobin reductase), the enzyme that catalyzes the reduction of methemoglobin (fe3+) to hemoglobin (fe2+), has been recognized in many dog breeds and in domestic shorthair cats. it is probably an autosomal recessive trait.
affected animals may have cyanotic mucous membranes or exercise intolerance but usually lack anemia and clinical signs of disease. life expectancies are normal.

glucose-6-phosphate dehydrogenase deficiency. deficiency of glucose-6-phosphate dehydrogenase (g6pd), the rate-controlling enzyme of the pentose phosphate pathway (ppp), has been reported in an american saddlebred colt, its dam, and one male dog. the ppp is an antioxidative pathway that generates nadph, which maintains glutathione in its reduced form (gsh). therefore in animals with g6pd deficiency, oxidants are not scavenged, and erythrocyte oxidative injury occurs. the colt with g6pd deficiency had severe oxidative hemolytic anemia with eccentrocytes on blood smear evaluation. however, the colt's dam had only eccentrocytes and showed no other hematologic signs of disease.

leukocyte adhesion deficiency. leukocyte adhesion deficiency (lad) is a fatal autosomal recessive defect of leukocyte integrins, in particular the β2 chain (also known as cluster of differentiation [cd] 18 [cd18]). disease has been recognized in holstein cattle (known as bovine leukocyte adhesion deficiency [blad]) and irish setter dogs (known as canine leukocyte adhesion deficiency [clad]) (see chapter 3). without normal expression of this adhesion molecule, leukocytes have severely impaired abilities to migrate from the blood into tissues. as a result, animals with leukocyte adhesion deficiency have marked neutrophilia with nonsuppurative multisystemic infections. blood neutrophils often have nuclei with greater than five nuclear segments, termed hypersegmented neutrophils.

erythropoietic porphyrias. porphyrias are a group of hereditary disorders in which porphyrins accumulate in the body because of defective heme synthesis.
inherited enzyme defects in hemoglobin synthesis have been identified in holstein cattle, siamese cats, and other cattle and cat breeds, resulting in bovine congenital erythropoietic porphyria and feline erythropoietic porphyria, respectively. accumulation of toxic porphyrins in erythrocytes causes hemolytic anemia, whereas accumulation of porphyrins in tissues and fluids produces discoloration, including red-brown teeth, bones, and urine (see fig. 1-59). because of the circulation of the photodynamic porphyrins in blood, these animals have lesions of photosensitization of the nonpigmented skin. all affected tissues, including erythrocytes, exhibit fluorescence with ultraviolet light. histologically, animals may exhibit perivascular dermatitis, as well as multisystemic porphyrin deposition, hemosiderosis, emh, and marrow erythroid hyperplasia. cats may show evidence of renal disease, including hypercellular glomeruli, thickened glomerular and tubular basement membranes, and tubular epithelial lipidosis, degeneration, and necrosis.

glanzmann thrombasthenia. glanzmann thrombasthenia (gt) is an inherited platelet function defect caused by a mutated αiib subunit of the integrin αiibβ3 (also known as glycoprotein iib-iiia [gpiib-iiia]). the disorder has been recognized in great pyrenees and otterhound dogs and several horse breeds, including a quarter horse, a standardbred, a thoroughbred-cross, a peruvian paso mare, and an oldenburg filly. the αiibβ3 molecule has multiple functions but is best known as a fibrinogen receptor that is essential for normal platelet aggregation. bleeding tendencies vary widely among affected individuals but mainly occur on mucosal surfaces. the condition is characterized by an in vitro lack of response to all platelet agonists and severely impaired clot retraction (i.e., whole blood samples without anticoagulant often fail to clot).
molecular testing is available to detect diseased or carrier states in dogs and horses.

thrombopathia. calcium diacylglycerol guanine nucleotide exchange factor i (caldag-gefi) is a molecule within the signaling pathway that results in platelet activation in response to platelet agonists. mutated caldag-gefi has been documented in basset hound, eskimo spitz, and landseer dogs, and simmental cattle. animals with any of the reported mutations have a bleeding tendency. in vitro platelet aggregation responses to platelet agonists, such as adp, collagen, and thrombin, are absent or impaired. information on this topic is available at www.expertconsult.com.

oxidative agents. a variety of oxidative toxins cause hemolytic anemia and/or methemoglobinemia in domestic species. more common or well-characterized oxidants are listed here:
• horses-acer rubrum (red maple)
• ruminants-brassica spp. (cabbage, kale, and rape), copper

in leukocyte adhesion deficiency, hypersegmented neutrophils, which result from neutrophil aging within blood vessels, are common (fig. 13-22). these animals are highly susceptible to infections and usually die at a young age.

pelger-huët anomaly. pelger-huët anomaly (pha) is a condition of hyposegmented granulocytes due to a lamin b receptor mutation. it has been described in dogs, cats, horses, and rabbits, especially in certain breeds. in australian shepherd dogs the mode of inheritance is autosomal dominant with incomplete penetrance. most cases of pelger-huët anomaly are the heterozygous form and of no clinical significance. however, skeletal abnormalities, stillbirths, and/or early mortality may accompany pelger-huët anomaly in rabbits and cats, especially homozygotes. in pelger-huët anomaly the nuclei of neutrophils, eosinophils, and basophils fail to segment, resulting in band-shaped, bean-shaped, or round nuclei.
although the nuclear shape is similar to that of an inflammatory left shift, healthy animals with pelger-huët anomaly do not have clinical signs or other laboratory findings indicating inflammation. for example, neutrophils in healthy animals with pelger-huët anomaly have mature (clumped) chromatin and do not show signs of toxicity (fig. 13-23). an acquired, reversible condition mimicking pelger-huët anomaly, known as pseudo-pelger-huët anomaly, is occasionally noted in animals with infectious disease, neoplasia, or drug administration.

chédiak-higashi syndrome. chédiak-higashi syndrome (chs) is a rare autosomal recessive defect in the lysosomal trafficking regulator (lyst) protein. the syndrome has been identified in hereford, brangus, and japanese black cattle, persian cats, and several nondomestic species. the defective lyst protein results in granule fusion in multiple cell types, including granulocytes, platelets, and melanocytes, as well as abnormal cell function. individuals with chédiak-higashi syndrome have severely impaired cellular innate immunity because of neutropenia, impaired leukocyte chemotaxis, and impaired killing by granulocytes and cytotoxic lymphocytes. platelets lack the dense granules that normally contain key bioactive molecules involved in hemostasis, including platelet agonists, such as adp and serotonin. in vitro platelet aggregation is severely impaired. as a result, animals with chédiak-higashi syndrome exhibit oculocutaneous albinism (due to altered distribution of melanin granules) and are prone to infection and bleeding. blood smear evaluation reveals granulocytes with large cytoplasmic granules.

746.e1 chapter 13 bone marrow, blood cells, and the lymphoid/lymphatic system

von willebrand disease (vwd) is the most common canine hereditary bleeding disorder and has also been described in many other domestic species. the disease actually refers to a group of inherited conditions characterized by a quantitative or qualitative deficiency of vwf. this factor is a multimeric glycoprotein that is stored in platelet α-granules and endothelial cells and circulates as a complex with coagulation factor viii.
its primary functions are to stabilize factor viii and mediate platelet binding to other platelets and to subendothelial collagen. although not technically a platelet disorder, von willebrand disease is often classified as such because it results in a loss of normal platelet function. different types of von willebrand disease vary in terms of mode of inheritance and severity of clinical disease. type i von willebrand disease is characterized by low plasma vwf concentration but normal multimeric proportions and a mild to moderate clinical bleeding tendency; it has been reported in many dog breeds. type ii von willebrand disease is characterized by low vwf concentration, absence of large multimers, and a moderate to severe bleeding tendency; it has been reported in german short-haired pointer and german wirehaired pointer dogs. type iii von willebrand disease is characterized by absence of vwf and a severe bleeding tendency; familial and sporadic cases have been reported in numerous dog breeds. the buccal mucosal bleeding time is prolonged with von willebrand disease, often with adequate concentrations of platelets and normal prothrombin time and partial thromboplastin time (ptt). however, ptt may be mildly prolonged because vwf stabilizes factor viii, and deficiency of vwf results in enhanced factor viii degradation. grossly, affected animals exhibit bleeding tendencies, especially in the form of gingival bleeding, epistaxis, and hematuria, or bleeding at sites of injections, venipuncture, or surgery.

inherited coagulation factor deficiencies have been documented in most domestic species, including deficiencies of prekallikrein and factors i, ii, vii, viii, ix, x, xi, and xii. of these disorders, hereditary coagulation factor viii (hemophilia a) and factor ix (hemophilia b) deficiencies are most common. hemophilia a has been recognized in horses, cattle, dogs, and cats, and hemophilia b occurs in dogs and cats. both disorders have an x-linked recessive mode of inheritance, meaning that clinical disease is more common in males.
affected males have variable tendencies to bleed, depending on the severity of the deficiency, exposure to trauma, and the size and activity level of the affected individual. carrier females are usually asymptomatic. laboratory tests often reveal adequate platelets, normal prothrombin times, and prolonged partial thromboplastin times.

hereditary defects in γ-glutamyl carboxylase, the enzyme required for normal carboxylation of vitamin k-dependent coagulation factors, have been recognized in a flock of rambouillet sheep and two devon rex cats from the same litter. the genetic defect is not known in cats, but in sheep it is an autosomal recessive trait that results in a premature stop codon and truncated γ-glutamyl carboxylase. in sheep there is increased lamb mortality with excessive bleeding during parturition, especially through the umbilicus or into subcutaneous tissues.

the most common route of iron loss is through the gastrointestinal tract (e.g., neoplasia in older animals or hookworm infection in puppies). chronic blood loss may also be caused by marked ectoparasitism (e.g., pediculosis in cattle or massive flea burden in kittens and puppies), neoplasia in locations other than the gastrointestinal tract (e.g., cutaneous hemangiosarcoma), coagulation disorders, and repeated phlebotomy of blood donor animals. rapidly growing nursing animals may be iron deficient when compared with adults because milk is an iron-poor diet. in most cases this has little clinical significance (and in fact is normal). an important exception is piglets: without access to iron, they may develop anemia, failure to thrive, and increased mortality. neonatal piglets are routinely given parenteral iron for this reason. copper deficiency can cause iron deficiency in ruminants and may occur because of copper-deficient forage or impaired use of copper caused by high dietary molybdenum or sulfate.
it is believed that copper deficiency impairs production of ceruloplasmin, a copper-containing enzyme involved in gastrointestinal iron absorption.

iron deficiency causes anemia through impaired hemoglobin synthesis: iron is an essential component of hemoglobin, and when it is lacking, hemoglobin synthesis is depressed. because erythrocyte maturation depends on reaching a critical hemoglobin concentration, maturing erythroid precursors undergo additional cell divisions during iron-deficient states. these additional cell divisions result in small erythrocytes, termed microcytes (see fig. 13-15, g). however, erythrocytes with low hemoglobin concentrations are produced when microcyte formation can no longer compensate for iron deficiency. the classic hematologic picture with iron deficiency anemia is a microcytic (i.e., decreased mcv), hypochromic (i.e., decreased mchc) anemia. microcytes and hypochromasia (see fig. 13-15, g) may also be discernible on blood smear examination as erythrocytes that are abnormally small and paler-staining, respectively. early iron deficiency anemia is poorly regenerative, whereas continued hemorrhage and iron loss cause nonregenerative anemia. additional hematologic changes may include evidence of erythrocyte mechanical fragmentation (e.g., schistocytes) and reactive thrombocytosis.

hypophosphatemic hemolytic anemia. marked hypophosphatemia is recognized as a cause of intravascular hemolytic anemia in postparturient dairy cows and diabetic animals receiving insulin therapy. in postparturient cows, hypophosphatemia results from increased loss of phosphorus in the milk. insulin therapy may cause hypophosphatemia by shifting phosphorus from the extracellular space to the intracellular space. in either case, marked hypophosphatemia (e.g., 1 mg/dl in cows or ≤1.5 mg/dl in cats) is thought to decrease erythrocyte production of atp, leading to inadequate energy for maintenance of membrane and cytoskeletal integrity.
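the erythrocyte indices mentioned above (mcv and mchc) are derived arithmetically from routine cbc measurements: mcv (fl) = hematocrit (%) × 10 / rbc count (10^6/µl), and mchc (g/dl) = hemoglobin (g/dl) × 100 / hematocrit (%). a minimal sketch; the patient values and the canine-style reference limits below are illustrative assumptions, not taken from the text:

```python
def mcv_fl(hct_pct, rbc_millions_per_ul):
    # mean cell volume (fL) = hematocrit (%) x 10 / RBC count (10^6/uL)
    return hct_pct * 10.0 / rbc_millions_per_ul

def mchc_g_dl(hgb_g_dl, hct_pct):
    # mean cell hemoglobin concentration (g/dL) = hemoglobin x 100 / hematocrit
    return hgb_g_dl * 100.0 / hct_pct

def classify(mcv, mchc, mcv_low=60.0, mchc_low=32.0):
    # reference limits are hypothetical canine-style values for illustration
    size = "microcytic" if mcv < mcv_low else "normocytic"
    color = "hypochromic" if mchc < mchc_low else "normochromic"
    return f"{size}, {color}"

# hypothetical iron-deficient dog: HCT 20%, RBC 4.0 x 10^6/uL, Hgb 5.5 g/dL
mcv = mcv_fl(20, 4.0)        # 50.0 fL
mchc = mchc_g_dl(5.5, 20)    # 27.5 g/dL
print(classify(mcv, mchc))   # microcytic, hypochromic
```

with these assumed inputs both indices fall below the illustrative lower limits, reproducing the classic microcytic, hypochromic pattern of iron deficiency described in the text.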
an accompanying decrease in reducing capacity and increase in methemoglobin concentration have also been noted in experimental studies of hypophosphatemic hemolytic anemia in dairy cattle, suggesting that oxidative mechanisms may also contribute to the anemia. affected animals are anemic and hemoglobinuric. gross postmortem findings include pallor, decreased viscosity of the blood, and lesions arising from the underlying metabolic derangement (e.g., a discolored pale yellow and swollen liver due to hepatic lipidosis). renal tubular necrosis and hemoglobin pigment within the tubules are evident microscopically.

this section covers infectious agents within the same genus that are recognized to cause disease in multiple species. other infectious agents with more limited host specificity (e.g., cytauxzoonosis in cats, feline and equine retroviruses) are covered in later sections on species-specific diseases. throughout both sections, diseases are organized by taxonomy (protozoal, bacterial and rickettsial, and viral).

• dogs-acetaminophen, propofol, zinc
• cats-acetaminophen, propofol, propylene glycol
• all species-allium spp. (chives, garlic, and onions)

in horses, red maple leaves and bark are toxic, especially wilted or dried leaves. the toxic principle is believed to be gallic acid. plants that contain high concentrations of nitrates, such as cabbage, kale, and rape, may cause oxidative injury to erythrocytes; cattle are more susceptible than sheep and goats. however, sheep are more prone to copper toxicosis relative to other ruminants. the condition occurs in animals that have chronically accumulated large amounts of copper in the liver through the diet. the copper is then acutely released during conditions of stress, such as shipping or starvation. continuous rate infusions of the anesthetic propofol may cause oxidative hemolytic anemia in dogs and cats, but single doses, even repeated ones, are not expected to cause clinical hemolysis.
zinc toxicosis has been identified in a wide range of animals; however, it is most common in dogs because of their indiscriminate eating habits. common sources include pennies, batteries, paints, creams, automotive parts, screws, nuts, and the coating on galvanized metals. propylene glycol is an odorless, slightly sweet solvent and moistening agent in many foods, drugs, and tobacco products. although it is "generally recognized as safe" for animal foods other than cat food by the food and drug administration, it has been banned from cat food since 1996. grossly and microscopically, animals show varying signs of oxidative hemolysis and/or methemoglobinemia, as previously presented in the section discussing anemias (see bone marrow and blood cells, dysfunction/responses to injury, blood cells, abnormal concentrations of blood cells, anemia). in sheep with copper toxicosis, hemoglobinuric nephrosis, frequently described as gunmetal-colored kidneys with port wine-colored urine, is a classic postmortem lesion.

snake envenomation. hemolytic anemia from snake envenomation has been reported in horses, dogs, and cats. it is most commonly reported with viper and pit viper envenomations, including those from rattlesnakes. hemolysins within viper venom directly injure erythrocytes, causing intravascular hemolysis. other mechanisms of hemolysis include the action of phospholipase a2 on erythrocyte membranes and erythrocyte mechanical fragmentation due to intravascular coagulation and vasculitis. nonhemolytic lesions depend on the venom's additional components and may include hemorrhage, paralysis, and/or tissue edema, inflammation, and necrosis. on blood smear evaluation, animals with snake envenomation may have ghost cells, spherocytes, and/or echinocytes (see figs. 13-13 and 13-15). information on this topic is available at www.expertconsult.com.
severe malnutrition is probably a cause of nonregenerative anemia in all species, attributable to combined deficiencies of molecular building blocks, energy, and essential cofactors. by far the most commonly recognized specific deficiency that results in anemia is iron deficiency. other specific nutritional deficiencies causing anemia in animals are uncommon or rare. acquired cobalamin (vitamin b12) and folate deficiencies are recognized as causes of anemia in human beings but are rare in animals.

iron deficiency anemia. iron deficiency is usually not a primary nutritional deficiency but rather occurs secondary to depletion of iron stores via chronic blood loss.

antagonism of vitamin k leads to production of a nonfunctional form of some coagulation factors and a resulting coagulopathy; a similar condition results from vitamin k deficiency. conditions with avitaminosis k include poisoning with coumarin-related molecules, fat malabsorption (vitamin k is a fat-soluble vitamin) caused by primary intestinal disease or impaired biliary outflow (uncommon), dietary deficiency (rare), and antibiotics that interfere with vitamin k absorption or use. a number of coagulation factors-factors ii, vii, ix, and x (collectively known as the vitamin k-dependent factors), as well as the regulatory molecules protein c and protein s-must undergo carboxylation to be functional. this posttranslational modification is catalyzed by the enzyme γ-glutamyl carboxylase and requires vitamin k as a cofactor. vitamin k is oxidized during the carboxylation reaction and is converted back into its active reduced form by the enzyme vitamin k epoxide reductase. coumarin-related rodenticides, such as warfarin, act by inhibiting vitamin k epoxide reductase, resulting in an absence of vitamin k in its active reduced form (efig. 13-2).
this inhibition lasts until the rodenticide is metabolized and cleared. how long this takes depends on the half-life and dose of the rodenticide, and it may take many weeks. second-generation rodenticides, such as bromadiolone and brodifacoum, are more potent than warfarin and have longer half-lives. spoiled sweet clover contains dicumarol, which causes coagulopathy by the same mechanism.

efigure 13-2 mechanism of anticoagulant rodenticide toxicity. anticoagulant rodenticides inactivate vitamin k epoxide reductase, the enzyme that converts vitamin k epoxide (inactive) back to the active reduced form of vitamin k; without reduced vitamin k, factors ii, vii, ix, and x cannot be carboxylated (activated).

laboratory findings include prolonged coagulation times (prothrombin time [pt], ptt, and activated clotting time [act]). early in the course of rodenticide and related toxicoses, pt may be the only one of these tests that is prolonged because factor vii has the shortest half-life of the vitamin k-dependent factors. however, the other tests become prolonged as nonfunctional forms of the other factors accumulate. in uncomplicated cases, patients are not thrombocytopenic. a wide range of hemorrhagic lesions may occur in affected individuals, including ecchymoses, epistaxis, gingival bleeding, hematomas, hemoptysis, melena or hematochezia, hematuria, and other forms of hemorrhage. there are also lesions of regenerative anemia, such as pale mucous membranes and splenomegaly. histologically, there is hemorrhage, emh, and marrow erythroid hyperplasia. the treatment of rodenticide and related toxicoses is regular administration of exogenous vitamin k1 until the toxin is cleared (determined by repeat coagulation testing after withholding treatment).
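why pt prolongs first can be sketched with first-order decay: once carboxylation stops, each circulating functional factor declines according to its own half-life, and factor vii disappears fastest. the half-life values below are rough approximations assumed for illustration, not taken from the text:

```python
def remaining_fraction(half_life_h, t_h):
    # functional factor activity after carboxylation stops: first-order decay
    return 0.5 ** (t_h / half_life_h)

# assumed approximate plasma half-lives (hours) of the vitamin K-dependent factors
half_lives = {"VII": 6, "IX": 14, "X": 16, "II": 41}

at_24h = {f: remaining_fraction(hl, 24) for f, hl in half_lives.items()}
# factor VII falls to ~6% of baseline by 24 h while factor II is still ~67%,
# which is why PT (sensitive to factor VII) prolongs before the other tests
lowest = min(at_24h, key=at_24h.get)
print(lowest, {f: round(v, 2) for f, v in at_24h.items()})
```

under these assumed half-lives, factor vii is the first to drop below hemostatic levels, consistent with the text's observation that pt is often the only prolonged test early in the toxicosis.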
babesiosis may cause intravascular and extravascular hemolytic anemia via direct red blood cell injury, the innocent bystander effect, and secondary immune-mediated hemolytic anemia. infection with highly virulent strains may cause severe multisystemic disease. in these cases, massive immunostimulation and cytokine release cause circulatory disturbances, which may result in shock, induction of the systemic inflammatory response, and multiple organ dysfunction syndrome. babesia organisms can usually be detected on a routine blood smear in animals with acute disease. infected erythrocytes may be more prevalent in capillary blood, so blood smears made from samples taken from the pinna of the ear or the nail bed may increase the likelihood of detecting organisms microscopically. buffy coat smears also have an enriched population of infected erythrocytes. pcr-based tests are the most sensitive assays for detecting infection in animals with very low levels of parasitemia.

at necropsy, gross lesions are mainly related to hemolysis and include pale mucous membranes, icterus, splenomegaly, dark red or black kidneys, and reddish-brown urine. the cut surface of the congested spleen oozes blood. the gallbladder is usually distended with thick bile. less common lesions include pulmonary edema, ascites, and congestion, petechiae, and ecchymoses of organs, including the heart and brain. parasitized erythrocytes are best visualized on impression smears of the kidney, brain, and skeletal muscle. microscopic findings in the liver and kidney are typical of a hemolytic crisis and include anemia-induced degeneration, necrosis of periacinar hepatocytes and cholestasis, and hemoglobinuric nephrosis with degeneration of tubular epithelium. erythroid hyperplasia is present in the bone marrow. in animals that survive the acute disease, there is hemosiderin accumulation in the liver, kidney, spleen, and bone marrow.
in chronic cases there is hyperplasia of macrophages in the red pulp of the spleen.

theileriosis (piroplasmosis). theileria spp. are tick-borne protozoal organisms that infect many domestic and wild animals worldwide. numerous theileria spp. have been documented, but only the more economically or regionally important species are mentioned here. diseases with the greatest economic impact in ruminants are east coast fever (theileria parva infection) and tropical theileriosis (theileria annulata infection).
• horses-theileria equi (formerly babesia equi)
• cattle-theileria annulata, theileria buffeli, t. parva
• sheep and goats-theileria lestoquardi (formerly theileria hirci)

like babesiosis, theileriosis is generally restricted to tropical and subtropical regions, including parts of africa, asia, the middle east, and europe. except for t. buffeli, all previously listed species are exotic to the united states. infection is characterized by schizonts within lymphocytes or monocytes and pleomorphic intraerythrocytic piroplasms (merozoites and trophozoites). within host leukocytes the parasite induces leukocyte cell division, which expands the parasitized cell population. infected cells disseminate throughout the lymphoid system via the lymphatic and blood vessels. infected leukocytes may block capillaries, causing tissue ischemia. later in infection some schizonts cause leukocyte lysis and release of merozoites. merozoites then invade and parasitize erythrocytes, causing hemolytic anemia. possible mechanisms of anemia in theileriosis include invasion of erythroid precursors by merozoite stages and associated erythroid hypoplasia (as occurs with t. parva infection), immune-mediated hemolysis, mechanical fragmentation because of vasculitis or microthrombi, enzymatic destruction by proteases, and oxidative damage.
gross and microscopic lesions are similar to those of babesiosis, except that cattle with east coast fever tend not to develop hemolytic anemia.

babesiosis (piroplasmosis). babesia spp. and theileria spp., presented in the next section, are members of the order piroplasmida and are generally referred to as piroplasms. these organisms are morphologically similar but have different life cycles; babesia spp. are primarily erythrocytic parasites, whereas theileria spp. sequentially parasitize leukocytes and then erythrocytes. both are protozoan parasites spread by ticks, but other modes of transmission are possible (e.g., biting flies, transplacental, and blood transfusions). evidence is accumulating that dog fighting also transmits babesia gibsoni infection. babesia organisms are typically classified as large (2 to 4 µm) or small (<2 µm) with routine light microscopy (fig. 13-24). over 100 babesia species have been identified, some of which are listed here, along with their relative microscopic size in parentheses: geographic distributions vary with the species, but most have higher prevalences in tropical and subtropical regions. for example, equine and bovine babesiosis are endemic in parts of africa, the middle east, asia, central and south america, the caribbean, and europe. both were eradicated from the united states and are now considered exotic diseases in that country. of the previously mentioned species, only agents of canine babesiosis are thought to be endemic in the united states.

anaplasmosis, ehrlichiosis, heartwater, and tick-borne fever. anaplasmosis, ehrlichiosis, heartwater, and tick-borne fever are tick-borne diseases caused by small, pleomorphic, gram-negative, obligate intracellular bacteria within the order rickettsiaceae, also colloquially known as rickettsias. as a group, rickettsias primarily infect hematopoietic cells and endothelial cells.
rickettsias that predominantly infect endothelial cells (e.g., rickettsia rickettsii [rocky mountain spotted fever]) or cause gastrointestinal disease (e.g., neorickettsia helminthoeca [salmon poisoning disease] and neorickettsia risticii [potomac horse fever]) are discussed elsewhere (see chapters 4 and 7). less commonly, transmission may occur via blood transfusions or blood-contaminated medical supplies. rickettsias that infect erythrocytes include the following species (the disease name follows in parentheses):
• cattle-anaplasma marginale, anaplasma centrale (bovine anaplasmosis)
• sheep and goats-anaplasma ovis (ovine and caprine anaplasmosis, respectively)

anaplasma marginale and a. ovis have worldwide distributions, but a. centrale is mostly restricted to south america, africa, and the middle east.

in acute east coast fever, lymph nodes are enlarged, edematous, and hemorrhagic, but in chronic cases they may be shrunken. there is often splenomegaly, hepatomegaly, and hemorrhagic enteritis with white foci of lymphoid infiltrates (pseudoinfarcts) in the liver and kidney. microscopically, infected leukocytes may block capillaries.

african trypanosomiasis. trypanosomes are flagellated protozoa that can infect all domesticated animals. the most important species that cause disease are trypanosoma congolense, trypanosoma vivax, and trypanosoma brucei ssp. brucei. disease is most common in parts of africa where the biologic vector, the tsetse fly, exists. however, t. vivax has spread to central and south america and the caribbean, where other biting flies transmit the parasite mechanically. in africa, cattle are mainly affected because of the feeding preferences of the tsetse fly. african trypanosomiasis must be distinguished from nonpathogenic trypanosomiasis, such as trypanosoma theileri infection in cattle. animals become infected when feeding tsetse flies inoculate metacyclic trypanosomes into the skin of animals.
the trypanosomes grow for a few days, causing a localized chancre, and then sequentially enter the lymph nodes and bloodstream. trypanosomal organisms do not infect erythrocytes but rather exist as free trypomastigotes (i.e., flagellated protozoa with a characteristic undulating membrane) in the blood (fig. 13-25, a) or as amastigotes in tissue. the mechanism of anemia is believed to be immune mediated. cattle with acute trypanosomiasis have significant anemia, which initially is regenerative but becomes less so with time. the extent of parasitemia is readily apparent with t. vivax and t. theileri infections because the organisms are present in large numbers in the blood. this is in contrast to t. congolense, which localizes within the vasculature of the brain and skeletal muscle. chronically infected animals often die secondary to poor body condition, immunosuppression, and concurrent infections.

gross examination of animals with acute disease often reveals generalized lymphadenomegaly, splenomegaly, and petechiae on serosal membranes. an acute hemorrhagic syndrome may occur in cattle, resulting in lesions of severe anemia (e.g., pale mucous membranes) and widespread mucosal and visceral hemorrhages. main lesions of chronic infections include signs of anemia, lymphadenopathy (e.g., enlarged or atrophied lymph nodes), emaciation, subcutaneous edema, pulmonary edema, increased fluid in body cavities, and serous atrophy of fat.

trypanosoma cruzi is the flagellated protozoal agent of american trypanosomiasis. infections have been reported in more than 100 mammal species in south america, central america, and the southern united states, but dogs and cats are among the more common domestic hosts. infected triatomine insects, or "kissing bugs," defecate as they feed on their mammalian host, releasing infective t. cruzi organisms. the parasite then enters the body through mucous membranes or breaks in the skin. like the other trypanosomes described previously, t.
cruzi lives in the blood as extracellular trypomastigotes (see fig. 13-25, b) and in the tissues as intracellular amastigotes. trypanosoma cruzi primarily causes heart disease. lesions of acute disease include a pale myocardium, subendocardial and subepicardial hemorrhages, and yellowish-white spots and streaks. there may also be secondary lesions, such as pulmonary edema, ascites, and congestion of the liver, spleen, and kidneys. in chronic disease the heart may be enlarged and flaccid with thin walls. microscopically, there is often myocarditis and amastigotes within cardiomyocytes. most of these rickettsias have worldwide distributions. however, e. ewingii has been reported only in the united states, and e. ruminantium is endemic only in parts of africa and the caribbean. although a. phagocytophilum has a wide geographic distribution, strain variants are regionally restricted. for example, a. phagocytophilum causes disease in ruminants in europe, but it has not been documented in ruminants in the united states. reservoirs of disease vary, depending upon the rickettsial species. cattle are the reservoir host for e. ruminantium, canids are the reservoir host for a. platys and e. canis, and the other rickettsias have wildlife reservoirs. pathogenesis of disease involves endothelial cell, platelet, and leukocyte dysfunction. those agents that infect endothelial cells cause vasculitis and increased vascular permeability of small blood vessels. if only plasma is lost, then there is hypotension and tissue edema. however, more severe vasculitis causes microvascular hemorrhage, with the potential for platelet consumption, thrombocytopenia, disseminated intravascular coagulation, and hypotension. infection of platelets may cause thrombocytopenia by direct platelet lysis, immune-mediated mechanisms, or platelet sequestration within the spleen.
pathogenesis of leukocyte dysfunction is unclear but may involve sepsis, inhibited leukocyte function, endothelial cell activation, and platelet consumption. chronic e. canis infection may cause aplastic anemia with pancytopenia by an unknown mechanism. some studies indicate that german shepherd dogs with ehrlichiosis are predisposed to have particularly severe clinical disease. some breeds of cattle (bos taurus), sheep (merino), and goats (angora and saanen) are more susceptible to heartwater. upon blood smear evaluation, thrombocytopenia is the most common hematologic abnormality; anemia and neutropenia occur less frequently. in early stages of infection, blood cells may contain morulae, which are clusters of rickettsial organisms within cytoplasmic, membrane-bound vacuoles (fig. 13-27). examination of buffy coat smears increases the probability of detecting the organism. chronic infection may cause lymphocytosis, particularly of granular lymphocytes. anaplasma platys causes recurrent marked thrombocytopenia. in general, more common gross lesions are splenomegaly, lymphadenomegaly, and pulmonary edema and hemorrhage. more severe cases may also exhibit multisystemic petechiae and ecchymoses. bovine anaplasmosis causes anemia mainly by immune-mediated extravascular hemolysis. the severity of disease in infected animals varies with age. infected calves under 1 year of age rarely develop clinical disease, whereas cattle 3 years of age or older are more likely to develop severe, potentially fatal, illness. the reason for this discrepancy is not clear. indian cattle (bos indicus) are more resistant to disease than european cattle (bos taurus). surviving cattle become chronic carriers (and thus reservoirs for infection of other animals) and develop cyclic bacteremia, which is typically not detectable on blood smears. splenectomy of carrier animals results in marked bacteremia and acute hemolysis.
pcr testing is the most sensitive means of identifying animals with low levels of bacteremia. grossly, acute disease causes lesions of acute hemolytic anemia, including pale mucous membranes, low blood viscosity, icterus, splenomegaly, hepatomegaly, and a distended gallbladder. in animals with acute disease it is usually easy to detect a. marginale organisms on routine blood smear evaluation (fig. 13-26) or impression smears from cut sections of the spleen. however, in recovering animals, the organisms may be difficult to find. rickettsias that infect leukocytes are broadly divided into those that preferentially infect granulocytes and those that preferentially infect monocytes. in addition to anemia, common findings in animals with leptospirosis-induced hemolysis include hemoglobinuria and icterus. on necropsy, renal tubular necrosis, which occurs in part because of hemoglobinuria (hemoglobinuric nephrosis), may also be present. hemotropic mycoplasmosis (hemoplasmosis). the term hemotropic mycoplasmas, or hemoplasmas, encompasses a group of bacteria, formerly known as haemobartonella or eperythrozoon spp., that infect erythrocytes of many domestic, laboratory, and wild animals. hemotropic mycoplasmas affecting common domestic species are as follows:
• cattle-mycoplasma wenyonii
• camelids-"candidatus mycoplasma haemolamae"
• sheep and goats-mycoplasma ovis
• pigs-mycoplasma suis (efig. 13-4)
• dogs-mycoplasma haemocanis, "candidatus mycoplasma haematoparvum"
• cats-mycoplasma haemofelis, "candidatus mycoplasma haemominutum," "candidatus mycoplasma turicensis"
like other mycoplasmas, hemoplasmas are small (0.3 to 3 µm in diameter) and lack a cell wall. they are epicellular parasites, residing in indentations and invaginations of red blood cell surfaces. the mode of transmission is poorly understood, but blood-sucking arthropods are believed to play a role; transmission in utero, through biting or fighting, and transfusion of infected blood products are also suspected.
effects of infection vary from subclinical to fatal anemia, depending on the specific organism, dose, and host susceptibility. most hemoplasmas are more likely to cause acute illness in individuals that are immunocompromised or have concurrent disease. however, m. haemofelis is an exception and tends to cause acute hemolytic anemia in immunocompetent cats. anemia occurs mainly because of extravascular hemolysis, but intravascular hemolysis also occurs. although the pathogenic mechanisms are not completely understood, an immune-mediated component is highly probable, as well as direct red blood cell injury by the bacteria and the innocent bystander effect. hemotropic mycoplasmas induce cold agglutinins in infected individuals, although it is not clear whether these particular antibodies are important in the development of hemolytic anemia. when detected on routine blood smear evaluation, the organisms are variably shaped (cocci, small rods, or ring forms) and sometimes arranged in short, branching chains (fig. 13-28). the organisms may also be noted extracellularly, in the background of the blood smear, especially if the smear is made after prolonged storage of the blood in an anticoagulant tube. in animals dying of acute hemoplasma infection, the gross findings are typical of extravascular hemolysis, with pallor, icterus, splenomegaly, and distended gallbladder (fig. 13-29). additional lesions documented in cattle include scrotal and hind limb edema and swelling of the teats. microscopic lesions in the red pulp of the spleen include congestion, erythrophagocytosis, macrophage hyperplasia, emh, and increased numbers of plasma cells. bone marrow has varying degrees of erythroid hyperplasia, depending on the duration of hemolysis. immune-mediated hemolytic anemia. immune-mediated hemolytic anemia is a condition characterized by increased destruction of erythrocytes because of binding of immunoglobulin to red blood cell surface antigens.
it is a common, life-threatening condition in dogs but also has been described in horses, cattle, and cats. immune-mediated hemolytic anemia may be idiopathic or secondary to a known initiator. other gross lesions of rickettsial disease include edema, cavitary effusions, and effusive polyarthropathy. hydropericardium gives heartwater its name but is more consistently found in small ruminants than in cattle. chronically infected dogs are emaciated. the bone marrow is hyperplastic and red in the acute disease but becomes hypoplastic and pale in dogs with chronic e. canis infection. equine anaplasmosis is often mild but may cause edema and hemorrhages. disease in cats is rare and poorly documented. histologic findings include generalized perivascular plasma cell infiltration, which is most pronounced in animals with chronic disease. multifocal, nonsuppurative meningoencephalitis, interstitial pneumonia, and glomerulonephritis are present in most dogs with the disease. rickettsial organisms are difficult to detect histologically; examination of wright-giemsa-stained impression smears of lung, liver, lymph nodes, and spleen is a more effective method for detecting the morulae within leukocytes. heartwater is often diagnosed by observing morulae in endothelial cells of giemsa-stained squash preparations of brain. rickettsial diseases are often diagnosed on the basis of serologic testing, but pcr testing is more sensitive. clostridial diseases. certain clostridium spp. may cause potentially fatal hemolytic anemias in animals; nonhemolytic lesions are presented elsewhere (see chapters 4, 7, 8, and 19). clostridium haemolyticum and clostridium novyi type d cause the disease in cattle known as bacillary hemoglobinuria. (the phrase "red water" has also been used for this disease and for hemolytic anemias in cattle caused by babesia spp.) similar naturally occurring disease has been reported in sheep. in cattle the disease is caused by liver fluke (fasciola hepatica) migration in susceptible animals.
ingested clostridial spores may live in kupffer cells for a long time without causing disease. however, when migrating flukes cause hepatic necrosis, the resulting anaerobic environment stimulates the clostridial organisms to proliferate and elaborate their hemolytic toxins, causing additional hepatic necrosis. the mechanism of hemolysis involves a bacterial β-toxin (phospholipase c or lecithinase), which enzymatically degrades cell membranes, causing acute intravascular hemolysis. bacillary hemoglobinuria has also been reported after liver biopsy in calves. clostridium perfringens type a causes intravascular hemolytic anemia in lambs and calves-a condition known as yellow lamb disease, yellows, or enterotoxemic jaundice because of the characteristic icterus. the organism is a normal inhabitant of the gastrointestinal tract in these animals but may proliferate abnormally in response to some diets. c. perfringens also causes intravascular hemolytic anemia in horses with clostridial abscesses and in ewes with clostridial mastitis. c. perfringens type a produces hemolytic α-toxin, which also has phospholipase c activity. leptospirosis. leptospirosis is recognized as a cause of hemolytic anemia in calves, lambs, and pigs. specific leptospiral organisms that cause hemolytic disease include leptospira interrogans serovars pomona and icterohaemorrhagiae. leptospira organisms are ubiquitous in the environment. infection occurs percutaneously and via mucosal surfaces and is followed by leptospiremia; organisms then localize preferentially in certain tissues (e.g., kidney, liver, and pregnant uterus). proposed mechanisms of hemolytic disease include immune-mediated (immunoglobulin m [igm] cold agglutinin) extravascular hemolysis and enzymatic (phospholipase produced by the organism) intravascular hemolysis. leptospirosis can also cause many disease manifestations besides hemolysis (e.g., renal failure, liver failure, abortion, and other conditions) that are not discussed here.
immune-mediated hemolytic anemia may be idiopathic (also called primary immune-mediated hemolytic anemia or autoimmune hemolytic anemia) or secondary to a known initiator, termed secondary immune-mediated hemolytic anemia. although the cause of idiopathic immune-mediated hemolytic anemia is unknown, certain dog breeds (e.g., cocker spaniels) are predisposed to developing disease, suggesting the possibility of a genetic component. causes of secondary immune-mediated hemolytic anemia include certain infections (e.g., hemoplasmosis, babesiosis, and theileriosis), drugs (e.g., cephalosporins, penicillin, and sulfonamides), vaccines, and envenomations (e.g., bee stings). immune-mediated hemolysis directed at nonself antigens, such as in neonatal isoerythrolysis, is presented later. in most cases of idiopathic immune-mediated hemolytic anemia, the reactive antibody is igg, and the hemolysis is extravascular (i.e., erythrocytes with surface-bound antibody are phagocytized by macrophages, mainly in the spleen). igm and/or complement proteins may also contribute to idiopathic immune-mediated hemolytic anemia. complement factor c3b usually acts as an opsonin that promotes phagocytosis and extravascular hemolysis. however, formation of the complement membrane attack complex on red blood cell surfaces causes intravascular hemolysis; this mechanism more commonly occurs with igm autoantibodies. most immunoglobulins implicated in immune-mediated hemolytic anemia are reactive at body temperature (warm hemagglutinins). a smaller portion, usually igm, are more reactive at lower temperatures (cold agglutinins). in neonatal isoerythrolysis, affected animals are young (hours to days old) with typical gross and microscopic changes of immune-mediated hemolytic anemia. pure red cell aplasia. pure red cell aplasia (prca) is a rare bone marrow disorder characterized by absence of erythropoiesis and severe nonregenerative anemia. primary and secondary forms of pure red cell aplasia have been described in dogs and cats.
primary pure red cell aplasia is apparently caused by immune-mediated destruction of early erythroid progenitor cells, a presumption supported by the response of some patients to immunosuppressive therapy and by the detection of antibodies inhibiting erythroid colony formation in vitro in some dogs. administration of recombinant human erythropoietin (rhepo) has been identified as a cause of secondary pure red cell aplasia in dogs, cats, and horses, presumably caused by induction of antibodies against rhepo that cross-react with endogenous epo. experimentation with the use of species-specific recombinant epo has produced mixed results. dogs treated with recombinant canine epo have not developed pure red cell aplasia. however, in experiments reported thus far involving cats treated with recombinant feline epo, at least some animals have developed pure red cell aplasia. parvoviral infection has been suggested as a possible cause of secondary pure red cell aplasia in dogs. infection with felv subgroup c causes secondary erythroid aplasia in cats, probably because of infection of early-stage erythroid precursors. grossly, animals with pure red cell aplasia have pale mucous membranes without indicators of hemolysis (e.g., icterus). microscopic examination of the bone marrow shows an absence or near absence of erythroid precursors with or without lymphocytosis, plasmacytosis, and myelofibrosis; production of other cell lines (e.g., neutrophils and platelets) is normal or hyperplastic. immune-mediated neutropenia. immune-mediated neutropenia is a rare condition that has been reported in horses, dogs, and cats. this disease is characterized by severe neutropenia from immune-mediated destruction of neutrophils or their precursors. the range of causes is presumably similar to that of other immune-mediated cytopenias (e.g., immune-mediated hemolytic anemia, pure red cell aplasia, and immune-mediated thrombocytopenia).
affected animals may have infections, such as dermatitis, conjunctivitis, or vaginitis, which are secondary to marked neutropenia and a compromised innate immune system. microscopically, there may be neutrophil hyperplasia, maturation arrest, or aplasia in the bone marrow, depending on which neutrophil maturation stage is targeted for destruction. igm antibodies that are reactive at lower temperatures cause a condition known as cold hemagglutinin disease. this results in ischemic necrosis at anatomic extremities (e.g., tips of the ears), where cooling of the circulation causes autoagglutination of erythrocytes and occlusion of the microvasculature. typically, immune-mediated hemolytic anemia targets mature erythrocytes, causing a marked regenerative response. however, as discussed earlier in the chapter, immune-mediated destruction of immature erythroid cells in the bone marrow may also occur, resulting in nonregenerative anemia. pathogenesis of secondary immune-mediated hemolytic anemia is dependent upon the cause. erythrocytic parasites may cause immune-mediated hemolysis by altering the red blood cell surface and exposing "hidden antigens" that are not recognized as self-antigens by the host's immune system. alternatively, the immune attack may be directed at the infectious agent, but erythrocytes are nonspecifically destroyed because of their close proximity-this is called the innocent bystander mechanism. certain drugs, such as penicillin, may cause immune-mediated hemolytic anemia by binding to erythrocyte membranes and forming drug-autoantigen complexes that induce antibody formation, termed hapten-dependent antibodies. other proposed mechanisms include binding of drug-antibody immune complexes to the erythrocyte membrane, or induction of a true autoantibody directed against an erythrocyte antigen.
hematologic, gross, and histopathologic abnormalities are typical of those of hemolytic anemia, as presented in the earlier section (bone marrow and blood cells, dysfunction/responses to injury, blood cells, abnormal concentrations of blood cells, anemia). in brief, there may be spherocytes and autoagglutination on blood smear evaluation, icterus and splenomegaly on gross examination, and emh, erythrophagocytic macrophages, and hypoxia-induced or thromboemboli-induced tissue necrosis on histopathologic examination. dogs with immune-mediated hemolytic anemia also frequently develop an inflammatory leukocytosis and coagulation abnormalities (prolonged coagulation times, decreased plasma antithrombin concentration, increased plasma concentration of fibrin degradation products, thrombocytopenia, and disseminated intravascular coagulation). intravascular hemolysis plays a relatively insignificant role in most cases of immune-mediated hemolytic anemia, but evidence of intravascular hemolysis (e.g., ghost cells, red plasma and urine, dark red kidneys) is noted occasionally, presumably in those cases in which igm and complement are major mediators of hemolysis. neonatal isoerythrolysis. neonatal isoerythrolysis (ni) is a form of immune-mediated hemolytic anemia in which colostrum-derived maternal antibodies react against the newborn's erythrocytes. it is common in horses (fig. 13-30) and has been reported in cattle, cats, and some other domestic and wildlife species. in horses, neonatal isoerythrolysis occurs as a result of immunosensitization of the dam from exposure to an incompatible blood type inherited from the stallion (e.g., transplacental exposure to fetal blood during pregnancy or mixing of maternal and fetal blood during parturition). a previously mismatched blood transfusion produces the same results. some equine blood groups are more antigenic than others; in particular, types aa and qa are very immunogenic in mares.
in cattle, neonatal isoerythrolysis has been caused by vaccination with whole blood products or products containing erythrocyte membrane fragments. neonatal isoerythrolysis has been produced experimentally in dogs, but there are no reports of naturally occurring disease. in cats the recognized form of neonatal isoerythrolysis does not depend on prior maternal immunosensitization but on naturally occurring anti-a antibodies in queens with type b blood. disseminated intravascular coagulation is not a primary disease but rather a secondary complication of many types of underlying disease, including severe inflammation, organ failure, and neoplasia. it is included in the section on inflammatory disorders because the coagulation cascade is closely linked to inflammatory pathways. information on this topic, including efig. 13-5, is available at www.expertconsult.com, as well as in chapter 2. the term hematopoietic neoplasia encompasses a large and diverse group of clonal proliferative disorders of hematopoietic cells. historically, numerous systems have been used to classify hematopoietic neoplasms in human medicine, some of which have been applied inconsistently to veterinary species (examples include the kiel classification and national cancer institute working formulation). the world health organization (who) classification of hematopoietic neoplasia was first published in 2001 (updated in 2008) and is based on the principles defined in the revised european-american classification of lymphoid neoplasms (real) from the international lymphoma study group. the who classification system is considered the first true worldwide consensus on the classification of hematopoietic malignancies and integrates information on tumor topography, cell morphology, immunophenotype, genetic features, and clinical presentation and course. a veterinary reference of the who classification system, published in 2002, was later validated in 2011 using the canine model of lymphoma.
this project, modeled after the study to validate the system in human beings, yielded an overall accuracy (i.e., agreement on a diagnosis) among pathologists of 83%. currently this classification system is accepted as the method of choice in both human and veterinary medicine. the who classification broadly categorizes neoplasms primarily according to cell lineage: myeloid, lymphoid, and histiocytic. this distinction is based on the fact that the earliest commitment of a pluripotent hsc is to either a lymphoid or nonlymphoid lineage. many pathologists and clinicians distinguish leukemias from other hematopoietic neoplasms. leukemia refers to a group of hematopoietic neoplasms that arise from the bone marrow and are present within the blood. leukemia may be difficult to differentiate from other forms of hematopoietic neoplasms that originate outside of the bone marrow but infiltrate the bone marrow and blood. for simplicity, cases of secondary bone marrow or blood involvement may not be considered leukemia but rather the "leukemic phase" of another primary neoplasm. it is now recognized that certain lymphomas and leukemias are different manifestations of the same disease (e.g., chronic lymphocytic leukemia and small lymphocytic lymphoma), and the designation of lymphoma or leukemia is based on the tissue with the largest tumor burden. based on their degree of differentiation, leukemias are classified as acute or chronic. acute leukemias are poorly differentiated or undifferentiated, meaning that there are high percentages of early progenitor and precursor cells, including lymphoblasts, myeloblasts, monoblasts, erythroblasts, and/or megakaryoblasts. in contrast, well-differentiated cells predominate in chronic leukemias. because well-differentiated cells also predominate in nonneoplastic proliferations, chronic leukemias must be differentiated from reactive processes, such as those that occur in chronic and/or granulomatous inflammation.
diagnosis of chronic leukemia is often made by excluding all other causes for the proliferating cell type. for example, causes of relative and secondary erythrocytosis are excluded to be able to diagnose polycythemia vera. furthermore, the designation of acute or chronic also refers to the disease's clinical course. acute leukemias tend to have an acute onset of severe and rapidly progressive clinical signs, whereas animals with chronic leukemia typically have indolent, slowly progressive disease. in immune-mediated neutropenia, marrow lymphocytosis and plasmacytosis may be marked (e.g., >60% of nucleated cells). the diagnosis may be supported by flow cytometric detection of immunoglobulin bound to neutrophils but is most often made on the basis of exclusion of other causes of neutropenia and response to immunosuppressive therapy. immune-mediated thrombocytopenia. immune-mediated thrombocytopenia (imtp) is a condition characterized by immune-mediated destruction of platelets. it is a fairly common condition in dogs and is less frequent in horses and cats. the disease is usually idiopathic but may be secondary to infection (e.g., equine infectious anemia and ehrlichiosis), drug administration (e.g., cephalosporins and sulfonamides), neoplasia, and other immune-mediated diseases. when immune-mediated thrombocytopenia occurs together with immune-mediated hemolytic anemia, the condition is called evans syndrome. the thrombocytopenia is often severe (e.g., <20,000 platelets/µl), resulting in varying degrees of bleeding tendencies, mainly in skin and mucous membranes. microscopically, there are multifocal perivascular hemorrhages in multiple tissues, and the bone marrow exhibits megakaryocytic and erythroid hyperplasia. rarely, immune-mediated destruction of megakaryocytes may cause megakaryocytic hypoplasia, termed amegakaryocytic thrombocytopenia. neonatal alloimmune thrombocytopenia. a form of immune-mediated thrombocytopenia, known as neonatal alloimmune thrombocytopenia, is recognized in neonatal pigs and foals.
the pathogenesis of this disease is virtually identical to that of neonatal isoerythrolysis as a cause of anemia: a neonate inheriting paternal platelet antigens absorbs maternal antibodies against these antigens through the colostrum. in principle, a similar situation may occur after platelet-incompatible transfusion of blood or blood products containing platelets. gross and microscopic changes are similar to those of immune-mediated thrombocytopenia except that the animal is young (e.g., 1 to 3 days). hemophagocytic syndrome. hemophagocytic syndrome is a term used to describe the proliferation of nonneoplastic (i.e., polyclonal), well-differentiated but highly erythrophagic macrophages. the condition is rare but has been recognized in dogs and cats. unlike hemophagocytic histiocytic sarcoma, which is a neoplastic proliferation of phagocytic macrophages, hemophagocytic syndrome is secondary to an underlying disease, such as neoplasia, infection, or an immune-mediated disorder. the primary disease process causes increased production of stimulatory cytokines, which results in macrophage proliferation and hyperactivation. these activated macrophages phagocytize mature hematopoietic cells and hematopoietic precursors at an enhanced rate, resulting in one or more cytopenias. affected animals usually have lesions of the primary disease, as well as signs of the anemia (e.g., pale mucous membranes), neutropenia (e.g., bacterial infections), and thrombocytopenia (e.g., petechiae and ecchymoses). microscopically, phagocytic macrophages are found in high numbers in the bone marrow and commonly in other tissues, including lymph nodes, spleen, and liver. additional bone marrow findings reported in animals with hemophagocytic syndrome vary widely, ranging from hypoplasia to hyperplasia of cell lines with peripheral cytopenias. disseminated intravascular coagulation. 
disseminated intravascular coagulation is a syndrome characterized by continuous activation of both coagulation and fibrinolytic pathways and is also known as consumptive coagulopathy. it is not a primary disease (754.e1, chapter 13, bone marrow, blood cells, and the lymphoid/lymphatic system). disseminated intravascular coagulation is a consumptive coagulopathy resulting from activation of both coagulation and fibrinolytic pathways. it is a secondary complication of many types of underlying disease, including many infectious diseases, trauma, burns, heat stroke, immune-mediated disease, hemolysis, shock, neoplasia, organ failure, obstetric complications, and noninfectious inflammatory disease, such as pancreatitis. it is common in critically ill domestic animals. disseminated intravascular coagulation involves an initial hypercoagulable phase, resulting in thrombosis and ischemic tissue damage, and a subsequent hypocoagulable phase as a result of consumption of coagulation factors and platelets, resulting in hemorrhage (efig. 13-5). the pathogenesis of disseminated intravascular coagulation typically involves the release of tissue factor (thromboplastin) and subsequent activation of coagulation pathways and platelets but may also involve defective normal inhibition of coagulation or defective fibrinolysis. classically, diagnosis of disseminated intravascular coagulation is based on clinical evidence of hemorrhage and/or thromboembolic disease and a triad of laboratory findings: thrombocytopenia, usually moderate (below the lower reference value but above 50,000/µl); prolonged coagulation times (prothrombin time and/or partial thromboplastin time); and decreased fibrinogen or increased concentration of plasma fibrin degradation products or d-dimer. milder forms of disseminated intravascular coagulation that do not meet all of the diagnostic criteria also occur.
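the classic laboratory triad described above is a simple threshold-based rule, and can be sketched as code. this is an illustrative sketch only, assuming hypothetical field names and a hypothetical lower platelet reference value of 200,000/µl; it is not a validated diagnostic tool.

```python
# Illustrative sketch of the classic laboratory triad for disseminated
# intravascular coagulation (DIC) described in the text. Field names and
# the platelet reference threshold are assumptions for illustration only.

def meets_dic_triad(platelets_per_ul: float,
                    coag_times_prolonged: bool,
                    fibrinogen_decreased: bool,
                    fdp_or_ddimer_increased: bool,
                    platelet_lower_ref: float = 200_000) -> bool:
    """Return True when all three arms of the classic DIC triad are met."""
    # Arm 1: moderate thrombocytopenia (below the lower reference value
    # but above 50,000/µl, per the text's criterion)
    thrombocytopenia = 50_000 < platelets_per_ul < platelet_lower_ref
    # Arm 2: prolonged PT and/or PTT
    # Arm 3: decreased fibrinogen OR increased FDP/D-dimer concentration
    fibrin_abnormality = fibrinogen_decreased or fdp_or_ddimer_increased
    return thrombocytopenia and coag_times_prolonged and fibrin_abnormality
```

note that, as the text states, milder forms of disseminated intravascular coagulation may not satisfy all three arms, so a rule like this would miss them by design.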
decreased plasma antithrombin (antithrombin iii) concentration and schistocytosis are other laboratory abnormalities often found in patients with disseminated intravascular coagulation. in myelodysplastic syndrome (discussed later), essential bone marrow findings also include dysplasia of myeloid cells and fewer than 20% myeloblasts and "blast equivalents" (see footnote 3). acute myeloid leukemia. acute myeloid leukemia (aml) is uncommon in domestic animals but most frequently occurs in dogs and cats. in veterinary species, acute myeloid leukemia is most commonly of neutrophil, monocyte, and/or erythroid origin, with rare reports of eosinophil, basophil, or megakaryocytic lineages. it is caused by felv infection in cats. evaluations of blood smears show many early myeloid precursors, including myeloblasts and blast equivalents (fig. 13-31, a). in dogs the total leukocyte concentration averages approximately 70,000/µl; anemia, neutropenia, and thrombocytopenia commonly occur. grossly, animals show lesions attributed to anemia, neutropenia, and thrombocytopenia, such as pale mucous membranes, secondary infections, and multisystemic bleeding, respectively. neoplastic cells often infiltrate tissues, resulting in splenomegaly, hepatomegaly, and lymphadenomegaly. microscopically, myeloid cells efface (replace) the bone marrow and infiltrate extramedullary tissues, especially lymphoid tissue. chronic myeloid leukemia. chronic myeloid leukemia (cml), also called chronic myelogenous leukemia or myeloproliferative neoplasia, is rare in animals. most reported cases occur in dogs and cats. there are various subclassifications of chronic myeloid leukemia, including excessive production of erythrocytes (polycythemia vera), platelets (essential thrombocythemia), neutrophils (chronic neutrophilic leukemia), monocytes (chronic monocytic leukemia), neutrophils and monocytes (chronic myelomonocytic leukemia), eosinophils (chronic eosinophilic leukemia), or basophils (chronic basophilic leukemia).
complete peripheral blood count analysis often reveals very high concentrations of the neoplastic cells, such as greater than 50,000 to 100,000 leukocytes/µl (see fig. 13-31, b) or 2,000,000 platelets/µl. cellular morphologic features are often normal, but slight dysplasia may be observed. later in the disease there may be cytopenias of nonneoplastic cell types. animals with polycythemia vera often have red mucous membranes and lesions of hyperviscosity syndrome, such as bleeding and dilated, tortuous retinal vessels. essential thrombocythemia results in multisystemic bleeding due to dysfunctional platelets, or multisystemic infarcts from hyperaggregability and excessive platelets. chronic myeloid leukemias of leukocytes often result in splenomegaly, hepatomegaly, and lymphadenomegaly because of infiltration by the neoplastic cells. histologically, the bone marrow shows proliferation of the neoplastic cell type characterized by dysplasia and low numbers (e.g., <20%) of myeloblasts and blast equivalents. mast cell neoplasia. mast cell tumors (mcts) of the skin and other sites are common in animals (see chapters 6, 7, and 17), but mast cell leukemia is rare. in cats, mcts are the most common neoplasm in the spleen (efig. 13-6). mast cells normally are not present in the blood vascular system, but the finding of mast cells in the blood (mastocytemia) is highly suggestive of disseminated mast cell neoplasia (systemic mastocytosis) in cats. however, mastocytemia does not necessarily indicate myeloid neoplasia in dogs. in fact, one study found that the severity of mastocytemia in dogs was frequently higher in animals without mcts than in those with mcts and that random detection of mast cells in blood smears usually is not the result of underlying mct. granulocytic sarcoma. granulocytic sarcoma is a poorly characterized extramedullary proliferation of myeloid precursors.
this classification scheme is summarized in table 13-4. subcategories exist within each of these groups, as discussed later. information on this topic is available at www.expertconsult.com. this section discusses examples of myeloid neoplasms, including myelodysplastic syndrome, myeloid leukemias, and mast cell neoplasms (technically a form of myeloid neoplasia), and lymphoid neoplasms, including lymphoid leukemias and multiple myeloma. other lymphoid neoplasms, such as the numerous subtypes of lymphoma and extramedullary plasmacytomas (emps), as well as histiocytic disorders, are described in the section on lymphoid/lymphatic system, disorders of domestic animals, neoplasia. additional discussion of hematopoietic neoplasia occurs in the species-specific sections at the end of this chapter. myelodysplastic syndrome. myelodysplastic syndrome (mds) most commonly occurs in dogs and cats and may be caused by felv infection in cats. the disease refers to a group of clonal myeloid proliferative disorders with ineffective hematopoiesis in the bone marrow, resulting in cytopenias of more than one cell line. hematopoietic proliferation in bone marrow with concurrent peripheral blood cytopenias is likely a result of increased apoptosis of neoplastic cells within the bone marrow, before their release into circulation. clinical illness and death often result from secondary manifestations, such as secondary infections or cachexia, attributable to the effects of cytopenias and/or transformation of the neoplasm into acute myeloid leukemia. gross lesions are dependent upon the type and severity of the cytopenias. however, essential microscopic findings within the bone marrow are normal or increased cellularity, dysplasia of myeloid cells, and fewer than 20% myeloblasts and "blast equivalents" (other stages of immature myeloid cells, such as abnormal promyelocytes, monoblasts, promonocytes, erythroblasts, and megakaryoblasts).
before the discussion of specific diseases, it is worthwhile to describe the diagnostic techniques required to classify hematopoietic neoplasms that are becoming increasingly available for routine use in veterinary medicine. immunophenotyping refers to the use of antibodies recognizing specific molecules expressed on different cell types to determine the identity of a cell population of interest. immunophenotyping on the basis of these lineage-specific or lineage-associated markers can be performed on histologic sections (immunohistochemistry [see fig. 13-86]), air-dried cytologic examination smears (immunocytochemistry), or by laser analysis of cells in suspension in blood or buffer solutions (flow cytometry). in cases of lymphoid neoplasia, immunophenotyping most routinely refers to determination of b or t lymphocyte origin. clonality assays, such as pcr for antigen receptor rearrangement (parr), can help identify neoplastic lymphoid proliferations on the basis of clonal rearrangements of genes encoding lymphocyte antigen receptors. in terms of practical application the parr assay is most useful in helping to distinguish lymphoid neoplasms from nonneoplastic lymphoid proliferations mimicking neoplasia. cytogenetic testing has not been routinely used in veterinary medicine, though several genetic mutations have been identified in dogs. for example, breakpoint cluster region-abelson (bcr-abl) translocations have been identified in some canine leukemias, including acute myeloblastic leukemia and chronic monocytic leukemia. dogs with burkitt-like lymphoma have a translocation leading to constitutive c-myc expression. gross and microscopic lesions also are similar to those that occur in cases of acute myeloid leukemia, except that neoplastic cells may differentiate into morphologically identifiable lymphoid cells. chronic lymphocytic leukemia. chronic lymphocytic leukemia (cll) is uncommon in veterinary medicine.
it is predominantly a disease of middle-aged to older dogs but is also documented in horses, cattle, and cats. affected animals most typically have indolent, slowly progressive disease. most canine chronic lymphocytic leukemia cases are of t lymphocyte origin, typically cytotoxic t lymphocytes expressing cd8. in cats the majority of chronic lymphocytic leukemia cases have a t helper lymphocyte immunophenotype. a cbc often shows very high numbers of small lymphocytes with clumped chromatin and scant cytoplasm. proliferating cytotoxic t lymphocytes frequently contain a few pink cytoplasmic granules when stained with most methanol-based romanowsky stains (e.g., wright-giemsa). however, these granules may not be appreciated with some aqueous-based romanowsky stains (e.g., diff-quik). although the number of total blood lymphocytes is often greater than 100,000/µl, relatively mild lymphocytosis (e.g., 15,000/µl) has been reported. seventy-five percent of affected dogs also have anemia, and 15% have thrombocytopenia. although rare, there are reports of granulocytic sarcoma in dogs, cats, cattle, and pigs, and it may arise in a number of sites, such as lung, intestine, lymph nodes, liver, kidney, skin, and muscle. acute lymphoblastic leukemia. acute lymphoblastic leukemia (all) is uncommon in dogs and cats, and rare in horses and cattle. in a recent immunophenotype study of 51 cases of acute lymphoblastic leukemia in dogs, 47 arose from b lymphocytes and 4 arose from double-negative t lymphocytes that were immunonegative for cd4 and cd8 markers. in the blood of animals with acute lymphoblastic leukemia, there are typically many medium to large lymphoid cells with deeply basophilic cytoplasm, reticular to coarse chromatin, and prominent, multiple nucleoli (see fig. 13-31, c). in affected dogs the mean blood lymphocyte concentration is approximately 70,000/µl, but cats with acute lymphoblastic leukemia often have low numbers of neoplastic cells in the circulation.
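the cd4/cd8 phenotypes mentioned above (cytotoxic cd8+ cells, t helper cd4+ cells, and double-negative cells) can be summarized in a small sketch; the function name and the double-positive label are illustrative additions, not from the text:

```python
# t lymphocyte phenotype categories from the cd4/cd8 markers mentioned in the
# text; real immunophenotyping uses antibody panels (immunohistochemistry,
# immunocytochemistry, or flow cytometry), not a two-flag lookup.
def t_cell_phenotype(cd4_positive: bool, cd8_positive: bool) -> str:
    if cd8_positive and not cd4_positive:
        return "cytotoxic t lymphocyte (cd8+)"
    if cd4_positive and not cd8_positive:
        return "t helper lymphocyte (cd4+)"
    if not cd4_positive and not cd8_positive:
        return "double-negative t lymphocyte"
    return "double-positive t lymphocyte (cd4+cd8+)"

print(t_cell_phenotype(cd4_positive=False, cd8_positive=True))
```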
as with animals with acute myeloid leukemia, anemia, neutropenia, and thrombocytopenia commonly occur. light chains are low-molecular-weight proteins that pass through the glomerular filter into the urine, wherein they are also known as bence jones proteins. they tend not to react with urine dipstick protein indicators and are most specifically detected by electrophoresis and immunoprecipitation. in addition to aiding in the diagnosis of multiple myeloma, paraproteins have an important role in the pathogenesis of disease. these proteins may inhibit platelet function, increase blood viscosity, deposit in glomerular basement membranes (see chapter 11 and figs. 11-27 and 11-28), or precipitate at cool temperatures, which results in bleeding tendencies, hyperviscosity syndrome, glomerulopathies, and cryoglobulinemia, respectively. hyperviscosity syndrome refers to the clinical sequelae of pathologically increased blood viscosity, which are slowed blood flow and loss of laminar flow. clinical signs include mucosal hemorrhages, visual impairment due to retinopathy, and neurologic signs, such as tremors and abnormal aggressive behavior. cryoglobulinemia is the condition in which proteins, typically igm, precipitate at temperatures below normal body temperature (cryoglobulins). precipitation often occurs in blood vessels of the skin and extremities, such as the ears and digits, and results in ischemic necrosis. in multiple myeloma the neoplastic proliferation of plasma cells results in osteolysis. work with human cell cultures has shown that osteoclasts support the growth of myeloma cells and that direct contact between the two cell types increases myeloma cell proliferation and promotes osteoclast survival. autopsy findings depend on the stage of disease. in advanced cases with marked infiltration of organs with neoplastic cells, there is often uniform splenomegaly, hepatomegaly, and lymphadenomegaly, and the bone marrow is highly cellular (efig. 13-7; see fig. 13-31, d).
other lesions depend on whether there are concurrent cytopenias, such as anemia, neutropenia, and thrombocytopenia, and if the neoplastic cells produce excessive immunoglobulin. lesions caused by excessive immunoglobulin are further discussed in the section on multiple myeloma. histologically, the bone marrow is densely cellular with well-differentiated lymphocytes. small lymphocytes infiltrate and often efface the architecture of the lymph nodes and spleen. the liver may have dense accumulations of neoplastic cells in the connective tissue around the portal triads. plasma cell neoplasia. plasma cell neoplasms are most easily categorized as myeloma or multiple myeloma, which arises in the bone marrow, and extramedullary plasmacytoma, which, as the name implies, involves sites other than bone; the latter is discussed in the section on lymphoid/lymphatic system, disorders of domestic animals: lymph nodes, neoplasia, plasma cell neoplasia. multiple myeloma. multiple myeloma (mm) is a rare, malignant tumor of plasma cells that arises in the bone marrow and usually secretes large amounts of immunoglobulin. the finding of neoplastic plasma cells in blood samples or smears is rare. dogs are affected more frequently than other species, but multiple myeloma has also been reported in horses, cattle, cats, and pigs. diagnosis of multiple myeloma is based on finding a minimum of two or three (opinions vary) of the following abnormalities:
• markedly increased numbers of plasma cells in the bone marrow (fig. 13-32, a)
• monoclonal gammopathy
• radiographic evidence of osteolysis
• light chain proteinuria
the classic laboratory finding in patients with multiple myeloma is hyperglobulinemia, which results from the excessive production of immunoglobulin or an immunoglobulin subunit by the neoplastic cells. this homogeneous protein fraction is often called paraprotein or m protein.
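the "minimum of two or three (opinions vary) of four abnormalities" rule lends itself to a small sketch; the finding names and the threshold parameter are illustrative assumptions, not standardized nomenclature:

```python
# illustrative sketch (not a clinical tool): encoding the "minimum of two or
# three (opinions vary) of four findings" rule for multiple myeloma.
MM_FINDINGS = (
    "marrow_plasmacytosis",     # markedly increased plasma cells in bone marrow
    "monoclonal_gammopathy",
    "radiographic_osteolysis",
    "light_chain_proteinuria",  # bence jones proteinuria
)

def supports_mm_diagnosis(findings, minimum=2):
    """return True if enough of the four classic abnormalities are present."""
    present = sum(1 for f in MM_FINDINGS if f in findings)
    return present >= minimum

print(supports_mm_diagnosis({"monoclonal_gammopathy", "radiographic_osteolysis"}))  # True
```

raising `minimum` to 3 reproduces the stricter of the two opinions mentioned in the text.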
paraproteins produced from the same clone of plasma cells have the same molecular weight and electric charge. therefore they have the same migration pattern using serum protein electrophoresis, which results in a tall, narrow spike in the globulins region, termed monoclonal gammopathy (see fig. 13-32, b). the term gammopathy is used because most immunoglobulins migrate in the γ-region of an electrophoresis gel. however, some immunoglobulins, especially immunoglobulin a (iga) and igm, migrate to the β-region. occasionally, biclonal or other atypical electrophoretic patterns may be seen with multiple myeloma as a result of protein degradation, protein complex formation, binding to other proteins, or when the tumor includes more than one clonal population. it is important to note that monoclonal gammopathy is not specific to multiple myeloma but has also been reported with lymphoma, chronic lymphocytic leukemia, canine ehrlichiosis, and canine leishmaniasis. definitively distinguishing monoclonal from polyclonal gammopathy requires immunoelectrophoresis or immunofixation using species-specific antibodies recognizing different immunoglobulin subclasses and subunits. occasionally, multiple myeloma cells produce only the immunoglobulin light chain. an immunoglobulin monomer consists of two heavy chains and two light chains connected by disulfide bonds. these light chains may deposit in tissues and cause organ dysfunction, especially renal failure. when the light chains form amyloid deposits, the disease is called amyloid light chain amyloidosis, but if the light chains deposit as nonamyloid granules, it is termed light chain deposition disease. the severity of microscopic lesions is dependent on the chronicity of the disease, and they are most significant in the spleen, liver, and bone marrow.
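the "tall, narrow spike" versus a broad polyclonal pattern can be caricatured numerically. this toy peak test (the trace format and thresholds are assumptions, not a laboratory method) only illustrates why a monoclonal paraprotein yields a narrow dominant peak on a densitometer trace:

```python
# toy sketch: flag a monoclonal-like spike in a serum protein electrophoresis
# trace as a tall, narrow peak in the globulin region. thresholds are
# arbitrary illustrations, not clinical cutoffs.
def looks_monoclonal(trace, width_limit=3, height_ratio=2.0):
    """trace: list of densitometer readings across the globulin region."""
    peak = max(trace)
    baseline = sorted(trace)[len(trace) // 2]      # median as rough background
    width = sum(1 for v in trace if v > peak / 2)  # points above half-maximum
    return peak >= height_ratio * baseline and width <= width_limit

print(looks_monoclonal([1, 1, 1, 9, 1, 1, 1]))    # narrow spike
print(looks_monoclonal([1, 3, 4, 5, 5, 4, 3, 1])) # broad polyclonal hump
```

definitive classification still requires immunoelectrophoresis or immunofixation, as the text notes.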
as would be anticipated, microscopic findings of the spleen are predominantly influenced by the number and activity of macrophages, which is a reflection of the duration of the disease and the frequency of hemolytic episodes. hemosiderin-laden macrophages persist for months to years; therefore large numbers are consistent with chronicity. kupffer cell hyperplasia with hemosiderin stores and periportal infiltrates of lymphocytes are the most significant changes in the liver. bone marrow histologic findings vary depending on the duration of the disease. in most animals the marrow is cellular because of the replacement of fat by intense, orderly erythropoiesis. granulocytes are relatively less numerous, and plasma cells are increased. as in the spleen, hemosiderin-laden macrophages are present in large numbers in chronic cases. emaciated animals with chronic disease have serous atrophy of fat (see efig. 13-1). clinical findings with viremic episodes include fever, depression, icterus, petechial hemorrhages, lymph node enlargement, and dependent edema. equine infectious anemia infection is diagnosed on the basis of the coggins test, an agarose gel immunodiffusion test for the presence of the antibody against the virus. congenital dyserythropoiesis in polled herefords. a syndrome of congenital dyserythropoiesis and alopecia occurs in polled hereford calves. the cause and pathogenesis of this often fatal disease are unknown. early in disease there is hyperkeratosis and alopecia of the muzzle and ears, which progresses to generalized alopecia and hyperkeratotic dermatitis. histologically, there is orthokeratotic hyperkeratosis with dyskeratosis, as well as erythroid hyperplasia, dysplasia, and maturation arrest in the bone marrow. ineffective erythropoiesis results in nonregenerative to poorly regenerative anemia. erythrocyte band 3 is an integral membrane protein that connects to the cytoskeleton and aids in erythrocyte stability.
a hereditary deficiency of this protein has been identified in japanese black cattle, resulting in increased erythrocyte fragility, spherocytosis, intravascular hemolytic anemia, and retarded growth. affected calves show lesions consistent with hemolytic anemia, including pale mucous membranes, icterus, and splenomegaly. histologically, there are bilirubin accumulations in the liver, and hemosiderin in renal tubules. bovine leukemia virus. bovine leukemia virus is discussed in the later section on lymphoma (see lymphoid/lymphatic system, disorders of domestic animals: lymph nodes, neoplasia, lymphoma). bovine viral diarrhea virus. bvdv infection may cause thrombocytopenia in cattle, and a thrombocytopenic hemorrhagic syndrome has been specifically caused by type ii bvdv infection. investigations of the mechanism of bvdv-induced thrombocytopenia have resulted in varying, sometimes conflicting, conclusions. more than one study has shown viral antigen within bone marrow megakaryocytes and circulating platelets. evidence of impaired thrombopoiesis (megakaryocyte necrosis, megakaryocyte pyknosis, and degeneration) and increased thrombopoiesis (megakaryocytic hyperplasia, increased numbers of immature megakaryocytes) in the bone marrow has been reported in type ii bvdv-infected animals, including concurrent megakaryocyte necrosis and hyperplasia in some experimental subjects. increased osteoclast activity causes osteolysis, but the exact mechanism is not known. osteolysis often results in bone pain, lytic bone lesions on radiographs, hypercalcemia, and increased serum alkaline phosphatase activity. later in disease, osteolysis may cause pathologic fractures. morphologically, myeloma cells tend to grow in sheets that displace normal hematopoietic cells in the bone marrow. a proposed diagnostic criterion of multiple myeloma is that plasma cells constitute 30% or more of the nucleated cells in the marrow.
well-differentiated plasma cells are round with abundant basophilic cytoplasm (due to increased rough endoplasmic reticulum) and a perinuclear pale zone (enlarged golgi apparatus for the production of immunoglobulin); anisocytosis and anisokaryosis are often mild but may be marked. some plasma cell neoplasms have a bright eosinophilic fringe due to accumulated iga (see fig. 13-32, a). nuclei are round with clumped chromatin and often peripherally placed within the cytoplasm; binucleation and multinucleation are common. poorly differentiated myeloma cells may lack these characteristic features or display them less distinctly. osteolysis of bone may be present microscopically. common sites of metastasis include the spleen, liver, lymph nodes, and kidneys. flavin adenine dinucleotide deficiency. flavin adenine dinucleotide (fad) is a cofactor for cytochrome-b5 reductase, the enzyme that maintains hemoglobin in its functional reduced state, and for glutathione reductase, an enzyme that also protects erythrocytes from oxidative damage. reported in a spanish mustang mare and a kentucky mountain saddle horse gelding, erythrocyte fad deficiency is a result of an abnormal riboflavin kinase reaction, which is the first reaction in converting riboflavin to fad. clinicopathologic changes include persistent methemoglobinemia of 26% to 46%, eccentrocytosis, a slightly decreased or normal hematocrit, and erythroid hyperplasia in the bone marrow. equine infectious anemia virus. equine infectious anemia virus (eiav), the agent of equine infectious anemia, is a lentivirus that infects cells of the monocyte-macrophage system in horses (also ponies, donkeys, and mules). the virus is mechanically transmitted by biting flies, such as horseflies and deer flies. less common routes of transmission include blood transfusions, contaminated medical equipment, and transplacental transfer. disease may present in acute, subacute, and chronic forms and is potentially fatal.
after an acute period of fever, depression, and thrombocytopenia that lasts 1 to 3 days, there is a prolonged period of recurrent fever, thrombocytopenia, and anemia. in most cases, clinical disease subsides within a year, and horses become lifelong carriers and reservoirs of eiav. eiav causes anemia by both immune-mediated hemolysis and decreased erythropoiesis. hemolysis is typically extravascular but may have an intravascular component during the acute phase. decreased erythropoiesis may result from direct suppression of early-stage erythroid cells by the virus, as well as anemia of inflammation. thrombocytopenia likely results from immune-mediated platelet destruction and suppressed platelet production. animals dying during hemolytic crises are pale with mucosal hemorrhages and dependent edema. the spleen and liver are enlarged, dark, and turgid, and they and other organs have superficial subcapsular hemorrhages. petechiae are evident beneath the renal capsule and throughout the cortex and medulla. the bone marrow is dark red as a result of replacement of fat by hematopoietic tissue; the extent of replacement is an indication of the duration of the anemia. affected animals are not necessarily anemic. however, acute intravascular hemolytic episodes may occur with hyperventilation-induced alkalemia. lesions are typical of hemolytic anemia and include pale mucous membranes, icterus, hepatosplenomegaly, and dark red urine with microscopic extramedullary hematopoiesis (emh) and marrow erythroid hyperplasia. a single dna-based test is available to detect the common mutation. erythrocyte structural abnormalities. congenital erythrocyte structural abnormalities may occur with abnormal membrane composition or defective proteins within the membrane or cytoskeleton. some of these morphologic changes occur concurrently with clinical disease, but others do not. hereditary stomatocytosis is recognized in alaskan malamutes, drentse patrijshonds, and schnauzers. the specific defects are not known, but they are likely different in the various dog breeds.
however, all affected dogs have stomatocytes on blood smear evaluation, as identified by their slit-shaped area of central pallor. erythrocytes also have increased osmotic fragility and decreased survival. schnauzers are clinically healthy and not anemic but do have reticulocytosis, suggesting that the hemolytic anemia is compensated by erythroid hyperplasia. mild to marked hemolytic anemia is documented in alaskan malamutes and drentse patrijshonds. alaskan malamutes have concurrent short-limb dwarfism, and drentse patrijshonds have hypertrophic gastritis and polycystic kidney disease. other (presumably heritable) erythrocyte abnormalities in dogs that do not have clinical signs include elliptocytosis caused by band 4.1 deficiency or β-spectrin mutation, and familial macrocytosis and dyshematopoiesis in poodles. scott's syndrome. an inherited thrombopathy resembling scott's syndrome in human beings, in which platelets lack normal procoagulant activity, has been recognized in a family of german shepherd dogs. the specific defect in these dogs has not been identified on the molecular level but involves impaired expression of phosphatidylserine on the platelet surface. affected dogs have a mild to moderate clinical bleeding tendency characterized by epistaxis, hyphema, intramuscular hematoma formation, and increased hemorrhage with surgery. macrothrombocytopenia. macrothrombocytopenia is an inherited condition in cavalier king charles spaniels in which there are lower than normal concentrations of platelets with enlarged and giant platelets. the condition is caused by defective β 1 -tubulin, which results in impaired microtubule assembly. affected dogs are asymptomatic but may have abnormal platelet aggregation in vitro. canine distemper. canine distemper virus preferentially infects lymphoid, epithelial, and nervous cells and is presented in greater detail in the lymphoid section. 
canine distemper virus may also infect other hematopoietic cells, including erythrocytes, nonlymphoid leukocytes, and platelets (fig. 13-33), and can cause decreased peripheral blood concentrations of neutrophils, lymphocytes, monocytes, and platelets during viremia. the thrombocytopenia is a result of virus-antibody immune complexes on platelet membranes and direct viral infection of megakaryocytes. increased erythrocyte osmotic fragility. a condition characterized by increased erythrocyte osmotic fragility has been described in abyssinian and somali cats. the specific defect has not been identified, but pk deficiency (which has been reported in these breeds) was excluded as the cause. calves infected with type ii bvdv also have impaired platelet function. cattle with the hemorrhagic syndrome are severely thrombocytopenic and neutropenic with multisystemic hemorrhages, particularly of the digestive tract, spleen, gallbladder, urinary bladder, and lymph nodes. histologic lesions include hemorrhage, epithelial necrosis of enterocytes, intestinal erosions, crypt proliferation with microabscesses, and lymphoid depletion of the gut-associated lymphoid tissue, peyer's patches, and spleen. lesions of the bone marrow are variable, as previously described. bovine neonatal pancytopenia. bovine neonatal pancytopenia (bnp) is caused by alloantibodies absorbed from colostrum, resulting in a hemorrhagic syndrome in calves. the syndrome was first recognized in europe in the early 2000s and has since been experimentally correlated with prior vaccination of affected calves' dams with a commercial bvdv vaccine (pregsure bvd; pfizer animal health). the vaccine has since been voluntarily recalled from the market. it is thought that vaccination induces alloantibody formation by the dam.
the alloantibodies are ingested by the calf and bind to the calf's hematopoietic progenitor cells, resulting in functional compromise of those cells. acutely affected calves are less than a year of age and have peripheral thrombocytopenia and neutropenia. death results from thrombocytopenia-induced hemorrhages or neutropenia-induced secondary infections, including pneumonia, enteritis, and septicemia. within the bone marrow there is erythroid, myeloid, and megakaryocytic hypoplasia. cyclic hematopoiesis. cyclic hematopoiesis (also known as lethal gray collie disease) is an autosomal recessive disorder of pluripotent hscs in gray collie dogs. a defect in the adaptor protein complex (ap3) results in defective intracellular signaling and predictable fluctuations in concentrations of blood cells that occur in 14-day cycles. the pattern is one of marked cyclic neutropenia and, in a different phase, cyclic reticulocytosis, monocytosis, and thrombocytosis. production of key cytokines involved in regulation of hematopoiesis is also cyclic. neutropenia predisposes affected animals to infection, and many die of infectious causes. affected animals have dilute hair coats and lesions of acute or chronic infectious disease, especially of the lungs, gastrointestinal tract, and kidneys. dogs older than 30 weeks of age have systemic amyloidosis, which occurs because of cyclic increases in concentration of acute phase proteins during phases of monocytosis. phosphofructokinase deficiency. inherited autosomal recessive deficiency of the erythrocyte glycolytic enzyme, phosphofructokinase (pfk), is described in english springer spaniel, american cocker spaniel, and mixed-breed dogs. there are three genes encoding pfk enzymes, designated m-pfk in muscle and erythrocytes, l-pfk in liver, and p-pfk in platelets. a point mutation in the gene coding for m-pfk results in an unstable, truncated molecule.
erythrocytes in pfk-deficient dogs have decreased atp and 2,3-diphosphoglycerate (2,3-dpg) production and increased fragility under alkaline conditions. the disease is characterized by chronic hemolysis with marked reticulocytosis. the marked regenerative response may compensate for the ongoing hemolysis. erythrophagocytosis, thrombosis, and histologic changes of ischemia are common, especially within the spleen, liver, and lungs. affected cats typically become acutely ill with fever, pallor, and icterus and usually die within 2 to 3 days. for many years, cytauxzoonosis was considered to be almost always fatal. however, a recent report, in which numerous cats from a subregion of the endemic area in the united states survived infection with an organism with greater than 99% homology to cytauxzoon felis, suggests the emergence of a less virulent strain. cats with increased erythrocyte osmotic fragility have chronic intermittent severe hemolytic anemia and often have other lesions secondary to hemolytic anemia (e.g., splenomegaly and hyperbilirubinemia). cytauxzoonosis. cytauxzoonosis is a severe, often fatal disease of domestic cats caused by the protozoal organism, cytauxzoon felis. disease is relatively common in the south central united states, particularly during summer months. bobcats (lynx rufus) and other wild felids are thought to be wildlife reservoirs of disease. c. felis is transmitted by a tick vector, dermacentor variabilis, which is probably essential for infectivity of the organism. cytauxzoonosis has a schizogenous phase within macrophages throughout the body (especially liver, spleen, lung, lymph nodes, and bone marrow) that causes systemic illness. these schizont-containing macrophages enlarge and accumulate within the walls of veins, eventually causing vessel occlusion, circulatory impairment, and tissue hypoxia. later in disease, merozoites released from schizonts enter erythrocytes, resulting in an erythrocytic phase of infection. infected domestic cats often have nonregenerative anemia, but the pathogenesis for the anemia is unclear.
however, it likely represents preregenerative hemolytic anemia because erythrocyte phagocytosis is a prominent finding in many organs. infected cats often also develop neutropenia and thrombocytopenia, which likely result from inflammation and disseminated intravascular coagulation, respectively. on blood smear evaluation, signet ring-shaped erythrocytic inclusions (piroplasms) may be observed during the erythrocytic phase of disease (fig. 13-34). these inclusions closely resemble small-form babesia (see fig. 13-24, a) and some theileria organisms. postmortem examination typically shows pallor, icterus, splenomegaly, enlarged and red lymph nodes, diffuse pulmonary congestion and edema, and multisystemic petechiae and ecchymoses. vascular obstruction may cause marked distention of abdominal veins. cavitary effusions are present in some cats. microscopically, large, schizont-laden macrophages accumulate within venous and sinusoidal lumens and often completely occlude the lumens (fig. 13-35). in mammals, lymphocytes arise from hscs in the bone marrow, and b lymphocytes continue to develop at this site. ruminants also have b lymphocyte proliferation and maturation within their peyer's patches. progenitor t lymphocytes migrate from bone marrow to mature and undergo selection in the thymus. the spleen, lymph nodes, and lymph nodules are secondary lymphoid organs and are responsible for the immune responses to antigens, such as the production of antibody and cell-mediated immune reactions. at these sites, lymphocytes are activated by antigens and undergo clonal selection, proliferation, and differentiation (see also chapter 5). in addition, the spleen and lymph nodes contain cells of the monocyte-macrophage system and thus also participate in the phagocytosis of cells and materials. the bone marrow is described in the first section of this chapter.
the remaining primary lymphoid organ, the thymus, is described first in this section, followed by the secondary lymphoid organs: spleen, lymph nodes, and diffuse and nodular lymphatic tissues. errors from selection of inappropriate sampling sites and artifacts from compression and incorrect fixation for histopathologic and immunohistochemical examinations are common in routine veterinary pathologic analysis. the identification and remedies for these problems are discussed in e-appendix 13-2. the thymus is essential for the development and function of the immune system, specifically for the differentiation, selection, and maturation of t lymphocytes generated in the bone marrow (see also chapter 5). the basic arrangement of the thymus in domestic animals consists of paired cervical lobes (left and right), an intermediate lobe at the thoracic inlet, and a thoracic lobe, which may be bilobed. the cervical lobes are positioned ventrolateral to the trachea, adjacent to the carotid arteries, and extend from the intermediate lobe at the thoracic inlet as far cranially as the larynx. the intermediate lobe bridges between the cervical and the thoracic lobe. the right thoracic lobe is usually small or completely absent. the left lobe lies in the ventral aspect of the cranial mediastinum (except in the ruminant, where it is dorsal) and extends caudally as far as the pericardium.
horse: the cervical lobes in foals are small, and the thoracic lobe constitutes the bulk of the thymus.
ruminant: the cervical lobes are large. the left and right thoracic lobes are fused and, unlike other domestic animals, lie in the dorsal aspect of the cranial mediastinum.
pig: the cervical lobes are large.
dog: the cervical lobes regress very early and thus appear absent. the thoracic lobe extends caudally to the pericardium.
cat: the cervical lobes are small, and the thoracic lobe, which forms the majority of the thymus, extends caudally to the pericardium and molds to its surface.
the thymus is referred to as a lymphoepithelial organ and hence is composed of epithelial and lymphoid tissue. formed from the endoderm of the third pharyngeal pouch in the fetus, the thymic epithelium is infiltrated by blood vessels from the surrounding mesoderm, resulting in the development of the thymic epithelial reticulum. the lymphocyte population consists of bone marrow-derived progenitor cells, which fill spaces within the epithelial network. a connective tissue capsule surrounds the thymus, and attached thin septa subdivide the tissue into partially separated lobules. felv is an oncogenic, immunosuppressive retrovirus that causes hematologic abnormalities of widely varying types and severity. manifestations of disease caused by felv infection vary depending on dose, viral genetics, and host factors, but normal hematopoiesis is probably suppressed to some degree in all cases. felv infects hematopoietic precursor cells soon after the animal is exposed and continues to replicate in hematopoietic and lymphatic tissue of animals that remain persistently viremic. the virus disrupts normal hematopoiesis by inducing genetic mutations, by other direct effects of the virus on infected hematopoietic cells, or by an altered host immune system. hematologic changes include dysmyelopoiesis with resultant cytopenias or abnormal cell morphologic features, and neoplastic transformation of hematopoietic cells (leukemia). a notable form of dysplasia is the presence of macrocytic erythrocytes (macrocytes) and metarubricytosis in the absence of erythrocyte regeneration (inappropriate metarubricytosis). the relatively uncommon subgroup c viruses cause erythroid hypoplasia, probably because of infection of early-stage erythroid precursors.
felv may be detected in megakaryocytes and platelets in infected cats and may result in platelet abnormalities, including thrombocytopenia, thrombocytosis, increased platelet size, and decreased function. proposed mechanisms of felv-induced thrombocytopenia include direct cytopathic effects, myelophthisis, and immune-mediated destruction. platelet life span and function have been shown to be decreased in felv-positive cats. persistently viremic cats are immunosuppressed and are prone to developing other diseases, including infectious diseases, bone marrow disorders, and lymphoma. cbc abnormalities attributed to felv infection include various cytopenias, especially nonregenerative anemia, which may be persistent or cyclical. regenerative anemia may also occur with felv infection, often because of coinfection with m. haemofelis. hematopoietic cell dysplasia or neoplasia may also be evident. grossly, infected cats are often pale, but other lesions are dependent upon the presence of other cytopenias or concurrent disease. microscopically, the bone marrow is hypocellular, normocellular, or hypercellular. there may be erythroid hypoplasia, erythroid hyperplasia with maturation arrest, or acute leukemia. feline immunodeficiency virus. feline immunodeficiency virus (fiv), a feline lentivirus, causes anemia in a minority of infected cats. immunosuppressive effects of fiv from thymic depletion are discussed elsewhere. it is generally accepted that anemia does not result directly from fiv infection but instead develops because of concurrent disease such as coinfection with felv or hemotropic mycoplasma, other infection, or malignancy. the severity and type of anemia in fiv-infected cats depend on the other specific disease processes involved. the thymus, spleen, lymph nodes, and lymph nodules, including malt, are classified as part of both the lymphoid and immune systems.
the lymphoid system (also known as the lymphatic system in some texts) is broadly categorized into primary and secondary lymphoid organs. the main primary lymphoid organs include the thymus, the bone marrow, and (in birds) the bursa of fabricius; they are the sites at which lymphocytes develop and mature. because the thymus involutes after sexual maturity, it is difficult to discern whether the organ is smaller than normal unless the change is extreme or age-matched control animals are available. before sexual maturity the thymus is easily identified as a lobular white to gray organ with a thin capsule. after sexual maturity the gland is often grossly indistinguishable from adipose connective tissue within the cranial mediastinum, although microscopic remnants may remain. an extremely small thymus in a neonatal animal should be considered abnormal and may indicate a primary immunodeficiency or secondary lymphoid depletion caused by extreme stress, often due to infectious diseases. enlargement of the thymus is most often due to neoplasia. serial sectioning of the thymus allows for gross identification of neoplasms, cysts, or hematomas. spleens vary in size within the same species and among the different species of domestic animals. the spleen can be enlarged (splenomegaly), normal in size, or small (atrophy), and the surface can be smooth, wrinkled, or nodular. the appearance of the cut surface of the spleen in normal animals depends on the amount of stroma (e.g., trabeculae are prominent in ruminants); the size and visibility of the white pulp, which reflects the amount of lymphoid tissue; and whether the red pulp is congested with blood. during an autopsy (syn: necropsy), the spleen is dissected free and checked for torsion of the gastrosplenic ligament (in nonruminants). the spleen is then sliced transversely at approximately 5-mm intervals (serial sectioning), and the cut surfaces are checked for lesions.
specimens are taken for tests that require fresh tissue (e.g., bacteriologic and virologic examinations), and the remaining cross-sections are placed in fixative (10% buffered neutral formalin). a diffusely enlarged spleen should be serially sectioned to determine if the splenomegaly is due to congestion. the cut surface of severely congested spleens is red to bluish-black and exudes blood (bloody spleens), whereas cut surfaces of noncongested enlarged spleens ooze little blood (meaty spleens) and the color depends on how much of the normal parenchyma is replaced by inflammatory cells, stored materials, or neoplastic cells (see splenomegaly and table 13-5). the spleen may be measured and weighed, but because of the wide variation in the dimensions and weight of normal spleens and the amount of blood stored, this information is difficult to interpret. it is essential that spleens with one or more nodules also be serially sectioned and the nodules evaluated for size, shape, and consistency. nodules may be dark red and ooze blood on cut surface, white-tan with a firmer texture, or a mixture of both. multiple wedge sections that include the interface between a nodule and the adjacent nonmass spleen should be collected, because the center of the nodules often consists only of hemorrhage and necrosis, and neoplasms may be missed. the color of the capsular surface of the spleen also varies among species of domestic animals and depends on the opacity or translucence of the splenic capsule. the degree of opacity of the capsule is a function of its thickness and the amount of collagen. the splenic capsules of horses and ruminants are thick and usually appear gray because the color of the red pulp is not visible through the capsule. in the pig, dog, and cat, the splenic capsule is thin, and thus the surface of the spleen is red.
the tenseness of the capsule depends on how much the splenic parenchyma is distended; storage spleens devoid of blood usually have a wrinkled surface. irregular contraction of storage spleens is common, especially in dogs, and consists of nonuniform areas of congestion with intermingled contracted and wrinkled regions. lymph nodes should be dissected free of fat and connective tissue, and any firm attachment to adjacent tissues should be noted because these attachments may indicate neoplastic infiltration through the capsule. gross examination includes evaluating size (measurement or weight), shape, and whether the capsule is intact. the cut surface is examined for the presence of bulging tissue, edema, congestion, exudate, discoloration (see pigmentation), obscuration of the normal architecture, and masses such as abscesses, granulomas, and discrete neoplasms. cytologic evaluation of superficial lymph nodes through fine-needle aspirates provides excellent cellular detail and often yields a diagnosis. however, diagnosis of certain diseases (including lymphomas for complete world health organization [who] classification) requires architectural assessments, and therefore cytologic or small histologic samples are not sufficient. tru-cut biopsies are not ideal, but a 2-mm tru-cut needle may provide adequate tissue. the surgeon or pathologist must handle lymph nodes carefully to minimize artifacts. compression (e.g., squeezing with forceps) may cause crush artifacts, usually resulting in nuclear "streaming." immediately after removal, imprints/impression smears should be prepared and then kept away from formalin fumes. formalin fixation (in this case by formaldehyde fumes) destroys the differential staining seen with romanowsky stains such as wright's and giemsa and results in diffuse blue staining. 
prompt transfer of biopsy or postmortem specimens into fixative is crucial because delayed fixation can lead to numerous artifacts, including an artificial decrease in mitotic index (up to 40% reduction with a more than 12-hour delay in fixation); this reduction can alter tumor classification and grade. the current recommendation for the duration of formalin fixation is 16 to 32 hours; complete fixation of 2-to 4-mm thick tissues is likely to be achieved after 24 to 48 hours. both underfixation and overfixation may lead to difficulties with antigen retrieval for immunohistochemistry, though underfixation is considered the more common and serious problem. thinly slicing some nodes may be difficult, and allowing the node to fix for 1 hour before slicing may help. some pathologists prefer not to incise very small lymph nodes to avoid compression artifacts, but instead nick the capsule to allow formalin penetration. however, fixation of unincised lymph nodes can also cause compression artifacts because the fibrous capsule contracts in the fixative. once fixed, nodes should be cut in uniformly thick cross section to include both the cortex and medulla. transverse sections are usually sufficiently small to allow the entire cross section of most lymph nodes to fit on one microscopic slide, which facilitates histologic interpretation. the longitudinal plane is preferred for porcine lymph nodes because the location and amount of cortex and medulla vary at different sites in transverse sections. t lymphocytes that recognize mhc molecules but not self-antigens are permitted to mature, a process called positive selection. cells that do not recognize mhc molecules are removed by apoptosis. those t lymphocytes that recognize both mhc molecules and self-antigens are removed by macrophages at the corticomedullary junction, a process called negative selection.
because of the rigid differentiation requirements attributable to mhc restriction and tolerance (positive and negative selection, respectively), only a small fraction (<5%) of the developing t lymphocytes that arrive at the thymus from the bone marrow survive. mature naïve t lymphocytes exit the thymus through postcapillary venules in the corticomedullary region, enter the circulation, and recirculate through secondary lymphoid tissues, primarily located in the paracortex of lymph nodes and the periarteriolar sheaths of the spleen. in these specialized sites, the mature naïve t lymphocytes are activated upon exposure to their specific antigens and undergo additional phases of development to differentiate into effector and memory cells. the thymus attains its maximal mass relative to body weight at birth and involutes after sexual maturity; the rate of involution may vary among domestic species. the lymphoid and epithelial components are gradually replaced by loose connective tissue and fat, although remnants remain histologically, even in aged animals. each lobule is composed of a central medulla and a surrounding cortex (fig. 13-36). the thymic cortex consists mainly of an epithelial reticulum and lymphocytes (fig. 13-37). the stellate cells of the epithelial reticulum have elongate branching cytoplasmic processes that connect to adjacent epithelial cells through desmosomes, thus forming a supportive network (cytoreticulum). the lymphoid component is composed of differentiating lymphocytes derived from progenitor (also known as precursor) t lymphocytes in the bone marrow. the medulla is composed of similar epithelial reticular cells, many of which are much larger than those in the cortex and have a more obvious epithelial structure. some of the epithelial reticular cells form thymic corpuscles, also called hassall's corpuscles, which are distinctive keratinized epithelial structures (see fig. 13-37).
interdigitating dendritic cells (dcs) are also present within the medulla, but there are far fewer lymphocytes than in the cortex. the progenitor t lymphocytes released from the bone marrow into the blood enter the thymus in the subcapsular zone of the cortex and begin the differentiation and selection processes, developing into mature naïve t lymphocytes as they traverse the thymic cortex to the medulla. in the cortex, t lymphocytes undergo selection based on their recognition of major histocompatibility complex (mhc) molecules and self-antigens (positive and negative selection). the responses of the thymus to injury and their causes are listed in boxes 13-4 and 13-5. the most common change is lymphoid atrophy caused by physical and physiologic stresses, toxins, drugs, and viral infections. atrophy. because the thymus does not contain any lymphopoietic tissue, it depends on the bone marrow for the supply of progenitor t lymphocytes. thus thymic lymphoid atrophy can be the result of either an inadequate supply of lymphocytes from the bone marrow or lysis of lymphocytes (lymphocytolysis) in the thymus. the following discussion aims to define the terms used in this chapter. the term splenic sinusoid is used to describe a vascular structure present in the sinusal spleen (also known as sinusoidal spleen); dogs are the only domestic animal with true splenic sinusoids. the term red pulp vascular spaces is used (as opposed to "sinus") to describe the vascular spaces in the red pulp of both the sinusal and nonsinusal spleens of all domestic animals. the other terms used here include marginal sinus, marginal zone, periarteriolar lymphoid sheath (pals), periarteriolar macrophage sheath (pams), and splenic lymphoid follicles. the spleen is located in the left cranial hypogastric region of the abdomen, where it is typically suspended in the gastrosplenic ligament between the diaphragm, stomach, and the body wall. the exception is in domestic ruminants, where it is closely adhered to the left dorsolateral aspect of the rumen.
the gross shape and size of the spleen vary markedly among domestic animals, but generally it is a flattened, elongated organ. some species, notably birds, demonstrate seasonal variation in splenic shape and size. the spleen is covered by a thick capsule composed of smooth muscle and elastic fibers, from which numerous intertwining fibromuscular trabeculae extend into the parenchyma. these trabeculae and reticular cells form a spongelike supportive matrix for the parenchyma of the mammalian spleen in all domestic species. in cattle and horses the three muscular layers of the capsule lie perpendicular to each other, forming a capsule thicker than that of carnivores. carnivores, small ruminants, and pigs have interwoven smooth muscle within the splenic capsule, and pigs also have abundant elastic fibers within the capsule. the spleen differs from many other organs in the organization of its parenchyma. instead of a cortex and medulla, the spleen is divided into two distinct structural and functional components: the red pulp and white pulp (fig. 13-38) . with hematoxylin and eosin (h&e) staining, red pulp appears red-pink because of the abundance of red blood cells, whereas white pulp appears blue-purple because of the heavy concentration of lymphocytes. the white pulp consists of splenic follicles, populated by b lymphocytes; the pals, inhabited by t lymphocytes; and the marginal zone at the periphery of follicles. macrophages, antigen-presenting cells, and trafficking b and t lymphocytes populate the marginal zone. the radial arteries, branches of the central artery (also known as central arteriole), and capillaries from both red and white pulp drain into the marginal sinus of the marginal zone, although the latter has not been shown to be the case in all species to the same degree (e.g., the cat has a small marginal sinus but a well-developed pams) (figs. 13-39 and 13-40). 
the red pulp consists of cells of the monocyte-macrophage system, pams, sinusoids (dogs, rats, and human beings only), red pulp vascular spaces, and associated stromal elements such as reticular cells, fibroblasts, and trabecular myocytes. the labyrinth of the splenic red pulp vascular spaces serves as both a functional and physical filter for circulating blood cells. the blood circulation of the spleen is particularly suited to enable its functions, namely, filtering and clearing the blood of particulate matter and senescent cells. thymic atrophy must be differentiated from involution, which normally begins at sexual maturity. this distinction is difficult to make, unless the change is extreme or age-matched control animals are available for comparison. inflammation. inflammation of the thymus is rare. neutrophils and macrophages are often present within keratinized hassall's corpuscles during involution and should not be mistaken for a true thymitis. thymitis has been reported in salmon poisoning disease of dogs (see chapter 7), epizootic bovine abortion (see chapter 18), and in pigs infected with porcine circovirus type 2 (pcv2). necrosis and secondary infiltrates of neutrophils and macrophages may be seen in other infectious diseases (e.g., equine herpesvirus 1 [ehv-1]). the main portal of entry to the thymus is hematogenous. portals of entry used by microorganisms and other agents and substances to access the lymphatic system are summarized in box 13-6. these portals include the blood vessels (hematogenous spread by microorganisms free in the plasma or within circulating leukocytes or erythrocytes), afferent lymphatic vessels (lymphatic spread), direct penetration, or through m (for "microfold") cells and dcs in malt. defense mechanisms used by the thymus to protect itself against microorganisms and other agents are the innate and adaptive immune responses, discussed in chapters 3, 4, and 5.
viruses, bacteria, and particles arriving in the lymph and blood interact with cells of the monocyte-macrophage system through phagocytosis and antigen processing and presentation. hyperplasia of the macrophages often occurs concurrently. antigen processing and presentation are followed by an immune response resulting in proliferation of b lymphocytes, plasma cells, and the subsequent production of antibody; proliferation of t lymphocytes may also occur. the relationships between anatomic structures and the different functions of the spleen are complicated. there are also anatomic differences among domestic animal species and confusion about the correct and up-to-date terminology. the following brief discussion addresses this terminology. there are numerous synonyms and misuse of terms within the literature, which have contributed to the confusion over terminology for red pulp vascular spaces. these terms include reticular space, red pulp, sinuses, red pulp sinuses, sinus spaces, pulp spaces, mesh space of the spleen, reticular cell-lined meshwork, interstices of the reticulum network, blood-filled reticular meshwork of the red pulp, chordal spaces, splenic cords, and cords of billroth. the latter two terms are defined as the red pulp between the sinusoids, which most domestic animals do not have (except the dog). therefore the term red pulp vascular spaces is more appropriate. as a result of this pattern of blood flow, macrophages in the marginal sinus have the first opportunity to phagocytize antigens, bacteria, particles, and other material before macrophages in the sinusoids (in the dog) or in the pams and red pulp vascular spaces (all other domestic animals). in the dog the marginal sinus drains into the sinusoids, but in other domestic animals it drains into the red pulp vascular spaces. the central arteries leave the white pulp, enter the red pulp, and branch into smaller penicillar arterioles.
each arteriole is surrounded by a sheath of macrophages known as periarteriolar macrophage sheaths (pams, previously known as ellipsoids), which are notably prominent in pigs, dogs, and cats. in horses, cattle, pigs, and cats the terminal branches of the penicillar arterioles empty into the red pulp vascular spaces lined by reticular cells. because the red pulp vascular spaces are not lined by endothelium, this type of circulation is known as an open system. this system is in contrast to the sinusoidal spleen of the dog (also of the rat and human beings), where the branches of the central artery of the white pulp and vessels from the marginal sinus enter into the sinusoids, which are lined by a discontinuous endothelium, and these empty into splenic venules. this type of circulation is known as a closed system because the blood flow is through blood vessels (arterioles, capillaries, sinusoids, and venules), all of which are lined by endothelium. although circulation in the red pulp is anatomically open in nonsinusoidal spleens, under certain conditions (e.g., during splenic contraction) the circulation is functionally closed, and the blood in the red pulp is diverted into channels lined by reticular cells. the functions of the spleen include (1) filtering and clearing the blood of particulate matter and senescent cells; (2) transporting recirculating lymphocytes and naïve b and t lymphocytes to the follicle and pals, respectively, to fulfill their specific immune functions; and (3) storing blood in some domestic animal species (dog, cat, and horse) (fig. 13-41). phagocytosis is particularly effective in the spleen because blood flows through areas within the red pulp that are populated with increased concentrations of macrophages, namely, within the marginal sinuses, in cuffs around the penicillar arteries (pams), diffusely on the reticular walls of the red pulp vascular spaces, and along the sinusoids in dogs. trafficking of naïve and recirculating lymphocytes is facilitated by the proximity of the marginal sinus to the follicular germinal centers and pals.
maps of the vascular blood flow in sinusoidal and nonsinusoidal spleens are illustrated in figures 13-41 to 13-43. the celiac artery is the major branch of the abdominal aorta from which the splenic artery arises. the splenic artery enters the splenic capsule at the hilus, where it branches and enters the fibromuscular trabeculae as trabecular arteries to supply the splenic parenchyma. trabecular arteries become the central arteries of the white pulp and are surrounded by cuffs of t lymphocytes forming the pals. the splenic follicles, populated by b lymphocytes, are eccentrically embedded within or just adjacent to the pals. the central arteries send branches-the radial arteries-to supply the marginal sinus surrounding the splenic follicles. thus the cells at the circumferences of the follicles are brought into intimate contact with blood-borne antigens and trafficking b and t lymphocytes in the marginal sinus. when the circulation is functionally closed (e.g., during splenic contraction), blood in the red pulp is diverted into "channels" lined by reticular cells. because the dog has both sinusoids and red pulp vascular spaces, it has both open and closed splenic circulations, which may allow for both fast and slow flows of blood depending on the physiologic need of the animal. blood flowing through the sinusoids or red pulp vascular spaces is under the surveillance of macrophages. in dogs the pseudopodia of these perisinusoidal macrophages project into the sinusoidal lumen through the spaces in the discontinuous endothelium. in all domestic animals, blood in the red pulp vascular spaces is under surveillance of macrophages attached to the reticular walls. blood from the red pulp vascular spaces and sinusoids then drains into the splenic venules, splenic veins, and ultimately into the portal vein, which empties into the liver. the spleen filters blood and removes foreign particles, bacteria, and erythrocytes that are senescent, have structural membrane abnormalities, or are infected with hemotropic parasites.
as a secondary lymphoid organ, its immunologic functions include the activation of macrophages to process and present antigen, the proliferation of b lymphocytes and production of antibody and biologic molecules, and the interaction of t lymphocytes and antigens. in some species the spleen stores significant quantities of blood (box 13-7). the functions of the spleen are best considered on the basis of the two main components of the spleen: the red and white pulp and the anatomic systems contained within them (monocyte-macrophage system, red pulp vascular spaces, and hematopoiesis in the red pulp, and the b and t lymphocyte systems within the white pulp). monocyte-macrophage system. within the red pulp, macrophages are located in the marginal sinus, pams, and attached to the reticular walls of the red pulp vascular spaces. in the dog, macrophages are also located perisinusoidally. the supportive reticular network of the red pulp vascular spaces is composed of a fine meshwork of reticular fibers made of type iii collagen, on which macrophages are dispersed. exactly in which of these concentrations of macrophages phagocytosis of blood-borne particles takes place depends upon (1) the sequence in which they are exposed to the incoming blood, (2) the concentration of macrophages in these areas (e.g., the cat marginal sinus is small and thus not a major site of clearance; there is a compensatory increase in pams for phagocytosis), and (3) the functions of the macrophages. some of the macrophages in the marginal sinus and marginal zone are responsible for phagocytosis of particulate matter and others for the trapping and ingestion of antigens and antigen-antibody complexes. macrophages responsible for phagocytosis of blood-borne foreign material (fig. 13-44) , bacteria, and senescent and/or damaged erythrocytes (e.g., as seen in immune-mediated anemias and infections with hemotropic parasites) are also found in the red pulp. 
in the dog, sinusoidal macrophages remove entire erythrocytes (erythrophagocytosis), as well as portions of an erythrocyte's membrane and cytoplasmic inclusions, such as nuclear remnants (howell-jolly bodies) and heinz bodies, by a process called pitting. as such, the presence of large numbers of nuclear remnants in erythrocytes in canine blood smears may indicate malfunction of the sinusoidal system. the normal rate of removal of senescent erythrocytes from the circulating blood does not cause an increase in size of the spleen; however, splenomegaly can be observed when large numbers of defective erythrocytes must be removed, as in cases of severe acute hemolytic anemia. nonsinusoidal spleens lack the fenestrated endothelium and perisinusoidal macrophages of canine sinusoids that allow for slow processing of red blood cells to determine which are to be returned to the circulation, pitted, or phagocytized. the equine, canine, and feline spleens all have considerable storage and contractile capacity because of their muscular capsule, increased numbers of trabeculae, and the relatively small amount of splenic parenchyma devoted to white pulp. the storage capacity in dogs and horses is remarkable: it has been claimed that the canine spleen can store one-third of the dog's erythrocytes while the animal sleeps and the equine spleen holds one-half of the animal's circulating red cell mass (which is considered advantageous because it reduces the viscosity of the circulating blood). storage spleens expand and contract quickly under the influence of the autonomic nervous system, via sympathetic and vagal fibers in the trabeculae and reticular walls of the red pulp vascular spaces, and in response to circulatory disruptions such as hypovolemic and/or cardiogenic shock. thus storage spleens may be either grossly enlarged and congested or small with a wrinkled surface and a dry parenchyma depending on whether the spleen is congested from stored blood or shrunken from contraction (see uniform splenomegaly and small spleens). hematopoietic tissue.
in the developing fetus the liver is the primary site of hematopoiesis, with the spleen making a minor contribution. shortly before or after birth, hematopoiesis ceases in the liver and spleen, and the bone marrow becomes the primary hematopoietic organ. under certain conditions, such as severe demand due to prolonged anemia, splenic hematopoiesis can be reactivated; this outcome is called extramedullary hematopoiesis (emh). studies have indicated that splenic emh in dogs and cats most commonly occurs with degenerative or inflammatory conditions. in nonsinusoidal spleens, the macrophages of the red pulp instead perform these functions (determining which red blood cells are returned to the circulation, pitted, or phagocytized), and phagocytized cells remain in the red pulp vascular spaces. the location of the primary sites of pitting in nonsinusoidal spleens is unclear, but it is likely that most erythrophagocytosis takes place in the red pulp vascular spaces. the cat's spleen is deficient in pitting, and removal of heinz bodies is slow; however, some erythrophagocytosis does occur in the marginal sinus. the macrophages of the sinusoids, marginal sinus, and red pulp vascular spaces are of bone marrow origin. from the bone marrow these cells circulate in the blood as monocytes and migrate into the spleen. some macrophages are replenished by local proliferation. for example, after phagocytizing large amounts of material from the blood, the macrophages of the pams migrate through the wall of the cuff into the adjacent red pulp, denuding the pams of macrophages. after 24 hours, local residual macrophages have proliferated to repopulate the pams. the fixed macrophages elsewhere in the body, namely, those in connective tissue, lymph nodes (sinus histiocytes), liver (kupffer cells), lung (pulmonary intravascular macrophages and pulmonary alveolar macrophages), and brain (resident and perivascular microglial cells), are also derived from bone marrow (see chapters 5, 8, 9, and 14). storage or defense spleens.
spleens are also classified as either storage or defense spleens, based on whether or not they can store significant volumes of blood. the ability to store blood in the spleen depends on the fibromuscular composition of the splenic capsule and trabeculae. splenic capsules and trabeculae with a low percentage of smooth muscle and elastic fibers cannot expand and contract and are designated as defense spleens. these are found in rabbits and human beings. the spleens of the domestic animal species have capsules and trabeculae with more smooth muscle and can store blood to varying degrees; distention of the red pulp from stored blood separates the foci of white pulp (pals and lymphoid follicles), making the white pulp appear sparser. splenic white pulp is organized around central arteries in the form of pals, which are populated primarily by t lymphocytes. primary splenic follicles are located eccentrically in pals and are primarily composed of b lymphocytes. when exposed to antigen, the splenic lymphoid follicles develop germinal centers (see lymphoid/lymphatic system, lymph nodes, function). macrophages in the white pulp follicles remove apoptotic b lymphocytes not selected for expansion because of low binding affinity for antigen. failure of these macrophages to phagocytize has been experimentally correlated with decreased production of growth factors like tgf-β and increased production of inflammatory cytokines that predispose the animal to autoimmune conditions. the marginal zone surrounds the marginal sinus at the interface of the white and red pulp and consists of macrophages, dcs, and t and b lymphocytes. the blood supply of the marginal sinus is from the radial arteries, branches of the central artery. splenic emh may also accompany conditions such as hematomas and thrombosis and may occur without concomitant hematologic disease (see uniform splenomegaly with a firm consistency). it is also found in splenic nodular hyperplasia (see splenic nodules with a firm consistency). in some species, such as the mouse, emh is a normal function of the adult spleen and not necessarily a response to disease or hypoxic challenge.
the splenic red pulp also contains large numbers of monocytes, which function as a reserve for generating tissue macrophages in response to ongoing tissue inflammation in the body. white pulp. white pulp consists of pals, each with a splenic lymphoid follicle surrounded by a marginal zone. normally these foci of white pulp are so small that they may not be visible on gross examination of a cross section of the spleen. however, if nodules are enlarged either by lymphoid hyperplasia, amyloid deposits, or a neoplastic process (e.g., lymphoma), they can become grossly visible on the cut surface, initially as 0.5-to 1.0-mm white circular foci scattered through the red pulp. the splenic artery enters at the hilus and divides into arteries, which enter the trabeculae. when a trabecular artery emerges from a trabecula it becomes the central artery and is encased in a periarteriolar lymphoid sheath (pals), which is composed of t lymphocytes. it then enters the splenic follicle and gives off branches-the radial arteries, which supply the marginal sinus and marginal zone. the central artery emerges from the splenic follicle to enter the red pulp and branches into the penicillary arterioles, which are enclosed in a cuff of macrophages-the periarteriolar macrophage sheath (pams). the emerging penicillar arteries branch into arterioles and capillaries that supply the red pulp vascular spaces (see fig. 13-43). the red pulp vascular spaces also receive blood from capillaries draining from the marginal sinus and drain into the splenic venules and then into the trabecular veins and splenic vein. in the sinusoidal spleen of the dog, the blood flow is essentially the same but with the additional feature that arterioles from the marginal sinus drain into the sinusoids and some blood from the red pulp vascular space passes through slits in the sinusoidal wall to enter the sinusoid (see fig. 13-42). this is the site of pitting and erythrophagocytosis.
note that in the nonsinusoidal spleen the major blood flow passes sequentially through concentrations of macrophages in the marginal sinus, the pams, and the red pulp vascular spaces; in the sinusoidal spleen there is the additional route from the marginal zone into the sinusoids. macrophages in the marginal zone are phenotypically distinct from those in the red pulp. the red pulp macrophages function primarily to filter the blood by phagocytizing particles and by removing senescent or infected erythrocytes and pathogenic bacteria and fungi. marginal zone macrophages are divided into two types based on their location and the type of cell surface receptors they possess. the first group is positioned toward the periphery of the marginal zone, whereas the second group, the marginal metallophilic macrophages (so called for their positive silver staining), is at the inner margin of the marginal zone, closer to the splenic follicle and pals. it has been difficult to generate mammalian models that eliminate only one of the two classes of marginal zone macrophages, so the degree to which each group specializes in a particular function is not clear. some marginal zone macrophages actively phagocytize particulate matter or bacteria (e.g., in septicemias caused by streptococcus pneumoniae, listeria monocytogenes, campylobacter jejuni, or bacillus anthracis) in the blood (see fig. 13-40). they also play a similar role in limiting the spread of viral infections. other marginal zone macrophages phagocytize and process antigens. thus macrophages of the marginal zone serve to bridge the innate and adaptive immune responses by secreting inflammatory cytokines to activate other immune cells and by providing receptor-based activation of marginal zone lymphocytes. studies have shown that a loss of marginal zone macrophages coincides with decreased antigen trapping by resident b lymphocytes of the marginal zone and consequently a decreased early igm response to antigens.
the responses of the spleen to injury (box 13-8) include acute inflammation, hyperplasia of the monocyte-macrophage system, hyperplasia of lymphoid tissues, atrophy of lymphoid tissues, storage of blood or contraction to expel reserve blood, and neoplasia. these responses are best considered on the basis of the two main components of the spleen, the red and white pulp, and the anatomic systems associated with each. monocyte-macrophage system. the distribution and function of macrophages in the spleen are described earlier in the section on structure and function. these interactions are complex, and their relationships to both innate and adaptive immunity are areas of intense study (see also chapter 5). to facilitate filtering, all of the blood in the body passes through the spleen at least once a day, and 5% of the cardiac output goes to the spleen. in dogs, blood flow and transit time depend on whether the spleen is contracted or distended; blood flow is slower in the distended spleen. the extent to which macrophages of the monocyte-macrophage system phagocytize particles depends to a large degree on the sequence in which they receive blood. in most species, macrophages of the marginal sinus are the first to receive blood, and consequently phagocytized particles and bacteria tend to be more concentrated here initially. however, there are differences among domestic animal species; the cat, for instance, has a comparatively small marginal sinus, and thus the pams play a larger role in phagocytosis. the spleen is able to mount a strong response to blood-borne pathogens, as has been demonstrated in several studies. in immunized rabbits injected intravenously with pneumococci, 98% of the bacteria were cleared from the blood within 15 minutes and 100% within an hour. in dogs injected with 1 billion pneumococci per pound of body weight into the splenic artery, the blood was cleared of all bacteria in 65 minutes.
the marginal sinus receives blood from the radial branches of the central artery and serves as the portal of entry into the spleen for recirculating b and t lymphocytes. from here, t lymphocytes migrate to the pals and b lymphocytes to the germinal centers. macrophages in the marginal zone capture blood-borne antigens, process them, and present them to the lymphocytes. splenic functions and responses summarized in this section include:
- phagocytosis of senescent erythrocytes, damaged erythrocytes (e.g., immune-mediated anemias), and parasitized erythrocytes (e.g., hemotropic parasites)
- storage of blood (in storage spleens)
- extramedullary hematopoiesis, whether in response to severe demand (e.g., anemias), in degenerative/inflammatory conditions without concomitant hematologic disease, or incidental (e.g., within nodules of hyperplasia)
- a reserve of monocytes within the splenic cords for generating tissue macrophages in response to inflammation
- homing of circulating lymphocytes in the blood
- phagocytosis and processing of antigen
- macrophage activation
- inflammation, which may be diffuse or multifocal/focal (e.g., blastomycosis and tuberculosis, respectively)
red pulp vascular spaces. the main response to injury of the red pulp vascular spaces is congestion (see uniform splenomegaly with a bloody consistency), as well as the storage of blood or contraction to expel reserve blood. white pulp. the responses to injury within the white pulp are most pronounced in the splenic lymphoid follicles. lymphoid follicular hyperplasia is a response to antigenic stimuli and results in the formation of secondary follicles; marked hyperplasia may be grossly evident. hyperplasia of splenic lymphoid follicles follows a similar sequence of events and morphologic changes as seen in other secondary lymphoid organs and is discussed in more detail in lymphoid/lymphatic system, lymph nodes, dysfunction/responses to injury. similarly, atrophy of splenic lymphoid follicles has causes similar to those of lymphoid atrophy in other lymphoid organs (see box 13-5).
briefly, atrophy occurs in response to a lack of antigenic stimulation (e.g., regression after antigenic stimulation has ceased), to the effects of toxins, antineoplastic chemotherapeutic agents, microorganisms, radiation, malnutrition, wasting/cachectic diseases, or aging, or when the bone marrow and thymus fail to supply adequate numbers of b and t lymphocytes, respectively. the follicles are depleted of lymphocytes, and with time, germinal centers and follicles disappear. the total amount of lymphoid tissue is reduced, and the spleen may be smaller. the response to injury of the monocyte-macrophage system in the marginal sinus and marginal zone is also phagocytosis and proliferation. capsule and trabeculae. lesions in the capsule and trabeculae are uncommon and include splenic capsulitis secondary to peritonitis and complete or partial rupture of the splenic capsule, usually due to trauma. the two main portals of entry to the spleen for infectious agents are hematogenous spread and direct penetration. the splenic capsule is thick, and thus direct penetration is less common; inflammation from an adjacent peritonitis is unlikely to penetrate the capsule into the splenic parenchyma. cattle with traumatic reticulitis may have foreign objects migrate into the ventral extremity of the spleen, causing a splenic abscess. splenic abscesses also develop secondary to perforation of the gastric wall in horses, due to foreign body penetration, gastric ulcers, or gastric inflammation. portals of entry used by microorganisms and other agents and substances to access the lymphoid/lymphatic system are summarized in box 13-6. defense mechanisms used by the spleen to protect itself against microorganisms and other agents are the innate and adaptive immune responses, discussed in chapters 3, 4, and 5. other defense mechanisms are structural in nature, protecting against external trauma, and include the thick fibrous capsule of the spleen.
lymph nodes are soft, pale tan, round, oval, or reniform organs with a complex three-dimensional structure. on gross examination of a cross section of a lymph node, two main areas are visible: an outer rim of cortex and an inner medulla (fig. 13-45). to understand the pathologic response of the lymph node, it is important to consider its anatomic components and their relationship with antigen processing (fig. 13-46).
after splenectomy, blood-borne organisms multiply rapidly and may disseminate widely in the body to cause an overwhelming postsplenectomy infection. studies have also shown that the phagocytic function of the spleen is critical in the control of plasmodium (the causative agent of malaria) in human beings and of babesiosis in cattle. if the number of pathogenic bacteria in the circulation exceeds the capacity of the splenic macrophages, as in cases of severe septicemia, the result may be acute splenic congestion (see uniform splenomegaly with a bloody consistency). this may be followed by inflammation with areas of necrosis, fibrin deposition, and infiltration by neutrophils in bacteremias of pyogenic bacteria. the marginal zone can be the initial site of response to blood-borne antigens and bacteria delivered by the radial branches of the central arteries to the marginal sinus. similar to the response of the red pulp vascular spaces, the marginal zone can become congested and with time (only hours with highly pathogenic organisms) may contain aggregates of neutrophils and macrophages. histologically, the congestion and inflammation form a complete or partial concentric ring around the circumference of the splenic nodule (see anthrax). hyperplasia of the red pulp macrophages is also seen in chronic hemolytic diseases, because there is a prolonged need for phagocytosis of erythrocytes.
similarly, chronic splenic congestion, usually the result of portal or splenic vein hypertension, can lead to proliferation of the macrophages present on the walls of the red pulp vascular spaces and results in thickening of the reticular walls between the red pulp vascular spaces. macrophages in the red pulp also proliferate in response to fungi and facultative intracellular pathogens (e.g., mycobacterium bovis) arriving hematogenously in the spleen; the number of red pulp macrophages may be augmented by monocytes recruited from the blood to form granulomatous inflammation. the inflammatory responses may take the form of acute inflammation with fibrin and necrosis, abscesses or microabscesses, or granulomatous inflammation (diffuse, multifocal, or focal).
the lymph node is supported by a meshwork of fibroblastic reticular cells and fibers. besides providing structural support, this reticulum helps form a substratum for the migration of lymphocytes and antigen-presenting cells to the follicles and facilitates their interaction with b and t lymphocytes. cortex. the outer/superficial cortex contains the lymphoid follicles (also referred to as lymphoid nodules). the follicles are designated as primary if they consist mainly of small lymphocytes: mature naïve b lymphocytes expressing receptors for specific antigens exit the bone marrow and circulate through the bloodstream, lymphatic vessels, and secondary lymphoid tissues. on their arrival at lymph nodes, b lymphocytes exit through hevs in the paracortex and home to a primary follicle (which also contains follicular dcs in addition to the resting b lymphocytes). lymphoid follicles with germinal centers are designated as secondary follicles: b lymphocytes that recognize the antigen for which they express receptors are activated and proliferate to form the secondary lymphoid follicles, characterized by prominent germinal centers.
germinal centers are areas with a specialized microenvironment that supports the proliferation and further development of b lymphocytes to increase their antigen-binding and functional capacity (see lymphoid/lymphatic system, lymph nodes, function). the anatomic components of the lymph node are:
• stroma: capsule, trabeculae, and reticulum
• cortex: "superficial" or "outer" cortex (lymphoid follicles, b lymphocytes)
• paracortex: "deep" or "inner" cortex (t lymphocytes)
• medulla: medullary sinuses and medullary cords
• blood vessels: arteries, arterioles, high endothelial venules (hevs), efferent veins
• lymphatic vessels: afferent and efferent lymphatic vessels; lymphatic sinuses (subcapsular, trabecular, and medullary)
• monocyte-macrophage system: sinus histiocytes
stroma. the lymph node is enclosed by a fibrous capsule penetrated by multiple afferent lymphatic vessels, which empty into the subcapsular sinus. at the hilus, efferent lymphatic vessels and veins exit, and arteries enter the node. fibrous trabeculae extend from the capsule into the parenchyma to provide support to the node and to house vessels and nerves. the lymph node is also supported by a meshwork of fibroblastic reticular cells and fibers. medulla. the medulla is composed of medullary cords and medullary sinuses. the medullary cords contain macrophages, lymphocytes, and plasma cells; in a stimulated node the cords become filled with antibody-secreting plasma cells. the medullary sinuses are lined by fibroblastic reticular cells and contain macrophages ("sinus histiocytes"), which cling to reticular fibers crossing the lumen of the sinus. these macrophages phagocytize foreign material, cellular debris, and bacteria from the incoming lymph. vasculature: blood vessels, lymphatic vessels, and lymphatic sinuses. the blood vessels of the lymph node include arteries, arterioles, veins, and postcapillary venules (hevs) lined by specialized cuboidal endothelium (see figs. 13-45 and 13-46).
the mantle cell zone surrounds the germinal center and consists of small inactive mature naïve b lymphocytes and a smaller population of t lymphocytes (approximately 10%). paracortex. the diffuse lymphoid tissue of the paracortex (also referred to as the deep or inner cortex) consists mainly of t lymphocytes, as well as macrophages and dcs. this region contains the hevs through which b and t lymphocytes migrate from the blood into the lymphoid follicles and paracortex, respectively. t and b lymphocytes may also enter the lymph node via the lymphatic vessels. it is helpful to consider the paths taken by particles, molecules, antigens, and cells arriving at a lymph node. the following account describes the journey of an antigen as it enters a lymph node to trigger an immune response. antigen in the lymph arriving in the afferent lymphatic vessels empties into the subcapsular sinus. hydrostatic pressure here is low, and reticular fibers crossing the sinus impede flow; thus particles tend to settle, which facilitates phagocytosis by the sinus macrophages. lymph then flows down the trabecular sinuses that line the outer surface of the fibrous trabeculae, to the medullary sinuses, and eventually exits via efferent vessels. as antigens within the lymph travel through the sinuses, they are captured and processed by macrophages and dcs. alternatively, dcs charged with antigen can migrate within blood vessels to the node and enter the paracortex via the hevs. circulating b lymphocytes also enter across the hevs, and if they encounter antigen-bearing dcs, there is a local reaction involving the appropriate t helper lymphocytes, b lymphocytes, and dcs. this results in the migration of the activated b lymphocytes to a primary follicle, where they initiate formation of a germinal center. germinal centers, upon migration of antigen-activated b lymphocytes, develop a characteristic architecture.
a distinct polarity, composed of a superficial or light zone and a deep or dark zone, is present in cases of antigenic stimulation. the light zone, oriented toward the source of antigen, consists mainly of small lymphocytes, called centrocytes, which have moderate amounts of pale eosinophilic cytoplasm. the cells of the dark zone, called centroblasts, are large, densely packed lymphocytes with scant cytoplasm, giving this area a darker appearance on h&e staining. the centroblasts undergo somatic mutation of the variable regions of the immunoglobulin gene, followed by isotype class switching (from igm to igg or iga). during this process most centroblasts undergo apoptosis, and their cell fragments are phagocytized by macrophages, which are then termed tingible (stainable) body macrophages. the cells that survive the affinity maturation process are now called centrocytes and, along with t lymphocytes and follicular dcs, populate the germinal center light zone. these post-germinal center b lymphocytes leave the follicle as plasma cell precursors (immunoblasts or plasmablasts) and migrate from the cortex to the medullary cords, where they mature and secrete antibody into the efferent lymph. some of these cells may colonize the region surrounding the mantle cell zone to form a marginal zone. marginal zones are apparent only in situations of prolonged and intense immune stimulation and serve as a reservoir of memory cells. the elliptical mantle cell cuff is wider over the light pole of the follicle, though in instances of strong antigenic stimulation, the cuffs can completely encircle the germinal center. responses to injury are listed in box 13-9 and are discussed on the basis of the following systems: sinus histiocytes of the monocyte-macrophage system, cortex, paracortex, and medulla (medullary sinuses and medullary cords). generally, enlarged lymph nodes can be distributed in several different patterns in the body.
first, all lymph nodes throughout the body may be enlarged (systemic or generalized lymphadenopathy or lymphadenomegaly). this pattern is usually attributed to systemic infectious, inflammatory, or neoplastic processes. if a single lymph node or regional chain of nodes is enlarged, then the area drained by that node should be checked for lesions (e.g., evaluate the oral cavity if the mandibular lymph nodes are enlarged); thus it is important to know the area drained by specific lymph nodes. mesenteric lymph nodes are normally larger because of ongoing follicular hyperplasia.
approximately 90% to 95% of lymphocytes enter lymph nodes through the hevs, which also play an important role in lymph fluid balance. the lymphatic vasculature consists of afferent lymphatic vessels, which pierce the capsule and drain into the subcapsular sinus. lymph continues to drain through the trabecular sinuses to the medullary sinuses and finally exits at the hilus via efferent lymphatic vessels. all lymph nodes receive afferent lymphatic vessels from specific areas of the body. the term lymphocenter is often used in veterinary anatomy to describe a lymph node or a group of lymph nodes that is consistently present at the same location and drains the same region in all species. for example, the popliteal lymph node, caudal to the stifle, drains the distal hind limb. the tracheobronchial nodes (bronchial lymphocenter), located at the tracheal bifurcation, collect lymph from the lungs and send it to the mediastinal nodes or directly to the thoracic duct. because lymph from a single afferent lymphatic vessel drains into a discrete region of a lymph node, only that region of the node may be affected by the contents of a single draining lymph vessel (e.g., antigen, infectious organisms, or metastatic neoplasms) (fig. 13-47). the lymph node of the pig has a different structure.
the afferent lymphatic vessels enter at the hilus instead of around the periphery of the node and empty lymph into the center of the node. the lymph drains to the "subcapsular" sinus (the equivalent of the medullary sinuses of other domestic animals) and then into several efferent lymphatic vessels, which pierce the outer capsule. this reversal of flow is the result of an inverted nodal architecture, with the cortex in the middle of the node surrounded by the medulla at the periphery. thus a pig lymph node that is draining an area of hemorrhage will have blood accumulate in the periphery (subcapsular region) instead of in the center of the node (which may be grossly visible). the functions of the lymph node are (1) to filter lymph of particulate matter and microorganisms, (2) to facilitate the surveillance and processing of incoming antigens via interactions with b and t lymphocytes, and (3) to produce b lymphocytes and plasma cells. material arriving in the lymph can be subdivided into free particles and larger molecules, small molecules and free antigens, and antigen within dcs.
an antigenically stimulated lymph node that is undergoing follicular hyperplasia is enlarged and has a taut capsule, and the cut surface may bulge. histologically, the follicles contain active germinal centers with antigenic polarity (light and dark zones) (figs. 13-49 and 13-50; also see fig. 13-46). depending on the duration of and continued exposure to the antigen, there may also be concomitant paracortical hyperplasia, medullary cord hyperplasia, and sinus histiocytosis, as in mesenteric nodes, which continuously receive and respond to barrages of antigens and bacteria from the intestinal tract. sinus histiocytes (macrophages) are part of the monocyte-macrophage system and the first line of defense against infectious and noninfectious agents in the incoming lymph. in response to these draining agents, there is hyperplasia of the macrophages ("sinus histiocytosis"), most notable in the medullary sinuses (fig. 13-48).
leukocytes, often monocytes, harboring intracellular pathogens (e.g., mycobacterium spp. or cell-associated viruses such as parvovirus) may arrive in the blood or lymph, infect the lymph node, and then disseminate throughout the lymphoid tissues of the body via the efferent lymph and circulating blood. cortex (lymphoid follicles). follicular hyperplasia of the cortex is discussed in the section lymphoid/lymphatic system, lymph node, function. the two main portals of entry to the lymph node for infectious agents and antigens are the afferent lymphatic vessels (lymphatic spread) and blood vessels (hematogenous spread). portals of entry used by microorganisms and other agents and substances to access the lymphoid/lymphatic system are summarized in box 13-6. infectious microorganisms, either free within the lymph or within lymphocytes or monocytes, are transported to regional lymph nodes through lymphatic vessels. agents may escape removal by phagocytosis in one lymph node and be transported via efferent lymphatic vessels to the next lymph node in the chain, causing an inflammatory or immunologic response there. this process can continue serially down a lymph node chain, and if the agent is not removed, it may eventually be transported via the lymphatic vessels to either the cervical or thoracic ducts and then disseminated throughout the body. although most pathogens are transported to lymph nodes via afferent lymphatic vessels, bacteria can be transported to lymph nodes hematogenously (free or within leukocytes such as monocytes) in septicemias and bacteremias. direct penetration of a lymph node is uncommon, because it is protected by a thick fibrous capsule. occasionally, inflammatory cells or neoplasms can extend directly into the nodal parenchyma from adjacent tissues.
defense mechanisms used by the lymphatic system to protect itself against microorganisms and other agents are the innate and adaptive immune responses, discussed in chapters 3, 4, and 5. other defense mechanisms are structural in nature to protect against external trauma and include the thick fibrous capsules of lymph nodes. hemal nodes are small, dark red to brown nodules found most commonly in ruminants, mainly sheep, and have also been reported in horses, primates, and some canids. their architecture resembles that of a lymph node with lymph follicles and sinuses, except that in the hemal node, sinuses are filled with blood (efig. 13-8) . because erythrophagocytosis can be present, it is presumed that hemal nodes can filter blood and remove senescent erythrocytes, but as their blood supply is small, their functional importance is not clear. malt is the initial site for mucosal immunity and is crucial in the protection of mucosal barriers. malt is composed of both diffuse lymphoid tissues and aggregated lymphoid (also known as lymphatic) nodules, which can be subcategorized based on their anatomic location: (1) bronchus-associated lymphoid tissue (balt), which is often at the bifurcation of the bronchi and bronchioles; (2) tonsils (pharyngeal and palatine) form a ring of lymphoid tissue at the oropharynx; (3) nasal-, larynx-, and auditory tube-associated lymphoid tissues (nalt, lalt, and atalt, respectively) within the nasopharyngeal area; (4) gut-associated lymphoid tissue (galt), which includes peyer's patches and diffuse lymphoid tissue in the gut wall; (5) conjunctiva-associated lymphoid tissue (calt); (6) other lymphoid nodules (e.g., genitourinary tract) (fig. 13-51) . diffuse lymphoid tissue consists of lymphocytes and dcs within the lamina propria of the mucosa of the alimentary, respiratory, and genitourinary tracts. 
these cells intercept and process antigens, which then travel to regional lymph nodes to initiate the immune response, leading ultimately to the secretion of iga, igg, and igm.
less florid follicular reactions have smaller, separated germinal centers, whereas nodes receiving persistently high levels of antigenic stimulation may have coalescing germinal centers (termed "atypical benign follicular hyperplasia"). in such cases of chronic strong antigenemia, the highly reactive nodes may also exhibit colonization of lymphocytes into the perinodal fat, and germinal centers may contain irregular lakes of eosinophilic material, known as follicular hyalinosis. as the immune response declines, there is follicular lymphoid depletion, and the concentration of lymphocytes in the germinal centers is reduced, allowing the underlying follicular stroma (including dcs and macrophages) to become visible. with ongoing lymphocyte depletion, the mantle cell zones become thinned, less populated, and discontinuous. eventually, residual mantle cells collapse into the follicular stroma, forming clusters of small dark cells within the bed of dcs and macrophages, referred to as fading follicles. paracortex. paracortical atrophy may result from a variety of causes, including deficient lymphocyte production in the bone marrow, reduced differential selection of lymphocytes in the thymus, or destruction of lymphocytes in the lymph node by viruses, radiation, or toxins acting directly on the lymphocytes (see box 13-5). examination of h&e-stained sections allows evaluation of follicular activity in the cortex and of the concentration of plasma cells in the medullary cords, which serve as a reasonable estimate of b lymphocyte activity for comparison. paracortical hyperplasia may have a nodular or diffuse appearance depending on which and how many afferent lymphatic vessels are draining antigen.
this reaction may precede or be concurrent with the germinal center reaction of follicular hyperplasia. proliferation of t lymphocytes has been reported in the paracortex (and pals of the spleen) in malignant catarrhal fever (mcf) in cattle and in pigs with porcine reproductive and respiratory syndrome. pcv2 can cause a diffuse proliferation of macrophages within the paracortex. responses to injury by the medullary sinuses are dilation of the sinuses and proliferation of histiocytes ("sinus histiocytosis"). sinus macrophages proliferate in response to a wide variety of particulate matter in the lymph, including bacteria and erythrocytes (erythrophagocytosis) draining from a hemorrhagic area (see lymphoid/lymphatic system, disorders of domestic animals: lymph nodes, pigmentation of lymph nodes). dilation of the sinuses due to edema occurs with many underlying conditions, including chronic cardiac failure or drainage from an acutely inflamed area. as the inflammation progresses, the sinuses become filled with neutrophils, macrophages, and occasionally fibrin, in addition to the hyperplastic resident sinus histiocytes (see fig. 13-48) . depending on the intensity of the inflammation, the adjacent parenchyma may become affected (see lymphoid/lymphatic system, disorders of domestic animals: lymph nodes, enlarged lymph nodes [lymphadenomegaly], acute lymphadenitis). as pointed out in the section on lymph nodes, function, after activation and proliferation of b lymphocytes in the follicle, the immunoblasts formed there move to and mature in the medullary cords, which as a result are distended with plasma cells that secrete antibody into the efferent lymphatic vessels ("medullary plasmacytosis"). the concentration of medullary plasma cells correlates with the activity of the germinal centers. as the immune response subsides, the number of plasma cells decreases and the medullary cords return to their resting state populated by few lymphocytes and scattered plasma cells. 
chapter 13 bone marrow, blood cells, and the lymphoid/lymphatic system
for instance, m cells increase in animals transferred from pathogen-free housing to a normal environment. m cells may also be exploited as a portal of entry by some microbes (see lymphoid/lymphatic system, portals of entry/pathways of spread). table 13-5 lists the interactions of the malt with different microorganisms. the responses of malt to injury are similar to those of other lymphoid tissues: hyperplasia, atrophy, and inflammation (box 13-10). hyperplasia. hyperplasia of lymphoid nodules is a response to antigenic stimulation and consists of activation of germinal centers with subsequent production of plasma cells (see fig. 13-51, b). lymphoid nodule hyperplasia is often present in chronic disease conditions, such as balt hyperplasia in chronic bronchitis or bronchiolitis associated with dictyocaulus spp. (horses, cattle, sheep, and goats) or metastrongylus spp. (pigs). mycoplasma spp. pneumonias of sheep and pigs display marked balt hyperplasia that can encircle bronchioles and bronchi ("cuffing pneumonia"). hyperplastic lymphoid nodules can become so enlarged that they are grossly visible as discrete white plaques or nodules (see fig. 13-51, a). they can be seen in the conjunctiva of the eyelids and the third eyelid in chronic conjunctivitis, the pharyngeal mucosa in chronic pharyngitis, the gastric mucosa in chronic gastritis, and the urinary bladder in chronic cystitis (follicular cystitis). the normal fetus has no detectable balt, though it may be present in fetuses aborted due to infectious disease. atrophy. atrophy of the diffuse lymphoid tissue and lymphoid nodules has the same causes as atrophy affecting other lymphoid tissues (see box 13-5) and includes lack of antigenic stimulation, cachexia, malnutrition, aging, viral infections, or failure to be repopulated by b lymphocytes from the bone marrow or t lymphocytes from the thymus.
lymphocytolysis of germinal center lymphocytes of peyer's patches is a characteristic lesion in bvdv infection in ruminants and in canine and feline parvovirus infections ("punched-out" peyer's patches) (see chapters 4 and 7). the main portals of entry to malt for infectious agents are hematogenous spread and entry through migrating macrophages, dcs, and m cells. pathogenic bacteria such as escherichia coli, yersinia pestis, mycobacterium avium ssp. paratuberculosis (map), l. monocytogenes, salmonella spp., and shigella flexneri can invade the host from the lumen of the intestine through dendritic or m cells. some viruses (e.g., reovirus) may be transported by m cells. the scrapie prion protein (prp(sc)) may also accumulate in peyer's patches. many viruses, such as bovine coronavirus, bvdv, rinderpest virus, malignant catarrhal fever virus, feline panleukopenia virus, and canine parvovirus, cause lymphocyte depletion within the malt. portals of entry used by microorganisms and other agents and substances to access the lymphoid system are summarized in box 13-6. defense mechanisms used by malt to protect itself against microorganisms and other agents are the innate and adaptive immune responses, discussed in chapters 3, 4, and 5. congenital disorders of the thymus are discussed in detail in chapter 5. summaries of the gross and microscopic morphologic changes are described in the sections on disorders of horses and disorders of dogs.
solitary lymphoid nodules are localized concentrations of lymphocytes (mainly b lymphocytes) in the mucosa and consist of defined but unencapsulated clusters of small lymphocytes (primary lymphoid nodules). they are usually not grossly visible in the resting or antigenically unstimulated state, but upon antigenic stimulation they proliferate and form germinal centers and surrounding mantle cell zones (secondary lymphoid nodules). aggregated lymphoid nodules consist of groups of lymphoid nodules, the most notable of which are the tonsils and peyer's patches.
the aggregated lymphoid follicles of the peyer's patches are most obvious in the ileum. the latter are covered by a specialized epithelium, the follicle-associated epithelium (fae). the fae is the interface between the peyer's patches and the luminal microenvironment and consists of enterocytes and interdigitated m cells. m cells transport (via endocytosis, phagocytosis, pinocytosis, and micropinocytosis) antigens, particles, bacteria, and viruses from the intestinal lumen to the underlying area rich in dcs, which deliver the material to the lymphoid tissue of the peyer's patches. m cells also express iga receptors, which allows for the capture and transport of bacteria entrapped by iga. the proportion of enterocytes and m cells within the fae is modulated by the luminal bacterial composition. halogenated aromatic hydrocarbons cause dysfunction of dcs through several mechanisms that lead to atrophy of the primary and secondary lymphoid organs. heavy metals, such as lead, mercury, and nickel, are immunosuppressive and generally affect the levels of b and t lymphocytes, nk cells, and inflammatory cytokines. other metals, such as selenium, zinc, and vanadium, may be immunostimulatory at low doses. the immunotoxic mechanisms may differ and include chelation of molecules and effects on protein synthesis, cell membrane integrity, and nucleic acid replication. the toxic effects of mycotoxins such as fumonisins b1 and b2 (secondary fungal metabolites produced by members of the genus fusarium) and aflatoxin (produced by aspergillus flavus) include lymphocytolysis in the thymic cortex. thymic cysts can be found within the developing and mature thymus and in thymic remnants in the cranial mediastinum.
thymic cysts are often lined by ciliated epithelium and represent developmental remnants of branchial arch epithelium and are usually of no significance. thymitis is an uncommon lesion and may be seen in pcv2 infection (see disorders of pigs and also chapter 4), enzootic bovine abortion (see chapter 18), and salmon poisoning disease of dogs (see chapter 7). infectious agents more commonly cause thymic atrophy. variable degrees of acquired immunodeficiency can also be caused by toxins, chemotherapeutic agents and radiation, malnutrition, aging, and neoplasia. of infectious agents, viruses most commonly infect and injure lymphoid tissues and include the following: ehv-1 in aborted foals (fig. 13-52), classic swine fever virus, bvdv, canine distemper virus, canine and feline parvovirus, and fiv; severe thymic lymphoid depletion is an early lesion in fiv-infected kittens. environmental toxins, such as halogenated aromatic hydrocarbons (e.g., polychlorinated biphenyls and dibenzodioxins), lead, and mercury have a suppressive effect on the immune system. thymic hyperplasia. asymptomatic hyperplasia may occur in juvenile animals in association with immunizations and results in a symmetrical increase in the size of the thymus. autoimmune lymphoid hyperplasia of the thymus is characterized by germinal center formation and occurs with myasthenia gravis. asplenia, or the failure of a spleen to develop in utero, occurs rarely in animals, and the effect on the animal's immune status is uncertain. (splenic aplasia is present in certain strains of mice, but because these are usually maintained under either germ-free or specific pathogen-free [spf] conditions, the effect of asplenia cannot be evaluated.) congenital immunodeficiency diseases are described in detail in chapter 5, and in the sections on disorders of horses and disorders of dogs. gross examination of the spleen involves deciding whether the spleen is enlarged (splenomegaly), normal, or small (see e-appendix 13-2).
diffuse enlargement of the spleen may be due to congestion (termed bloody spleen) or to infiltrative disease (termed meaty spleen). the cut surface of congested spleens will exude blood, whereas meaty spleens are firmer and do not readily ooze blood. the diseases and disorders having splenomegaly are discussed using the following categories, which list the common causes of uniform splenomegaly (table 13-6):
• uniform splenomegaly with a bloody consistency (bloody spleen) (fig. 13-54, a)
• uniform splenomegaly with a firm consistency (meaty spleen) (see fig. 13-54, b)
• splenic nodules with a bloody consistency
• splenic nodules with a firm consistency
uniform splenomegaly with a bloody consistency-bloody spleen. the common causes of a bloody spleen are (1) congestion (due to gastric volvulus with splenic entrapment, splenic volvulus [all of which compress the splenic vein], and barbiturate euthanasia, anesthesia, or sedation), (2) acute hyperemia (due to septicemia), and (3) acute hemolytic anemia (due to an autoimmune disorder or an infection with a hemotropic parasite). chemotherapeutic drugs inhibit the cell cycle through various mechanisms, and thus all dividing cells, including lymphocytes, bone marrow cells, and enterocytes, are sensitive to their effects. as such, bone marrow suppression, immunosuppression, and gastrointestinal disturbances are common side effects of anticancer drugs. purine analogues (e.g., azathioprine) compete with purines in the synthesis of nucleic acids, whereas alkylating agents like cyclophosphamide cross-link dna and inhibit the replication and activation of lymphocytes. cyclosporin a specifically inhibits the t lymphocyte signaling pathway by interfering with the transcription of the il-2 gene. methotrexate, a folic acid antagonist, blocks the synthesis of thymidine and purine nucleotides. the immunosuppressive effects of some of these agents are desirable for the treatment of immune-mediated disease (e.g., immune-mediated hemolytic anemia) or to prevent allograft rejection after transplantation. corticosteroids may be given at an immunosuppressive dose, though the degree of suppression is highly variable among species.
local or palliative treatment of cancer may include radiotherapy (ionizing radiation) to target and damage the dna of the neoplastic cells. although some immunosuppression may be noted, particularly if bone marrow or lymphoid tissue is within the therapeutically irradiated field, mounting evidence suggests that radiotherapy can induce a cascade of proimmunogenic effects that engage the innate and adaptive immune systems to contribute to the destruction of tumor cells. malnutrition and cachexia, which may occur with cancer, lead to secondary immunosuppression through several complex metabolic and neurohormonal aberrations. thymic function may be impaired in young malnourished animals, resulting in a decrease in circulating t lymphocytes and subsequent depletion of t lymphocyte regions of secondary lymphoid organs. lymphoid atrophy may result from physiologic and emotional stress, which can cause the release of catecholamines and glucocorticoids. as part of the general effects of aging in cells (see chapter 1), all lymphoid organs decrease in size (atrophy) with advancing age. in the case of the thymus, this reduction in size occurs normally after sexual maturity and is more appropriately termed thymic involution. the term involution should be reserved for normal physiologic processes in which an organ either returns to normal size after a period of enlargement (e.g., postpartum uterus) or regresses to a more primitive state (e.g., thymic involution). because the thymus has both lymphoid and epithelial components, neoplasms may arise from either component. thymic lymphoma arises from the t lymphocytes in the thymus (and very rarely b lymphocytes). it is most often seen in young cats and cattle and less frequently in dogs (fig. 13-53) (see hematopoietic neoplasia). thymomas arise from the epithelial component and are usually benign neoplasms that occupy the cranial mediastinum of older animals.
histologically, these neoplasms consist of clustered or individualized neoplastic epithelial cells, often outnumbered by nonneoplastic small lymphocytes ("lymphocyte-rich thymoma"). thymomas are common in goats and often contain large cystic structures. immune-mediated diseases, including myasthenia gravis and immune-mediated polymyositis, occur with thymomas in dogs and also, rarely, in cats. myasthenia gravis is caused by autoantibodies directed toward the acetylcholine receptors, which lead to destruction of postsynaptic membranes and reduction of acetylcholine receptors at the neuromuscular junction. megaesophagus and aspiration pneumonia are common sequelae to this condition. grossly, the spleen is extremely enlarged (fig. 13-55), and the cut surface bulges and oozes copious blood. because of the splenic distention, the splenic capsule can be fragile and easily ruptured. histologically, the red pulp is distended by erythrocytes, and the lymphoid tissues of the white pulp are small and widely separated (fig. 13-56). electric stunning of pigs at slaughter may result in a large congested spleen; the mechanism is unknown, but it should not be confused with a pathologically congested spleen. splenic congestion in acute cardiac failure is rarely seen in animals. acute congestion/hyperemia. acute septicemias may cause acute hyperemia and concurrent acute congestion of marginal zones and splenic red pulp. microbes are transported hematogenously to these sites, where they are rapidly phagocytized by macrophages. enormous numbers of intravenous bacteria can be cleared from the blood by the spleen in 20 to 30 minutes, but when this defensive mechanism is overwhelmed, the outcome is usually fatal. the response of the spleen depends on the duration of the disease. in acutely fatal cases, such as anthrax and fulminating salmonellosis, distention by blood may be the only gross finding.
if the animal survives longer, as in swine erysipelas and the less virulent forms of salmonellosis, there may be sufficient time for neutrophils and macrophages to accumulate in the marginal sinuses, marginal zones, and splenic red pulp vascular spaces. splenic torsion. torsion of the spleen occurs most commonly in pigs and dogs; in dogs this usually involves both spleen and stomach and is seen more often in deep-chested breeds (see chapter 7). in contrast to ruminants, in which the spleen is firmly attached to the rumen, the spleens of dogs and pigs are attached loosely to the stomach by the gastrosplenic ligament. it is the twisting of the spleen around this ligament that results initially in occlusion of the veins, causing splenic congestion, and later in occlusion of the artery, causing splenic infarction. in dogs the spleen is uniformly and markedly enlarged and may be blue-black from cyanosis. it is often folded back on itself (visceral surface to visceral surface) in the shape of the letter "c." treatment for this condition is most often splenectomy. barbiturate euthanasia, anesthesia, or sedation. intravenous injection of barbiturates induces acute passive congestion in the spleen due to relaxation of smooth muscle in the capsule and trabeculae. this phenomenon is seen most dramatically at autopsy (syn: necropsy) in horses and dogs that have been euthanized or anesthetized with barbiturates. in peracute cases of anthrax, the only histologic lesion may be marked congestion of the marginal sinuses and the splenic red pulp vascular spaces. at low magnification, congestion of the marginal sinus may appear as a circumferential red ring around the splenic follicle, and there is marked lymphocytolysis of follicles and pals. intravascular free bacilli are noted and may be seen in impression smears of peripheral blood, presumably because death is so rapid from the anthrax toxin that there is insufficient time for phagocytosis to take place.
if the animal lives longer, scattered neutrophils are present in the marginal sinuses and red pulp vascular spaces (fig. 13-57). anthrax cases are not normally autopsied because exposure to air causes the bacteria to sporulate; anthrax spores are extremely resistant and readily contaminate the environment. acute hemolytic anemias. hemolytic diseases, including acute babesiosis, hemolytic crises in equine infectious anemia, and immune-mediated hemolytic anemia, can cause marked splenic congestion. the splenic congestion is due to the process of removal (phagocytosis) and storage of large numbers of sequestered parasitized and/or altered erythrocytes from the circulation. histologically, there is dilation of the red pulp vascular spaces with erythrocytes and erythrophagocytes. with chronicity there is hyperplasia of the red pulp macrophages, hemosiderosis, and reduced congestion because the number of sequestered diseased erythrocytes is diminished. uniform splenomegaly with a firm consistency-meaty spleen. the three general categories of conditions leading to uniform splenomegaly with a firm meaty consistency are (1) marked phagocytosis of cells, debris, or foreign agents/material; (2) proliferation or infiltration of cells, as occurs in diffuse lymphoid and histiocytic hyperplasia, diffuse granulomatous disease (e-table 13-2), emh, and neoplasia; and (3) storage of materials, as in storage diseases or amyloidosis. it is important to recognize that more than one of these processes can occur in the same patient (e.g., dogs with immune-mediated hemolytic anemia may have both marked erythrophagocytosis and emh). the appearance of the cut surface of a meaty spleen depends on the underlying cause. in diffuse marked lymphoid hyperplasia, large, disseminated, discrete, white, bulging nodules are visible. spleens with diffuse infiltrative neoplasms, such as lymphoma, are pink to light purple on cut surface. diffuse lymphoid hyperplasia. lymphoid hyperplasia has been described in detail in the section on dysfunction/responses to injury.
in cases of prolonged antigenic stimulation the lymphoid follicles throughout the splenic parenchyma can become enlarged and visible on gross examination (fig. 13-58), leading to diffuse splenomegaly. anthrax. b. anthracis, the causative agent of anthrax, is a gram-positive, large, endospore-forming bacillus, which grows in aerobic to facultatively anaerobic environments. anthrax is primarily a disease of ruminants, especially cattle and sheep (see chapters 4, 7, 9, and 10). once the spores are ingested, they replicate locally in the intestinal tract, spread to regional lymph nodes, and then disseminate systemically through the bloodstream, resulting in septicemia. b. anthracis produces exotoxins, which degrade endothelial cell membranes and enzyme systems. grossly, the spleen is uniformly enlarged and dark red to bluish-black and contains abundant unclotted blood. diffuse granulomatous disease. chronic infectious diseases may cause a uniformly firm and enlarged spleen, mostly due to macrophage hyperplasia and phagocytosis, diffuse lymphoid hyperplasia, or diffuse granulomatous disease. diffuse granulomatous diseases (see e-table 13-2) occur with (1) facultative intracellular bacteria that infect macrophages (e.g., mycobacterium spp., brucella spp., and francisella tularensis); (2) systemic mycoses (e.g., blastomyces dermatitidis, histoplasma capsulatum) (see lymphoid/lymphatic system, disorders of domestic animals: lymph nodes, enlarged lymph nodes [lymphadenomegaly]) (fig. 13-59, a and b); and (3) protozoa that infect macrophages (e.g., leishmania spp.). some of these organisms may also produce nodular spleens with the formation of discrete to coalescing granulomas (e.g., m. bovis) (see splenic nodules with a firm consistency).
in contrast to b lymphocyte hyperplasia of the lymphoid follicles, certain diseases (e.g., malignant catarrhal fever in cattle) may lead to t lymphocyte hyperplasia of the pals. diffuse histiocytic hyperplasia and phagocytosis. splenomegaly from hyperplasia and increased phagocytosis of splenic macrophages is a response to the need to engulf organisms in prolonged bacteremia or parasitemia from hemotropic organisms. whereas acute hemolytic anemias cause splenomegaly with congestion (bloody spleen), with chronicity there is decreased sequestration of diseased erythrocytes and hence less congestion. therefore in cases of chronic hemolytic disease, splenomegaly is attributed to diffuse proliferation of macrophages, phagocytosis, and concurrent hyperplasia of the white pulp due to ongoing antigenic stimulation. for example, equine infectious anemia has cyclical periods of viremia, with immune-mediated damage to erythrocytes and platelets, and phagocytosis to remove altered erythrocytes and platelets. these cycles result in proliferation of red pulp macrophages, hyperplasia of hematopoietic cells (emh) to replace those lost, and hyperplasia of lymphocytes in the white pulp. storage diseases typically occur in animals less than 1 year of age. in general, these substrates are lipids and/or carbohydrates that accumulate in the cells, the result of the lack of normal processing within lysosomes. major categories of stored materials include mucopolysaccharides, sphingolipids, glycolipids, glycoproteins, glycogen, and oligosaccharides. macrophages are commonly affected by storage diseases. extramedullary hematopoiesis. emh is the development of blood cells in tissues outside the medullary cavity of the bone (efig. 13-9). the formation of single or multiple lineages of hematopoietic cells is often observed in many tissues and commonly in the spleen.
the ability of blood cell precursors to home, proliferate, and mature in extramedullary sites relies on the presence of hscs and pathophysiologic changes in the microenvironment (i.e., extracellular matrix, stroma, and chemokines). in the spleen, hscs have been found within vessels and adjacent to endothelial cells to form a vascular niche; thus splenic emh occurs in the red pulp, both within the red pulp vascular spaces and sinusoids (of the dog). the predilection for emh to occur varies among species (for instance, splenic emh persists throughout adulthood in mice), and the underlying mechanisms are not completely understood, but four major theories to explain the causes of emh are (1) severe bone marrow failure; (2) myelostimulation; (3) tissue inflammation, injury, and repair; and (4) abnormal chemokine production. because splenic emh is often observed in animals without obvious hematologic abnormalities, tissue inflammation, injury, and repair is the most likely mechanism of emh in this organ. in dogs and cats emh occurs most frequently with degenerative and inflammatory disorders, such as lymphoid nodular hyperplasia, hematomas, thrombi, histiocytic hyperplasia, inflammation (e.g., fungal splenitis), and neoplasia. emh in multiple tissues may be observed in chronic cardiovascular or respiratory conditions, chronic anemia, or chronic suppurative diseases in which there is an excessive tissue demand for neutrophils that exceeds the supply available from the marrow (e.g., canine pyometra). primary neoplasms. primary neoplastic diseases of the spleen arise from cell populations that normally exist in the spleen and include hematopoietic components, such as lymphocytes, mast cells, and macrophages, and stromal cells, such as fibroblasts, smooth muscle, and endothelium. the primary neoplasms that result in diffuse splenomegaly are the round cell tumors, including lymphoma ( fig. 13-60) , leukemia, visceral mast cell tumor, and histiocytic sarcoma. 
it is important to note that all of these types of neoplasms can produce nodular lesions instead of, or along with, a diffusely enlarged spleen. the different types of lymphoma in domestic animals are discussed in the section on hematopoietic neoplasia. secondary neoplasms of the spleen are due to metastatic spread and most often form nodules in the spleen, not a uniform splenomegaly. amyloid. the accumulation of amyloid in the spleen may occur with primary (al) or secondary (aa) amyloidosis (see chapters 1 and 5). rarely, severe amyloid accumulation may cause uniform splenomegaly (fig. 13-61), in which the spleen is firm, rubbery to waxy, and light brown to orange. microscopically, amyloid is usually in the splenic follicles, which if large enough, are grossly visible as approximately 2-mm-diameter gray nodules. amyloid deposition can also be seen within the walls of splenic veins and arterioles. plasma cell tumors within the spleen may also be associated with amyloid (al) deposits. lysosomal storage diseases. storage diseases are a heterogeneous group of inherited defects in metabolism characterized by accumulation of storage material within the cell (lysosomes). genetic defects, which result in the absence of an enzyme, the synthesis of a catalytically inactive enzyme, the lack of activator proteins, or a defect in posttranslational processing, can lead to a storage disease. acquired storage diseases are caused by exogenous toxins, most often plants that inhibit a particular lysosomal enzyme (e.g., swainsonine toxicity due to indolizidine alkaloid found in astragalus and oxytropis plant spp.). at the time of diagnosis, it may be difficult (and futile) to determine the primary site. histologically, hemangiosarcomas are composed of plump neoplastic endothelial cells, which wrap around stroma to form haphazardly arranged and poorly defined blood-filled vascular spaces (fig. 13-66). splenic nodules with a firm consistency.
the most common disorders of the spleen with firm nodules are (1) lymphoid nodular hyperplasia, (2) complex nodular hyperplasia, (3) primary neoplasms, (4) secondary metastatic neoplasms, (5) granulomas, and (6) abscesses. lymphoid and complex nodular hyperplasia. see lymphoid/ lymphatic system, disorders of dogs. primary neoplasms. the primary neoplastic diseases of the spleen that result in firm nodules include lymphoma (multiple subtypes), histiocytic sarcoma, leiomyoma, leiomyosarcoma, fibrosarcoma, myelolipomas, liposarcomas, myxosarcomas, undifferentiated pleomorphic sarcomas, solid hemangiosarcomas, and rare reports of primary chondrosarcomas. these locally extensive neoplasms may be solitary or multiple, raised above the capsular surface, but usually confined by the capsular surface. the consistency and cut surface appearance varies depending on the type of neoplasm; spindle cell tumors like leiomyosarcomas and fibrosarcomas will be white and firm, liposarcomas and myelolipomas are soft and bulging, and myxomatous neoplasms are gelatinous. it is important to remember that many round cell neoplasms, such as lymphoma, mast cell tumors, plasma cell tumors, myeloid neoplasms, and histiocytic sarcomas, can form nodules or diffuse splenic enlargement (or both). metastatic neoplasms. neoplasms that metastasize to the spleen usually result in enlarged nodular spleens (fig. 13-67 ) and include any number of sarcomas, carcinomas, or malignant round cell tumors. metastatic sarcomas can include fibrosarcomas, leiomyosarcomas, chondrosarcomas, and osteosarcomas. mammary, prostatic, pulmonary, anal sac gland and neuroendocrine carcinomas may metastasize widely to abdominal viscera, including the spleen. granulomas and abscesses. 
microorganisms that cause diffuse granulomatous splenitis and uniform splenomegaly may also cause focal to multifocal nodular lesions (e.g., mycobacterium spp., fungal organisms) (see diffuse granulomatous diseases of the spleen and also enlarged lymph nodes). abscesses bulge from the capsule and cut surfaces, and the exudate can vary in amount, texture, and color depending on the inciting organism and the age of the lesion. the most common diseases or conditions that have small spleens are (1) developmental anomalies, (2) aging changes, (3) wasting and/or cachectic diseases, and (4) splenic contraction. splenic hypoplasia. primary immunodeficiency diseases can result in splenic hypoplasia, as well as small thymuses and lymph nodes (which may be so small as to be grossly undetectable in some diseases). these diseases affect young animals and involve defects in t and/or b lymphocytes (fig. 13-70). spleens are exceptionally small, firm, and pale red and lack lymphoid follicles and pals. these diseases and their pathologic findings are discussed in chapter 5 and in the sections on disorders of horses and disorders of dogs. congenital accessory spleens. accessory spleens can be either congenital or acquired (see splenic rupture). congenital accessory spleens are termed splenic choristomas, which are nodules of normal splenic parenchyma in abnormal locations. these are usually small and may be located in the gastrosplenic ligament, liver, or pancreas (see fig. 13-74, b). although there are a large number of diseases and conditions commonly caused by bacteremia (e.g., navel ill, joint ill, chronic respiratory infections, bacterial endocarditis, chronic skin diseases, castration, tail docking, and ear trimming and/or notching), these rarely result in visible splenic abscesses. pyogranulomas and abscesses in the spleen (multifocal chronic suppurative splenitis) that do develop after septicemia and/or bacteremia are usually caused by pyogenic bacteria such as streptococcus spp., rhodococcus equi (fig. 13-68), trueperella pyogenes (fig. 13-69), and corynebacterium pseudotuberculosis.
cats with the wet or dry form of feline infectious peritonitis may have nodular pyogranulomatous and lymphoplasmacytic inflammatory foci throughout the spleen. splenic abscesses due to direct penetration by a migrating foreign body are reported in cattle (from the reticulum) and less commonly in the horse (from the stomach). perforating gastric ulcers in horses due to gasterophilus and habronema spp. have also reportedly led to adjacent splenic abscesses. splenic fissures. fissures in the splenic capsule are elongated grooves whose axes run parallel to the borders of the spleen. this developmental defect is seen most commonly in horses but also occurs in other domestic animals and has no pathologic significance. the surface of the fissure is smooth and covered by the normal splenic capsule. aging changes. as part of the general aging change of cells as the body ages, there is reduction in the number of b lymphocytes produced by the bone marrow and decline of naïve t lymphocytes due to age-related thymic involution. consequently, there is lymphoid atrophy in secondary lymphoid organs. the spleen is small, and its capsule may be wrinkled. microscopically, the white pulp is atrophied, and splenic follicles, if present, lack germinal centers. sinuses may also collapse from a reduced amount of blood, possibly because of anemia, which makes the red pulp appear fibrous. wasting/cachectic diseases. any chronic disease, such as starvation, systemic neoplasia, and malabsorption syndrome, may produce cachexia. starvation has a marked effect on the thymus, which results in atrophy of the t lymphocyte areas in the spleen and lymph nodes, which is in part mediated by leptin.
b lymphocyte development is also diminished, because b lymphocytes require accessory signals from helper t lymphocytes to undergo somatic hypermutation and immunoglobulin isotype switching. splenic contraction. contraction of the spleen is a result of contraction of the smooth muscle in the capsule and trabeculae of storage spleens. it can be induced by the activation of the sympathetic "fight-or-flight" response and is seen in patients with heart failure or shock (cardiogenic, hypovolemic, and septic shock) and also occurs in acute splenic rupture that has resulted in massive hemorrhage (hemoabdomen/hemoperitoneum). the contracted spleen is small, its surface is wrinkled, and the cut surface is dry. hemosiderosis. hemosiderin is a form of storage iron derived chiefly from the breakdown of erythrocytes, which normally takes place in the splenic red pulp. thus some splenic hemosiderosis is to be expected, and the amount varies with the species (it is most extensive in the horse). excessive amounts of splenic hemosiderin are seen when erythropoiesis is reduced (less demand for iron) or from the rapid destruction of erythrocytes in hemolytic anemias (increased stores of iron), such as those caused by immune-mediated hemolytic anemias or hemotropic parasites. excess splenic hemosiderin may also occur in conditions such as chronic heart failure or injections of iron dextran or as focal accumulations at the sites of old hematomas, infarcts, or trauma-induced hemorrhages. hemosiderin is also present in siderofibrotic plaques. siderofibrotic plaques. siderofibrotic plaques are also known as siderocalcific plaques and gamna-gandy bodies. grossly, they are gray-white to yellowish, firm, dry encrustations on the splenic capsule. usually they are most extensive along the margins of the spleen but can be elsewhere on the capsule (fig. 13-71 ) and sometimes in the parenchyma. 
with h&e staining these plaques are a multicolored mixture of yellow (hematoidin), golden brown (hemosiderin), purple-blue (hematoxylinophilic calcium mineral), and pink (eosinophilic fibrous tissue) (fig. 13-72). the diseases or conditions with small lymph nodes are (1) congenital disorders, (2) lack of antigenic stimulation, (3) viral infections, (4) cachexia and malnutrition, (5) aging, and (6) radiation. congenital disorders. primary immunodeficiency diseases are described in detail in chapter 5 and in the sections on disorders of horses and disorders of dogs. neonatal animals with primary immunodeficiency diseases often have extremely small to undetectable lymph nodes. in dogs and horses with severe combined immunodeficiency disease (scid), lymphoid tissues, including lymph nodes from affected animals, are often grossly difficult to identify and characterized by an absence of lymphoid follicles. acquired accessory spleens may represent sequelae to previous hemorrhages from trauma to the spleen. splenic rupture. splenic rupture is most commonly caused by trauma, such as from an automobile accident or being kicked by other animals. thinning of the capsule from splenomegaly can render the spleen more susceptible to rupture, and this may occur at sites of infarcts, hematomas, hemangiosarcomas, and lymphoma. in acute cases of splenic capsular rupture, the spleen is contracted and dry and the surface wrinkled from the marked blood loss (fig. 13-73). in more severe cases the spleen may be broken into two or more pieces, and small pieces of splenic parenchyma may be scattered throughout the omentum and peritoneum (sometimes called splenosis) (fig. 13-74, a). clotted blood, fibrin, and omentum may adhere to the surface at the rupture site. if the rupture is not fatal, the spleen heals by fibrosis, and there may be a capsular scar. occasionally there are two or more separate pieces of spleen adjacent to each other and sometimes joined by scar tissue in the gastrosplenic ligament.
the functional capabilities of the small accessory spleens are questionable, although erythrophagocytosis, hemosiderosis, hyperplastic nodules, emh, and neoplasia can be present in these nodules. accessory spleens due to traumatic rupture should be distinguished from peritoneal seeding of hemangiosarcoma and the developmental anomaly splenic choristomas (see fig. 13-74, b), which are nodules of normal splenic parenchyma in abnormal locations (such as liver and pancreas). chronic splenic infarcts. in the early stage, splenic infarcts are hemorrhagic and may elevate the capsule (see splenic nodules with a bloody consistency). however, as the lesions age and fibrous connective tissue is laid down, they shrink and become contracted and often depressed below the surface of the adjacent capsule. conditions causing lymphadenomegaly include (1) lymphoid hyperplasia (follicular or paracortical), (2) hyperplasia of the sinus histiocytes (monocyte-macrophage system), (3) acute or chronic lymphadenitis, (4) lymphoma, and (5) metastatic neoplasia. detailed descriptions of lymphoid follicular hyperplasia, paracortical hyperplasia, and hyperplasia of sinus histiocytes are in the sections on lymph nodes, function, and lymph node, dysfunction/responses to injury. follicular lymphoid hyperplasia can involve large numbers of lymph nodes, as in a systemic disease, or can be localized to a regional lymph node draining an inflamed or antigenically stimulated (e.g., vaccine injection) area. acute lymphadenitis. lymph nodes draining sites of infection and inflammation may develop acute lymphadenitis (e.g., retropharyngeal lymph nodes draining the nasal cavity with acute rhinitis, tracheobronchial lymph nodes in animals with pneumonia (fig. 13-75), and mammary [supramammary] lymph nodes in animals with mastitis). grossly, affected lymph nodes in acute lymphadenitis are red and edematous, have taut capsules, and may have necrotic areas (fig. 13-76).
in some instances the afferent lymphatic vessels may also be inflamed (lymphangitis). the material draining to the regional lymph node may be microorganisms (bacteria, parasites, protozoa, and fungi), inflammatory mediators, or a sterile irritant. in septicemic diseases, such as bovine anthrax, the lymph nodes are markedly congested and the sinuses filled with blood. examination of these lymph nodes should include culturing for bacteria and the examination of smears and histologic sections for bacteria and fungi. pyogenic bacteria, such as streptococcus equi ssp. equi in horses, streptococcus porcinus in pigs, and trueperella pyogenes in cattle and sheep, cause acute suppurative lymphadenitis (see disorders of horses and disorders of pigs). hereditary lymphedema has been reported in certain breeds of cattle and dogs. grossly, the most severely affected animals have generalized subcutaneous edema (see fig. 2-10) and effusions. in severe cases the peripheral and mesenteric lymph nodes are hypoplastic and characterized by an absence of follicles. nodes draining an edematous area may be grossly enlarged from marked sinus edema. lack of antigenic stimulation. the size of the lymph node depends on the level of phagocytosis and antigenic stimulation; lymph nodes that are not receiving antigenic stimuli (e.g., in spf animals) will be small, with low numbers of primary lymphoid follicles and few, if any, secondary follicles or plasma cells in the medullary cords. conversely, nodes receiving constant antigenic material (such as those draining the oral cavity or intestines) are large, with active secondary lymphoid follicles. the number of follicles increases or decreases with changes in the intensity of the antigenic stimuli, and the germinal centers go through a cycle of activation, depletion, and rest, as described previously (see lymph nodes, function). as the antigenic response wanes, germinal centers become depleted of lymphocytes, and lymphoid follicles become smaller.
viral infections. many viral infections of animals target lymphocytes and cause the destruction of lymphoid tissue. of infectious agents, viruses most commonly infect and injure lymphoid tissues; examples include ehv-1 in aborted foals, classical swine fever virus, bvdv, canine distemper virus, and canine and feline parvovirus. although some viruses destroy lymphoid tissue, others can lead to lymph node hyperplasia (e.g., follicular b lymphocyte hyperplasia in fiv and paracortical t lymphocyte hyperplasia in malignant catarrhal fever) or cause neoplasia (e.g., felv, blv, and marek's disease). cachexia and malnutrition. malnutrition and cachexia, which occur with cancer, lead to secondary immunosuppression through several complex metabolic and neurohormonal aberrations. starvation has a marked effect on the thymus, with resultant atrophy of the t lymphocyte areas in the spleen and lymph nodes, and may also affect b lymphocyte development. lymphoid atrophy may result from physiologic and emotional stress and the concurrent release of catecholamines and glucocorticoids. glucocorticoids reduce b and t lymphocytes via redistribution of these cells and glucocorticoid-induced apoptosis. t lymphocytes are more sensitive to glucocorticoid-induced apoptosis than are b lymphocytes. aging. as the body ages, there is a reduction in the number of lymphocytes produced by the bone marrow and the regressed thymus, and consequently a reduction in the b and t lymphocytes in secondary lymphoid organs, resulting in lymphoid atrophy. consequently, lymph nodes are small, with loss of b and t lymphocytes and plasma cells in the cortical follicles, paracortex, and medullary cords, respectively. radiation. local or palliative treatment of cancer may include radiotherapy (ionizing radiation) to target and damage the dna of the neoplastic cells.
although some immunosuppression may be noted, particularly if bone marrow or lymphoid tissues are within the irradiated field, mounting evidence suggests that radiotherapy can induce a cascade of proimmunogenic effects that engage the innate and adaptive immune systems to contribute to the destruction of tumor cells. fibrosis of tissues within the irradiated field also occurs, mainly as a late effect of chronic radiation. bouts of chronic lymphadenitis (e.g., in the regional lymph node draining chronic mastitis in cows) lead to fibrosis and lymphoid hyperplasia, in addition to chronic abscesses. the classic example of chronic suppurative lymphadenitis with encapsulated abscesses is caseous lymphadenitis, a disease of sheep and goats caused by c. pseudotuberculosis (also see disorders of ruminants). it is also the cause of ulcerative lymphangitis in cattle and horses and pectoral abscesses in horses. classic examples of focal to multifocal granulomatous lymphadenitis are infections by members of the mycobacterium tuberculosis complex, which includes m. bovis among others. members of the m. avium complex cause similar lesions and have been described in a number of species, including dogs, cats, primates, pigs, cattle, sheep, horses, and human beings. infection may begin by inhalation of aerosol droplets containing the bacilli, which may spread via the lymphatic vessels to regional lymph nodes, resulting in granulomatous lymphangitis and lymphadenitis (fig. 13-81). initially, lesions in the lymphatic system are confined to the lymphatic vessels (granulomatous lymphangitis) and regional lymph nodes (e.g., the tracheobronchial lymph nodes in the case of pulmonary tuberculosis). histologically, in acute lymphadenitis the subcapsular, trabecular, and medullary sinuses and the parenchyma of the cortex and medulla have focal to coalescing foci of neutrophilic inflammation, necrosis, and fibrin deposition (fig. 13-78).
if inflammation in the lymph node continues for several days or longer, the lymph node is further enlarged by follicular hyperplasia and plasmacytosis of the medullary cords from the expected immune response. chronic lymphadenitis. the types of chronic lymphadenitis include chronic suppurative lymphadenitis, diffuse granulomatous inflammation, and discrete granulomas. in chronic suppurative inflammation, abscesses range in size from small microabscesses to large abscesses that occupy and obliterate the whole node. figure 13-76 (acute lymphadenitis, lymph node, dog): acute lymphadenitis usually occurs when a regional lymph node drains a site of inflammation caused by microorganisms and subsequently becomes infected. the lymph node is firm and enlarged with a tense capsule. the cut surface bulges and is wet with blood, edema, and an inflammatory cell infiltrate. granulomatous lymphadenitis also occurs in histoplasmosis (see disorders in dogs). in feline cryptococcosis (most often cryptococcus neoformans), the inflammatory response may be mild due to the thick polysaccharide capsule, which has strong immunomodulatory properties and promotes immune evasion and survival within the host. therefore the nodal enlargement is due mainly to a large mass of organisms (see efig. 13-12). pigs with pcv2 infection may have a multifocal to diffuse infiltrate of macrophages and multinucleated giant cells of varying severity (see disorders of pigs). secondary (metastatic) neoplasms. carcinomas typically metastasize via lymphatic vessels to the regional lymph node. other common metastatic neoplasms include mast cell tumor and malignant melanoma. although sarcomas most often metastasize hematogenously, some more aggressive sarcomas (e.g., osteosarcoma) may spread to regional lymph nodes. histologically, single cells or clusters of neoplastic cells travel via the afferent lymphatic vessels and are deposited in a sinus, usually the subcapsular sinus (see fig. 13-47).
here the cells proliferate and can ultimately occupy the whole lymph node, as well as drain to the next lymph node in the chain. in tuberculosis, lesions are initially confined to regional lymph nodes, but once disseminated in the lymph or blood, lymph nodes throughout the body will have lesions. well-organized granulomas consist of a central mass of macrophages with phagocytized mycobacteria, surrounded by epithelioid and foamy macrophages and occasional multinucleated giant cells (langhans type). these inflammatory nodules are surrounded by a layer of lymphocytes enclosed in a fibrous capsule. over time the center of the granuloma may undergo caseous necrosis due to the high lipid and protein content of the dead macrophages (see chapter 3). in bovine johne's disease the mesenteric lymph nodes draining the infected intestine can have noncaseous granulomas (fig. 13-82). diffuse granulomatous lymphadenitis. coalescing to diffuse granulomatous lymphadenitis is seen in disseminated fungal infections such as blastomycosis and cryptococcosis (efig. 13-12). in severe cases of nodal emphysema the lymph node is light, puffy, and filled with discrete gas bubbles, and the cut surface may be spongy. histologically, the sinuses are distended with gas and lined by macrophages and giant cells. this change has been considered a foreign body reaction to the gas bubbles. macrophages and giant cells are also seen in afferent lymphatic vessels (granulomatous lymphangitis). vascular transformation of lymph node sinus (nodal angiomatosis). vascular transformation of the sinuses is a nonneoplastic reaction to blocked efferent lymphatic vessels or veins. this pressure-induced lesion results in the formation of anastomosing vascular channels and may be confused with a nodal vascular neoplasm. these proliferative but noninvasive masses usually begin in the subcapsular sinuses and may be followed by lymphoid atrophy, erythrophagocytosis/hemosiderosis, and fibrosis.
the blockage may be caused by malignant neoplasms of the tissues that the lymph node drains (e.g., thyroid carcinoma with nodal angiomatosis of the mandibular lymph node). see the section on bone marrow, disorders of domestic animals, hematopoietic neoplasia for a discussion of the who classification of hematopoietic neoplasms that predominantly arise and proliferate within bone marrow. this section will cover neoplasms of lymphoid tissue(s) arising outside of bone marrow. lymphoma. the term lymphoma (also known as lymphosarcoma) encompasses a diverse group of malignancies arising in lymphoid tissue(s) outside of bone marrow. grossly, there may be diffuse to nodular enlargement of one or more lymph nodes (fig. 13-83), and the cut surface is soft, white, and bulging with loss of normal corticomedullary architecture. there is great variation in the clinical manifestations and cytopathologic features of lymphoma, which underlies the importance of classification to better predict the clinical behavior and outcome. an understanding of lymphocyte maturation is crucial, because the who classification of lymphoma postulates a normal cell counterpart for each type of lymphoma (when possible). in other words, lymphoma can arise at any stage in the development/maturation of a lymphocyte, from precursor lymphocytes (b or t lymphoblasts) to mature b and t lymphocytes and nk cells (table 13-7 and box 13-11). pathologists use gross features, histomorphologic features, immunophenotype (b or t lymphocyte), and clinical characteristics to classify lymphomas. red discoloration is caused by (1) draining erythrocytes from hemorrhagic or acutely inflamed areas, (2) acute lymphadenitis with hyperemia and/or hemorrhage, (3) acute septicemias with endotoxin-induced vasculitis or disseminated intravascular coagulation, and (4) dependent areas in postmortem hypostatic congestion.
blood in pig lymph nodes is especially obvious due to the inverse anatomy (the equivalent of the medullary sinuses are subcapsular and thus readily visible in the unsectioned node). initially erythrocytes fill trabecular and medullary sinuses and then rapidly undergo erythrophagocytosis by proliferating sinus macrophages. hemosiderin deposition occurs within 7 to 10 days in these macrophages, imparting a brown discoloration of the node. black discoloration is often present in the tracheobronchial lymph nodes due to draining of carbon pigment (pulmonary anthracosis, see chapter 9). black ink from skin tattoos will drain to the regional lymph node. these pigments are usually noted within the medullary sinus macrophages. brown discoloration may be due to melanin, parasitic hematin, or hemosiderin. melanin pigment is seen in animals with chronic dermatitis when melanocytes are damaged and their pigment is released into the dermis and phagocytized by melanomacrophages (pigmentary incontinence) and drained to the regional lymph node. the mandibular lymph nodes often contain numerous melanomacrophages in animals with heavily pigmented oral mucosa, presumably due to chronic low levels of inflammation. this must be distinguished from metastatic malignant melanomas. lymph nodes draining areas of congenital melanosis may have melanin deposits. parasitic hematin pigment is produced by fascioloides magna (cattle) and fasciola hepatica (sheep) in the liver and then transported via the lymphatic vessels to the hepatic lymph nodes. hemosiderin, an erythrocyte breakdown product, may form in a hemorrhagic node or arrive in hemosiderophages draining from congested, hemorrhagic, or inflamed areas. drainage of iron dextran from an intramuscular injection may also cause hemosiderin pigment accumulation within the draining lymph node. 
green discoloration is rare and may be caused by green tattoo ink (often used in black animals); ingestion of blue-green algae, which drain to mesenteric lymph nodes; massive eosinophilic inflammation; and a genetic defect in mutant corriedale sheep that results in a deficiency in the excretion of bilirubin and phylloerythrin by the liver. the phylloerythrin or a metabolite stains all the tissues of the body a dark green, except for the brain and spinal cord, which are protected by the blood-brain barrier. miscellaneous discolorations of lymph nodes may be seen with intravenously injected dyes (e.g., methylene blue or trypan blue) or subcutaneous drug injections. lymph nodes may be yellow in severely icteric patients. the pigmented strain of map (johne's disease) may impart an orange discoloration in the mesenteric lymph nodes of sheep. inclusion bodies. many viruses produce inclusion bodies, and some of these occur in lymph nodes. these viruses include ehv-1 in horses, bovine adenovirus, cytomegalovirus (inclusion body rhinitis), pcv2, and pseudorabies herpesvirus in pigs, and rarely parvovirus in dogs and cats. emphysema. emphysema in lymph nodes is a consequence of emphysema in their drainage fields and is seen most frequently in tracheobronchial lymph nodes in bovine interstitial emphysema and in porcine mesenteric lymph nodes in intestinal emphysema (see chapter 7). the appearance of the lymph node varies with the extent of the emphysema. histiocytic disorders. histiocytic disorders are frequently diagnosed in dogs and occur less often in cats. briefly, histiocytes are categorized as macrophages and dcs, the latter of which are subdivided into langerhans cells (lcs), found in skin, gastrointestinal, respiratory, and reproductive epithelia (mucosae), and interstitial dcs (idcs), located in perivascular spaces of most organs.
the term interdigitating dcs describes dcs (either resident or migrating) found in t lymphocyte regions of lymph nodes (paracortex) and spleen (pals); interdigitating dcs consist of both lcs and idcs. these lineages can be differentiated using immunohistochemical stains. histiocytic disorders that are diagnosed in veterinary medicine at this time include the following: canine cutaneous histiocytoma, canine lc histiocytosis, canine cutaneous and systemic histiocytosis, feline pulmonary lc histiocytosis, feline progressive histiocytosis, dendritic cell leukemia in the dog, and histiocytic sarcoma and hemophagocytic histiocytic sarcoma in both dogs and cats. lymph node involvement is seen in many of these conditions. rare reports of regional lymph node metastasis in cases of solitary canine cutaneous histiocytoma have been published. lymphatic invasion with subsequent regional nodal involvement may be seen in dogs with lc histiocytosis, which is a poor prognostic indicator and likely reflects systemic infiltration. the normal architecture of tracheobronchial lymph nodes is often effaced in cats with pulmonary lc histiocytosis. canine reactive histiocytoses are not clonal neoplastic proliferations but likely reflect an immune dysregulation consisting of activated dermal idcs (and t lymphocytes). they are categorized as cutaneous histiocytosis (ch), involving skin and draining lymph nodes, and a more generalized systemic histiocytosis (sh), affecting skin and other sites (e.g., lung, liver, bone marrow, spleen, lymph nodes, kidneys, and orbital and nasal tissues). histiocytic sarcoma complex. histiocytic sarcomas (hss) are neoplasms of idcs and therefore can arise in almost any tissue, frequently the spleen, lung, skin, meninges, lymph nodes, bone marrow, and synovium. secondary involvement of the liver is common as the disease progresses. this neoplasm is most commonly diagnosed in dogs, and a lower incidence is seen in cats. 
localized histiocytic sarcoma may be a focal solitary lesion or multiple nodules within a single organ. disseminated histiocytic sarcoma describes lesions that involve distant sites and has replaced the term malignant histiocytosis. the morphologic features used in the histopathologic classification of lymphoma are the following: • histologic pattern: nodular or diffuse. • cell size: the nuclei of the neoplastic lymphocytes are compared to the diameter of a red blood cell (rbc ≅ 5 µm). small is less than 1.5 times the diameter of an rbc; intermediate is 1.5 to 2.0 times the diameter of an rbc; large is more than 2.0 times the diameter of an rbc. • grade: mitotic figures are counted in a single high-power (400×) field. indolent is 0 to 1; low is 2 to 5; mid is 6 to 10; high is more than 10. although there are numerous subtypes of lymphoma recognized under the who system, a detailed discussion of each subtype is outside the scope of this textbook. however, a select number of subtypes are more commonly seen in domestic animals (see table 13-7) and currently are best described in the dog (see disorders of dogs, neoplasms, lymphomas). the most common types in dogs are large cell lymphomas and include diffuse large b cell lymphoma and peripheral t cell lymphoma. t cell-rich large b cell lymphoma is thought to be a variant of diffuse large b cell lymphoma with a distinctive reactive t lymphocyte infiltrate. intermediate cell lymphomas include b or t lymphocyte lymphoblastic lymphomas and burkitt-like lymphoma (both high grade), marginal zone lymphoma, and the intermediate cell variant of t zone lymphoma (both indolent and nodular). small cell lymphomas most commonly diagnosed in domestic animal species include enteropathy-associated t cell lymphoma, commonly seen in the cat, t zone lymphoma (small cell variant), and small cell lymphoma. cutaneous lymphomas are most often of t lymphocyte origin and may be epitheliotropic or nonepitheliotropic, and a distinct entity of inflamed t cell lymphoma has been recently described in dogs (see chapter 17).
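The cell-size and grading criteria above are simple threshold rules, so they can be sketched as code. The following is a minimal illustration assuming the text's 5 µm rbc yardstick; the function names and structure are this sketch's own and are not part of any published diagnostic tool.

```python
# Hypothetical helper encoding the cell-size and mitotic-grade cutoffs
# described in the text. Thresholds follow the textbook; everything else
# (names, structure) is illustrative only.

RBC_DIAMETER_UM = 5.0  # red blood cell diameter used as the measuring stick

def cell_size_category(nuclear_diameter_um: float) -> str:
    """Classify nuclear size relative to an rbc (≅ 5 µm):
    small < 1.5×, intermediate 1.5–2.0×, large > 2.0×."""
    ratio = nuclear_diameter_um / RBC_DIAMETER_UM
    if ratio < 1.5:
        return "small"
    elif ratio <= 2.0:
        return "intermediate"
    else:
        return "large"

def mitotic_grade(mitoses_per_hpf: int) -> str:
    """Grade by mitotic figures counted in a single 400× high-power field:
    indolent 0–1, low 2–5, mid 6–10, high > 10."""
    if mitoses_per_hpf <= 1:
        return "indolent"
    elif mitoses_per_hpf <= 5:
        return "low"
    elif mitoses_per_hpf <= 10:
        return "mid"
    else:
        return "high"
```

For example, a nucleus of about 9 µm (1.8 rbc diameters) would fall in the intermediate category, and 7 mitoses per high-power field would correspond to mid grade.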
plasma cell neoplasia. plasma cell neoplasms are most easily categorized as myeloma or multiple myeloma, which arises in the bone marrow, and extramedullary plasmacytoma, which, as the name implies, involves sites other than bone. multiple myeloma. see bone marrow and blood cells, disorders of domestic animals, types of hematopoietic neoplasia, plasma cell neoplasia. extramedullary plasmacytomas. extramedullary plasmacytomas are most commonly diagnosed in the skin of dogs (also cats and horses), where they constitute 1.5% of all canine cutaneous tumors (see chapter 17). the pinnae, lips, digits, and chin are the most commonly affected locations, and most lesions are solitary, though multiple plasmacytomas are infrequently diagnosed. other tissues affected include the oral cavity, intestine (colorectal in particular), liver, spleen, kidney, lung, and brain; of these, the oral cavity and intestine (colorectal) are involved most often. in one study, oral extramedullary plasmacytomas represented 5% of all canine oral tumors and 28% of all extramedullary plasmacytomas diagnosed. most cutaneous extramedullary plasmacytomas are benign, and complete excision is usually curative; oral cavity and colorectal extramedullary plasmacytomas are likely to behave in a similar manner. more aggressive forms may occur at any site. as with multiple myeloma, the neoplastic cells composing the tumor may vary from well differentiated to pleomorphic, often within the same tumor. the cells often have a characteristic perinuclear golgi clearing or "halo," and the more pleomorphic cells exhibit karyomegaly and binucleation (fig. 13-84). extramedullary plasmacytomas may produce monoclonal immunoglobulins with resulting monoclonal gammopathy. amyloid deposition (which may mineralize) is also observed in a proportion of cases.
differentiation from other round cell tumors may be aided by immunohistochemistry (mum1/irf4 is particularly sensitive and specific for plasma cell neoplasms). severe combined immunodeficiency disease of arabian foals is an autosomal recessive primary immunodeficiency disorder characterized by the lack of functional t and b lymphocytes caused by a genetic mutation in the gene encoding the dna-dependent protein kinase catalytic subunit (dna-pkcs). this enzyme is required for the receptor gene rearrangements involved in the maturation of lymphocytes, and the resulting loss of functional t and b lymphocytes leads to a profound susceptibility to infectious diseases. though normal at birth, these foals develop diarrhea and pneumonia by approximately 10 days of age, often due to adenovirus, cryptosporidium parvum, and pneumocystis carinii infections. affected foals often die before 5 months of age. lymph nodes and thymus are small and often grossly undetectable, and the spleen is small and firm due to the absence of white pulp (see fig. 13-70). the development of genetic tests to identify carriers of the disorder has led to a decrease in the prevalence of severe combined immunodeficiency disease. recently, severe combined immunodeficiency disease was diagnosed in a single caspian filly, though the exact genetic defect was not determined. congenital immunodeficiency diseases are also discussed in detail in chapter 5. streptococcus equi ssp. equi, the etiologic agent of equine strangles, is inhaled or ingested after direct contact with the discharge from infected horses or from a contaminated environment. the bacteria attach to the tonsils, penetrate into deeper tissues, enter the lymphatic vessels, drain to regional lymph nodes (mandibular, retropharyngeal, and occasionally parotid and cervical lymph nodes), and cause large abscesses (see fig. 13-77). retropharyngeal enlargement from abscesses may lead to compression of the pharynx and subsequent respiratory stridor and dysphagia.
abscesses may rupture and discharge pus through a sinus to the skin surface or spread medially into the guttural pouches, where residual pus dries and hardens to form chondroids (which serve as a nidus for live bacteria to persist in carrier animals). in up to 20% of these cases, ruptured abscess material may spread via blood or lymph to other organs (metastatic abscess formation, bastard strangles), including lung, liver, kidney, synovia, mesenteric and mediastinal lymph nodes, spleen, and occasionally brain. purpura hemorrhagica, a type iii hypersensitivity reaction, may result in necrotizing vasculitis in some horses with repeated natural exposure to s. equi ssp. equi or after vaccination in horses that have had strangles. the typical manifestation of r. equi infection is chronic suppurative bronchopneumonia with abscesses (see chapter 9). approximately 50% of foals also develop intestinal lesions characterized by pyogranulomatous ulcerative enterotyphlocolitis, often over peyer's patches, and pyogranulomatous lymphadenitis of mesenteric and colonic lymph nodes (see chapter 7; see fig. 13-68). large abdominal abscesses may be the only lesion in the abdomen and presumably originate from an infected mesenteric lymph node. the diffuse lymphatic tissue in the lamina propria may contain granulomatous inflammation with the phagocytized bacteria. mediastinal pyogranulomatous lymphadenitis may compress the trachea, causing respiratory distress. r. equi lesions also can develop in the liver, kidney, spleen, or nervous tissue. lymphoma is the most common malignant neoplasm in horses and mostly affects adult animals (mean age 10 to 11 years), with no apparent breed or sex predisposition. breed predispositions to histiocytic sarcoma complex are seen in bernese mountain dogs, rottweilers, golden retrievers, and flat-coated retrievers, though the disease can occur in any breed.
histiocytic sarcoma complex is considered to have a rapid and highly aggressive course, and the clinical signs depend on the particular organ(s) involved. grossly, affected organs may be uniformly enlarged and/or contain multiple coalescing white-tan nodules. tissue architecture is effaced by sheets of pleomorphic round to spindle-shaped cells. there is marked cellular atypia with numerous karyomegalic and multinucleated neoplastic cells (fig. 13-85) . hemophagocytic histiocytic sarcoma. hemophagocytic histiocytic sarcoma is seen in dogs and cats and is a neoplasm of macrophages of the spleen and bone marrow. clinically, dogs present with hemolytic regenerative anemia and thrombocytopenia, thus mimicking evans's syndrome, though they are coombs negative. this form of histiocytic sarcoma carries the worst prognosis of the histiocytic sarcomas, which is likely in part related to the severe anemia and coagulopathy. it is characterized by a non-mass forming infiltrate of histiocytes within the bone marrow and splenic red pulp, causing diffuse splenomegaly. the neoplastic cells exhibit marked erythrophagocytosis, but the severe cellular pleomorphism seen in the histiocytic sarcoma complex may be lacking. the neoplastic cells are often intermixed with emh and plasma cells. metastasis is frequently to the liver, where the cells concentrate within the sinuses. tumor emboli within the lung are often present. malt is involved in a variety of ways with bacteria and viruses, and these are summarized for large animals in table 13 -5. these interactions include being a portal of entry for pathogens (e.g., salmonella spp., yersinia pestis, map, and l. monocytogenes); a site of replication for viruses (e.g., bvdv); a site for hematogenous infection (e.g., panleukopenia virus and parvovirus); and a site of gross or microscopic lesions in some viral diseases. 
bovine coronavirus, bvdv, rinderpest virus, malignant catarrhal fever virus, feline panleukopenia virus, and canine parvovirus cause lymphocyte depletion within the malt. anthrax is caused by b. anthracis, a gram-positive bacillus found in spore form in soil. cattle, sheep, and goats become infected when grazing on infected soil, and infection causes fulminant septicemia. the spleen in infected animals is markedly enlarged and congested (see uniform splenomegaly with a bloody consistency; see also chapter 4). bovine viral diarrhea is caused by bvdv, a pestivirus. cattle are the natural host, but other animals such as alpacas, deer, sheep, and goats are also affected. bvdv preferentially infects cells of the immune system, including macrophages, dcs, and lymphocytes. the associated lesions in lymphoid tissues are severe lymphoid depletion in mesenteric lymph nodes and peyer's patches, whose intestinal surface may be covered by a fibrinonecrotic membrane. histologically, there is marked lymphocytolysis and necrosis of germinal centers in peyer's patches and the cortices of lymph nodes. there is thymic atrophy because the thymus is markedly depleted of lymphocytes and may consist of only collapsed stroma and a few scattered lymphocytes. bvd is discussed in detail in chapters 4 and 7. the most frequent anatomic locations of equine lymphoma are multicentric, cutaneous, and gastrointestinal tract. multicentric lymphoma, defined as involving at least two organs (excluding the regional lymph nodes), is the most common manifestation, followed by the skin and gastrointestinal tract types. solitary locations have been reported in the mediastinum, lymph nodes, ocular/orbital region, brain, spinal cord, oral cavity, and spleen. of the multicentric lymphomas, the most frequently observed type is t cell-rich large b cell lymphoma (tcrlbcl), reportedly in one study affecting 34% of the cases.
peripheral t cell lymphoma (ptcl) was the second most common, followed by diffuse large b cell lymphoma (dlbcl). the most common lymphoma type in the gastrointestinal tract is also t cell-rich large b cell lymphoma, followed by enteropathy-associated t cell lymphoma. cutaneous lymphomas in horses account for up to 3% of all equine skin tumors. t cell-rich large b cell lymphoma is again the most common lymphoma subtype in the skin, representing up to 84% of all cutaneous lymphomas, and most frequently presents clinically as multiple skin masses. cutaneous t cell lymphoma (ctcl) is the second most common form and arises as smaller solitary nodules. thoroughbreds may have a higher incidence of cutaneous t cell lymphoma compared to other breeds. overall, horses with cutaneous t cell-rich large b cell lymphoma appear to have a longer survival time than horses with other types of lymphoma of the skin. progesterone receptor-positive lymphomas have also been identified in horses, and there is one report of subcutaneous tumor regression following removal of an ovarian granulosa-theca cell tumor. there may be an increased frequency of lymphoma in horses diagnosed with equine herpesvirus 5 (ehv-5, a gammaherpesvirus) when compared to healthy horses, although the exact cause-effect role of this observation in lymphomagenesis is not yet known. histologically, the hallmark features of t cell-rich large b cell lymphoma include a majority of small (nuclei approximately the size of an rbc), reactive, mature t lymphocytes admixed with a neoplastic population of large b lymphocytes whose nuclei are two to three times the diameter of an equine rbc. these large atypical cells are often binucleated and have prominent eosinophilic nucleoli (fig. 13-86). the large cells may be observed in mitosis or in necrosis as single cells with retracted cytoplasm and pyknotic nuclei. t cell-rich large b cell lymphoma is often accompanied by the presence of a dense fibrovascular network.
johne's disease primarily affects domestic and wild ruminants (and rarely pigs and horses) and is due to infection by map. the characteristic lesions include granulomatous enteritis usually confined to the ileum, cecum, and proximal colon; lymphangitis; and lymphadenitis of regional lymph nodes (see fig. 13-82). the bacteria are ingested, engulfed by the m cells overlying peyer's patches, and then transported to macrophages in the lamina propria and submucosa. among cattle, sheep, goats, and wild ruminants, there is wide variation in the severity, distribution of lesions, primary inflammatory cell type (lymphocytes, epithelioid macrophages, multinucleated giant cells), and numbers of bacteria within lesions (multibacillary or paucibacillary). histologically, the architecture of the ileocecal lymph nodes may be partially replaced by aggregates of epithelioid macrophages. postweaning multisystemic wasting syndrome. pcv2, a small single-stranded dna virus, is highly prevalent in the domestic pig population. several clinical syndromes are attributed to pcv2 infection and are collectively termed pcv-associated diseases (pcvads). these include postweaning multisystemic wasting syndrome (pmws), porcine respiratory disease complex (prdc), porcine dermatitis and nephropathy syndrome, and enteric disease (see chapters 4 and 9). the major postmortem findings of postweaning multisystemic wasting syndrome are poor body condition, enlarged lymph nodes, and interstitial pneumonia. the lesions of the lymphoid system are commonly observed in the tonsil, spleen, peyer's patches, and lymph nodes. some pigs have all lymphoid tissues affected, whereas others may have only one or two affected lymph nodes. the characteristic microscopic lesions are lymphoid depletion of both follicles and paracortex with replacement by histiocytes, mild to severe granulomatous inflammation with multinucleated giant cells, and intrahistiocytic, sharply demarcated, spherical, basophilic cytoplasmic inclusion bodies.
necrosis of prominent lymphoid follicles (necrotizing lymphadenitis) is occasionally observed, and pcv2 can be detected within the necrotic regions. the loss of lymphocytes may be due to reduced production in the bone marrow, decreased proliferation in the secondary lymphoid organs, or necrosis of lymphocytes.

splenic abscesses can be the result of bacteremia (see fig. 13-69) or direct penetration by a foreign body from the reticulum (see spleen and also portals of entry/pathways of spread).

c. pseudotuberculosis is a gram-positive intracellular bacterium that causes caseous lymphadenitis, a chronic suppurative disease of sheep and goats. the bacterium may enter through skin wounds (e.g., shearing cuts in sheep, tagging, tail docking, or castration), drain to the regional lymph node, and then be disseminated in lymph and circulating blood to external and internal lymph nodes, as well as other internal organs, including lung. external abscesses are most often detected in the "jaw and neck" region, specifically in the mandibular and parotid lymph nodes. on gross examination the abscesses are encapsulated and filled with greenish semifluid pus due to an infiltrate of eosinophils (see fig. 13-79). over time the abscesses lose the greenish hue, and the contents become inspissated to form the characteristic concentric laminations (see fig. 13-80); old abscesses may reach a diameter of 4 to 5 cm.

bovine lymphoma is broadly classified into enzootic and sporadic forms. the enzootic form, called enzootic bovine leukosis (ebl), is caused by blv, a retrovirus common in cattle. there is a higher prevalence in dairy cattle compared to beef breeds. blv is transmitted horizontally (e.g., blood, milk/colostrum, saliva) or iatrogenically (e.g., rectal sleeves, instruments/equipment). following infection, blv invades and integrates into the genome of infected b lymphocytes, resulting in a polyclonal b lymphocyte lymphocytosis in approximately 30% of cattle.
in approximately 1% to 5% of blv-infected cattle, a single clone will emerge, leading to the development of b lymphocyte leukemia/lymphoma. the average incubation period between infection and development of lymphoma is 7 to 8 years, and this low conversion rate suggests that the latency period may be longer than the life span of most animals (dairy cattle seldom live to the 7- to 8-year peak incidence of lymphoma occurrence). other contributing variables, such as genetic background, coinfections, and environmental factors, may also play a role in lymphomagenesis. the exact mechanism of blv-induced tumorigenesis is poorly understood. recently, blv micrornas (mirnas) were identified in preleukemic and malignant b lymphocytes, which showed repression of structural and regulatory gene expression. these findings suggest that mirnas may play a key role in tumor onset and progression.

grossly, multiple tissues may be affected in cattle that develop lymphoma, including peripheral lymph nodes (cephalic, cervical, sublumbar) (fig. 13-87), abdominal lymph nodes, retrobulbar region, abomasum, liver, spleen, heart, urogenital tract, bone marrow, vertebral canal (fig. 13-88), and spinal cord. one study indicates most of these high-grade lymphomas are diffuse large cell lymphomas (66%), and approximately 20% are intermediate cell lymphomas (burkitt-like and lymphoblastic lymphomas).

the sporadic form of bovine lymphoma is most often of t lymphocyte immunophenotype and has three subcategories: cutaneous, calf, and thymic. there is no known viral cause for the sporadic form, and each subcategory has a much smaller prevalence compared to the enzootic form. of the three sporadic forms, the cutaneous form seems to be the most common and manifests as multiple skin nodules in 1- to 3-year-old cattle. the calf form presents as generalized lymphadenopathy with weight loss, lethargy, and weakness in calves less than 6 months old.
the thymic form is reportedly more common in beef cattle 6 to 24 months of age.

lymphoma is the most frequently reported cancer of pigs based on abattoir surveys. affected pigs are typically less than 1 year of age, and there is no reported breed predisposition, although a hereditary basis is suspected in cases arising in inbred herds. the two main forms of porcine lymphoma are thymic/mediastinal and multicentric; the latter is more common. the spleen, liver, kidney, bone marrow, and lymph nodes are affected in the multicentric form, with visceral lymph nodes reportedly more commonly involved than peripheral nodes. a recent study of lymphoma in 17 pigs found the majority to be multicentric, and subtypes included the following: b lymphoblastic leukemia/lymphoma, follicular lymphoma, diffuse and intestinal large b cell lymphoma, and peripheral t cell lymphoma. one case each of thymic b cell and t cell lymphoma was also described.

several types of severe combined immunodeficiency disease have been described in dogs. a mutation in dna-pkcs (similar to arabian horses) with an autosomal recessive mode of inheritance is seen in jack russell terriers. an x-linked form of severe combined immunodeficiency disease is well described in basset hounds and is caused by mutations in the common γ-chain (γc) subunit of the receptors for il-2, il-4, il-7, il-9, il-15, and il-21. a similar disease is seen in cardigan welsh corgi puppies, though the mode of inheritance in this breed is autosomal. the mutation inhibits the signal transduction pathways initiated by any of these cytokines, which are critical for the proliferation, differentiation, survival, and function of b and t lymphocytes. affected dogs have normal numbers of circulating b lymphocytes that are unable to class switch to igg or iga and reduced numbers of t lymphocytes, which are nonfunctional due to the inability to express il receptors.
affected puppies are remarkably susceptible to bacterial and viral infections and rarely survive past 3 to 4 months of age. the thymus of these dogs is small and consists of only small dysplastic lobules with a few hassall's corpuscles. tonsils, lymph nodes, and peyer's patches are often grossly unidentifiable due to the severe lymphocyte hypoplasia. congenital immunodeficiency diseases are also discussed in detail in chapter 5.

thymic hemorrhage and hematomas have been reported in dogs and are most often seen in young animals. a variety of causes are described, including ingestion of anticoagulant rodenticides (warfarin, dicumarol, diphacinone, and brodifacoum), dissecting aortic aneurysms, trauma (e.g., automobile accident), and idiopathic/spontaneous causes. histologically, hemorrhage variably expands the thymic lobules and septa, and in severe cases the lobular architecture is obscured by hemorrhage. in cases of anticoagulant rodenticide toxicosis, the medulla appears to be the main site of hemorrhage.

see uniform splenomegaly with a bloody consistency (also see figs. 7-72 and 7-73). see the section on splenic nodules with a bloody consistency for discussion of splenic hematomas (including those induced by nodular hyperplasia or occurring with hemangiosarcoma), incomplete splenic contraction, acute splenic infarcts, and hemangiosarcomas.

porcine reproductive and respiratory syndrome (prrs) is caused by an arterivirus and causes two overlapping clinical syndromes: reproductive failure and respiratory disease. the virus is transmitted by contact with body fluids (saliva, mucus, serum, urine, and mammary secretions, and from contact with semen during coitus), but often it first colonizes the tonsils or upper respiratory tract. the virus has a predilection for lymphoid tissues (spleen, thymus, tonsils, lymph nodes, peyer's patches).
viral replication takes place in macrophages of the lymphoid tissues and lungs, though porcine reproductive and respiratory syndrome virus antigen is found in resident macrophages in many tissues and may persist in tonsil and lung macrophages. the result of this infection is a reduction in the phagocytic and functional capacity of macrophages of the monocyte-macrophage system. as a consequence, there is reduced resistance to common bacterial and viral pathogens. most porcine reproductive and respiratory syndrome-infected pigs are coinfected with one or more pathogens, including streptococcus suis and salmonella choleraesuis. infection with bordetella bronchiseptica and mycoplasma hyopneumoniae appears to increase the duration and severity of the interstitial pneumonia. the major lesions are interstitial pneumonia and generalized lymphadenopathy, and tracheobronchial and mediastinal lymph nodes are most commonly affected. coinfections often complicate the gross and histopathologic changes. lymph nodes are enlarged, pale tan, occasionally cystic, and firm; some strains of virus also cause nodal hemorrhage. microscopically, the lesions in the lymph nodes, tonsils, and spleen consist of varying degrees of follicular and paracortical hyperplasia and lymphocyte depletion in follicular germinal centers.

streptococcus porcinus causes jowl abscesses in pigs. the bacteria colonize the oral cavity and spread to infect the tonsils and regional lymph nodes. the mandibular lymph nodes are the most often affected and have multiple 1- to 10-cm abscesses; the retropharyngeal and parotid lymph nodes may also be involved (fig. 13-89). this once-prevalent disease is now rare, presumably due to improvements in husbandry, feeder design, and hygiene. it is occasionally isolated in pigs with bacteremia.

clinical signs of disseminated histoplasmosis include wasting, emaciation, fever, respiratory distress, diarrhea with hematochezia or melena, and lameness.
the clinicopathologic changes of disseminated histoplasmosis may include neutrophilia, monocytosis, nonregenerative anemia in chronic infections, changes in total serum protein level, and liver enzyme level elevations with hepatic involvement. the anemia is likely a result of chronic inflammation, histoplasma infection of the bone marrow, and/or intestinal blood loss in dogs with gi disease. cytologic examination is useful for the diagnosis of histoplasmosis (tracheal wash preparations, aspirates of bone marrow and lymph nodes), where organisms are often visible in macrophages (efig. 13-15). grossly, there is hepatosplenomegaly, the intestines are thickened and corrugated, and the lymph nodes are uniformly enlarged (fig. 13-91) with loss of normal architecture (somewhat similar to lymphoma, though the nodes tend to be more firm in histoplasmosis).

siderofibrotic plaques, splenic rupture, and accessory spleens. see the section on miscellaneous disorders of the spleen for discussions on siderofibrotic plaques, splenic rupture, and accessory spleens.

splenic nodular hyperplasia is common in dogs and is categorized based on the cellular components as lymphoid nodular hyperplasia or complex nodular hyperplasia. hematomas may arise within nodules of hyperplasia (see splenic nodules with a bloody consistency). lymphoid (or simple) nodular hyperplasia consists of a focal, well-demarcated mass composed of discrete to coalescing aggregates of lymphocytes. the lymphocytes may form follicular structures with germinal centers and/or consist of a mixture of lymphocytes with mantle and marginal zone cell morphologic features. the intervening tissue is often congested and may contain plasma cells, but stroma is not observed (fig. 13-90; e-fig. 13-13). complex nodular hyperplasia is a focal mass that contains two proliferative components: lymphoid and stromal (efig. 13-14). the lymphoid component resembles lymphoid nodular hyperplasia described above.
there is proliferation of the intervening stromal tissues with fibroplasia, smooth muscle hyperplasia, and histiocytic hyperplasia; emh and plasma cells may also be present.

it has recently come to light that the entity splenic fibrohistiocytic nodule (sfhn), first described in 1998, is not a single condition but in fact a complex group of diseases. our better understanding of the spectrum of diseases once described under the term splenic fibrohistiocytic nodule is due to increasing knowledge of histiocytic disorders and immunohistochemistry. the original definition of splenic fibrohistiocytic nodule was a nodule characterized by a stromal population of histiocytoid and spindle cells intermixed with lymphocytes. grading was based on the lymphocyte percentage of the population (e.g., > 70% lymphocytes = grade 1; < 40% lymphocytes = grade 3); dogs with grade 1 splenic fibrohistiocytic nodules had a much better 1-year survival rate, and dogs with grade 3 nodules may develop sarcomas (often designated malignant fibrous histiocytoma, a now outdated term). with increasing knowledge of histiocytic disorders and additional immunohistochemical stains, diseases that likely were encompassed by the term splenic fibrohistiocytic nodule include the following: complex and lymphoid nodular hyperplasia (see earlier), stromal sarcoma, histiocytic sarcoma, marginal zone hyperplasia, marginal zone lymphoma, and diffuse large b cell lymphoma (see lymphoid/lymphatic system, disorders of domestic animals: lymph nodes, neoplasia, lymphoma).

histoplasma capsulatum can cause a disseminated fungal disease that is widely endemic, particularly in areas with major river valleys and temperate or tropical climates (e.g., midwestern and southern united states). free-living organisms in the mycelial phase produce macroconidia and microconidia that are inhaled and converted to the yeast phase in the lung. yeasts are phagocytized and harbored by macrophages of the monocyte-macrophage system.
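the original sfhn grading described above reduces to simple thresholds on the lymphocyte percentage. the sketch below encodes that scheme; note that the mapping of the intermediate band to grade 2, and the function name itself, are assumptions for illustration and are not stated in the text.

```python
# illustrative sketch of the original splenic fibrohistiocytic nodule (sfhn)
# grading quoted in the text: grade 1 if > 70% of the population is
# lymphocytes, grade 3 if < 40%; the intermediate band is ASSUMED here to
# map to grade 2. the function name is hypothetical, not a standard api.

def sfhn_grade(lymphocyte_pct: float) -> int:
    """Return the historical SFHN grade for a given lymphocyte percentage."""
    if not 0 <= lymphocyte_pct <= 100:
        raise ValueError("lymphocyte percentage must be between 0 and 100")
    if lymphocyte_pct > 70:
        return 1  # a much better 1-year survival rate was reported for grade 1
    if lymphocyte_pct < 40:
        return 3  # grade 3 nodules may progress to sarcoma
    return 2      # assumed intermediate grade

print(sfhn_grade(80))  # -> 1
print(sfhn_grade(30))  # -> 3
```

as the text notes, this single-axis grading has since been superseded by immunophenotype-based reclassification, so the thresholds are of historical interest only.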
in some dogs the disease is limited to the respiratory tract and causes dyspnea and coughing. however, in most dogs, the disease is disseminated throughout the body, predominantly affecting the liver, spleen, gastrointestinal tract, bone marrow, skin, and eyes; primary gastrointestinal disease is also reported.

in the visceral form of leishmaniasis, affected dogs have generalized enlargement of the abdominal lymph nodes and hepatosplenomegaly (efig. 13-16). histologically, the lymph node sinuses and splenic red pulp are filled with macrophages that contain intracytoplasmic, round, 2-µm-diameter organisms with a small kinetoplast. though there is an initial stage of lymphoid hyperplasia in the spleen and lymph node, subsequent lymphoid atrophy occurs with chronicity. the atrophy is due to impairment of follicular dcs, b lymphocyte migration, and germinal center formation. there may be lymphoid atrophy of the spleen and lymph nodes in severe chronic infections.

canine distemper virus preferentially infects lymphoid, epithelial, and nervous cells (see chapter 14). dogs are exposed through contact with oronasal secretions, and the virus infects macrophages within the lymphoid tissue of the tonsil and respiratory tract (including tracheobronchial lymph nodes) and later disseminates to the spleen, lymph nodes, bone marrow, malt, and hepatic kupffer cells. the virus causes necrosis of lymphocytes (especially cd4 t lymphocytes) and depression of lymphopoiesis in the bone marrow, leading to severe immunosuppression. dogs are therefore susceptible to secondary infections, including bordetella bronchiseptica, toxoplasma gondii, nocardia, salmonella spp., and generalized demodicosis.

canine parvovirus type 2 (cpv-2) is a highly contagious disease of dogs spread through the fecal-oral route or oronasal exposure to contaminated fomites. the virus has tropism for rapidly dividing cells, and replication begins in the lymphoid tissues of the oropharynx, thymus, and mesenteric lymph nodes and then is disseminated to the small intestinal crypt epithelium.
by infecting lymphoid tissues, canine parvovirus type 2 causes immunosuppression directly through lymphocytolysis and indirectly through bone marrow depletion of lymphocyte precursors. there is marked lymphoid atrophy of the thymus and of the follicles of the spleen, lymph nodes, and malt, particularly of peyer's patches, producing the classic gross lesion of depressed oval regions of the mucosa (so-called punched-out peyer's patches).

thymomas. see lymphoid/lymphatic system, disorders of domestic animals: thymus, neoplasia.

lymphomas. lymphoma is the most common hematologic malignancy in the dog. using the who classification scheme, several lymphoma subtypes are identified in dogs, and they clinically range from slow-growing indolent tumors to highly aggressive tumors. of all domestic animal species, lymphoma is the most extensively studied in dogs. the most common clinical presentation in dogs is generalized lymphadenopathy, with or without clinical signs such as lethargy and inappetence. the majority of lymphomas in dogs are large cell, mid- to high-grade lymphomas, and up to half of all lymphoma cases are subtyped as diffuse large b cell lymphoma. diffuse large b cell lymphomas are further subdivided into centroblastic or immunoblastic based on nucleolar morphologic features (see table 13-7 and box 13-11), although it is unclear if this difference has any prognostic significance. histologically, lymph node architecture is most often completely effaced by sheets of large neoplastic cells, which may invade through the capsule and colonize the perinodal tissue. these dogs are often treated with chemotherapy and achieve remission. the overall median survival time for dogs with diffuse large b cell lymphoma is approximately 7 months, although this number varies based on the study and the grade of the tumor.
histologically within the node, there is a multifocal to coalescing infiltrate of epithelioid macrophages with intracytoplasmic, small (2 to 4 µm in diameter) yeast organisms with spherical basophilic central bodies surrounded by a clear halo (fig. 13-92).

leishmaniasis is a disease of the monocyte-macrophage system caused by protozoa of the genus leishmania. it occurs in dogs and other animals and is endemic in parts of the united states, europe, the mediterranean, the middle east, africa, and central and south america. the protozoa proliferate by binary fission in the gut of the sand fly and become flagellated organisms, which are introduced into mammals by insect bites, where they are phagocytized by macrophages and assume a nonflagellated form. cutaneous and/or visceral forms of the disease are observed. in the visceral form, dogs are emaciated and have general enlargement of the abdominal lymph nodes and hepatosplenomegaly.

numerous other subtypes of lymphoma have been reported in dogs, including several forms of cutaneous lymphoma, most often of t lymphocyte origin and epitheliotropic (see chapter 17). hepatosplenic t cell lymphoma, thought to be of γ/δ t lymphocyte origin, affects the liver and spleen without significant nodal involvement. hepatocytotropic t cell lymphoma is a distinct form of lymphoma with tropism for the hepatic cords; clusters or individual neoplastic lymphocytes invade the hepatic cords, without hepatocyte degeneration. intravascular lymphoma is a proliferation of large neoplastic lymphocytes within the blood vessels of many tissues, leading to progressive occlusion and subsequent thromboses and infarcts. this neoplasm does not form an extravascular mass, and neoplastic cells are not found in peripheral blood smears or bone marrow. indolent lymphomas constitute up to 29% of all canine lymphomas. indolent lymphomas in dogs, in descending order of frequency, include t zone lymphoma (tzl), marginal zone lymphoma (mzl), mantle cell lymphoma (mcl), and follicular lymphoma (fl).
mantle cell lymphoma and follicular lymphoma are less commonly diagnosed than t zone lymphoma and marginal zone lymphoma; therefore the reader is referred to the suggested readings to learn more about mantle cell lymphoma and follicular lymphoma.

t zone lymphoma. t zone lymphoma is the most common indolent lymphoma in dogs (fig. 13-93). it presents as solitary or multiple peripheral lymphadenomegaly (often of the mandibular lymph nodes) in otherwise healthy-appearing dogs.

peripheral t cell lymphomas-not otherwise specified are the second most common subtype in dogs. this category includes all t cell lymphomas that do not fit into the other categories (e.g., t zone lymphoma, enteropathy-associated t cell lymphoma, and hepatosplenic t cell lymphoma). peripheral t cell lymphoma also effaces nodal architecture, and when compared to diffuse large b cell lymphoma, there is more variation in nuclear size and morphologic features. dogs with this subtype tend to have shorter survival times.

intermediate cell size, high-grade lymphomas are less common in dogs, and the two most frequently encountered subtypes are lymphoblastic lymphoma (lbl) and burkitt-like lymphoma (bll). lymphoblastic lymphoma may be of b or t lymphocyte origin, though t cell lymphoblastic lymphoma is the more common of the two. it is important to recognize a common misuse of the term "lymphoblast" in lymphoblastic lymphoma: by definition, lymphoblasts in lymphoblastic lymphoma are intermediate-sized cells with a distinct dispersed chromatin pattern, and not the large lymphocytes seen in cases of diffuse large b cell lymphoma or peripheral t cell lymphoma. t lymphocyte lymphoblastic lymphoma is an aggressive disease that is often resistant to treatment. burkitt-like lymphoma is a high-grade lymphoma of b lymphocytes.
see bone marrow and blood cells, disorders of domestic animals, types of hematopoietic neoplasia, myeloid neoplasia, mast cell neoplasia, and see efig. 13-6.

lymphoma is the most commonly diagnosed neoplasm in cats, and the incidence is reportedly the highest for any species. mediastinal or multicentric lymphomas are seen in young, felv-infected cats (see fig. 13-53). with the advent of the felv vaccine and routine testing, the prevalence of felv-associated lymphoma has decreased. currently the alimentary tract is the most commonly affected site, and alimentary lymphoma typically occurs in cats greater than 10 years of age. other miscellaneous sites commonly affected are the brain, spinal cord, eye, kidney, and nasopharynx.

the retrovirus felv has long been recognized as a cause of lymphoma in cats; the risk for lymphoma is increased sixtyfold in infected cats. before the advent of a vaccine in 1985, approximately 70% of cats (mainly young animals) with lymphoma were felv positive. felv infects t lymphocytes and can cause myelodysplastic syndrome, acute myeloid leukemias (see myeloid neoplasia), and t lymphocyte leukemia/lymphoma. in the latter, the mediastinum (thymus and mediastinal and sternal lymph nodes) is the site most commonly involved, although a multicentric distribution also occurs. routine felv vaccination has led to a significant decrease in the prevalence of felv infection, which has resulted in a decrease in the proportion of mediastinal lymphomas.

the risk for developing lymphoma in fiv-infected cats is fivefold to sixfold higher than in uninfected cats. cats that underwent kidney transplantation and thus received immunosuppressive drug therapy had a similar risk for developing lymphoma. both fiv-infected and posttransplantation cats predominantly developed extranodal, high-grade, diffuse large b cell lymphomas. this form is also the most common subtype in human immunodeficiency virus and posttransplantation patients, caused by the epstein-barr virus (ebv).
therefore it is reasonable to question whether these two groups of immunosuppressed cats may be more prone to infection by a gammaherpesvirus similar to ebv, leading to lymphoma. recently a novel feline gammaherpesvirus (fcaghv1) was discovered in domestic cats, with a 16% prevalence in north america, and further studies to investigate its role in lymphomagenesis are needed.

the characteristic histopathologic architecture of t zone lymphoma is a nodular expansion of the paracortex by neoplastic cells, which push atrophied "fading" cortical follicles against the thinned capsule and trabeculae. this unique architectural feature is best highlighted with immunohistochemical stains (often cd3 for t lymphocytes and cd79a, pax5, or cd20 for b lymphocytes). the neoplastic cells are small to intermediate in size with pale eosinophilic cytoplasm and oval nuclei with sharp, shallow indentations. mitotic figures are rare. dogs with this lymphoma subtype tend to be diagnosed at an advanced stage of the disease, likely because they present clinically healthy, without loss of appetite or change in activity level. even so, dogs with t zone lymphoma have a relatively long survival time compared to other lymphomas: reported median survival times range from 13 to 33 months, and data suggest that dogs who do not receive chemotherapy actually have longer median survival times.

marginal zone lymphoma. marginal zone lymphoma is an indolent b lymphocyte neoplasm derived from the cells of the marginal zone of lymphoid follicles. most marginal zone lymphomas (and mantle cell lymphomas) are assumed to originate in the spleen with slow spread to lymph nodes and often present as a mottled white-red, smooth, spherical splenic mass. histopathologic assessment of tissue architecture is needed for a diagnosis of marginal zone lymphoma, which is characterized by a distinct nodular pattern in which the lighter-staining neoplastic marginal zone cells form a dense cuff around small foci of darkly stained mantle cells (fading follicles).
the neoplastic marginal zone lymphocytes are intermediate in size and have a single prominent central nucleolus. mitotic figures are often rare or absent early on and increase with disease progression. differentiating between marginal zone lymphoma and marginal zone hyperplasia (which refers to a proliferation of marginal zone cells and contains a mixture of small and intermediate lymphocytes) is challenging because marginal zone lymphoma arises on a background of marginal zone hyperplasia. additionally, lymphoid and complex nodular hyperplasia are common in the dog spleen (see disorders of dogs), and it is possible that many cases of nodular hyperplasia contain areas of marginal zone lymphoma. therefore immunophenotyping and molecular clonality testing are ultimately required for a definitive diagnosis of marginal zone lymphoma. the overall median survival time in dogs with splenic marginal zone lymphoma after splenectomy is approximately 13 months (even longer if it is diagnosed as an incidental finding).

plasmacytomas. see disorders of domestic animals: lymph nodes, neoplasia, plasma cell neoplasia, extramedullary plasmacytomas (see fig. 13-84).

feline panleukopenia, caused by the single-stranded dna virus feline parvovirus (fpv), is a highly contagious and often lethal disease of cats and other felidae, as well as other species (including raccoons, ring-tailed cats, foxes, and minks). fpv is transmitted by the fecal-oral route through contact with infected body fluids, feces, or fomites. following intranasal or oral infection, the virus initially replicates in the macrophages in the lamina propria of the oropharynx and regional lymph nodes, followed by viremia, which distributes the virus throughout the body. because fpv requires rapidly multiplying cells in the s phase of division for its replication, replication occurs in mitotically active tissues (lymphoid tissue, bone marrow, and intestinal mucosa).
by infecting lymphoid tissues, fpv causes immunosuppression directly through lymphocytolysis and indirectly through bone marrow depletion of lymphocyte precursors; consequently, there is marked lymphoid atrophy of the thymus, spleen, lymph nodes, and malt (particularly peyer's patches).

the neoplastic cells of mucosal t cell lymphoma are small (nuclei are equal to the diameter of a feline rbc), mitotic figures are infrequent (low grade), and mucosal and crypt epitheliotropism is common (see fig. 13-95). a diagnosis of this subtype of lymphoma may be difficult (particularly in endoscopic biopsy samples), because this disease often is multifocal and concurrent with or arises within lymphoplasmacytic inflammatory bowel disease (ibd). the neoplastic lymphocytes are morphologically similar to the inflammatory lymphocytes. early small cell mucosal t cell lymphomas often require additional diagnostic testing, namely, immunohistochemistry and molecular clonality testing (pcr for antigen receptor rearrangement [parr]), to confirm a clonal neoplasm.

transmural t cell lymphomas also occur focally or multifocally in the small intestine of cats (best classified as enteropathy-associated t cell lymphoma type i) and by definition must extend into the submucosa and muscularis. some tumors invade the serosa and adjacent mesentery. t cell large granular lymphocyte (lgl) lymphoma is often diagnosed in this form, and the intestinal segments orad and aborad to the transmural mass may also have mucosal lymphoma. gastrointestinal b cell lymphomas are less prevalent in cats but occur in the stomach, jejunum, and ileocecocolic region as transmural lesions. most are diagnosed as diffuse large b cell lymphomas.

lymphomas in other sites occur less frequently in cats. the upper respiratory tract (nasal and/or nasopharyngeal region) is a relatively rare site for lymphoma; however, lymphoma is the most common primary nasal tumor, and diffuse large b cell lymphomas (of immunoblastic type) are the predominant subtype. both cutaneous (cutaneous t cell lymphoma) and subcutaneous lymphomas (usually large cell lymphomas) are rare. presumed solitary ocular lymphomas have also been reported.
t cell-rich large b cell lymphoma, also referred to as feline hodgkin-like lymphoma in some studies, is composed of a mixture of reactive small lymphocytes and large neoplastic b lymphocytes, many of which may be binucleated and/or have prominent nucleoli (thus resembling the reed-sternberg cells of human hodgkin's lymphoma). this disease is typically characterized by a distinctive clinical presentation of an indolent unilateral neoplasm of the cervical lymph nodes, which spreads slowly to adjacent nodes within the chain. however, a proportion of cases may go on to develop into a more aggressive multicentric large to anaplastic b lymphocyte lymphoma that can affect peripheral and central nodes and multiple organs. suggested readings are available at www.expertconsult.com.

the overall incidence of feline lymphomas has increased, mainly due to an increase in gastrointestinal lymphomas. mucosal t cell lymphoma, also known as enteropathy-associated t cell lymphoma (eatcl type ii), is the most common form and arises from diffuse malt of the small intestine.

partial thromboplastin time (aptt or ptt)
• required sample: citrated plasma
• measures time for fibrin clot formation after addition of a contact activator, calcium, and a substitute for platelet phospholipid
• deficiencies/dysfunction in the intrinsic and/or common coagulation pathway (all factors except vii and xiii) cause prolongation of the ptt
• insensitive test: prolongation requires a 70% deficiency
• other causes of prolongation include polycythemia (less plasma per unit volume, so an excess amount of citrate is available to chelate calcium) and heparin therapy

activated clotting time (act)
• required sample: nonanticoagulated whole blood in a special act tube (diatomaceous earth as contact activator)
• used in the practice setting; performed by warming the sample to body temperature and monitoring for clot formation; normal clotting times are within 60 to 90 seconds in dogs and 165 seconds in cats
• less sensitive version of the ptt: prolongation requires a 95% deficiency; severe thrombocytopenia may also cause prolongation

one-stage prothrombin time (ospt or pt)
• required sample: citrated plasma
• measures time for fibrin clot formation after addition of tissue factor (tf; thromboplastin), calcium, and a substitute for platelet phospholipid
• deficiencies/dysfunction in the extrinsic (factor vii) and/or common coagulation pathway cause prolongation of the pt
• insensitive test: prolongation requires a 70% deficiency

proteins induced by vitamin k antagonism or absence (pivka) test
• required sample: citrated plasma
• essentially a version of the pt using an especially sensitive thromboplastin reagent
• pivka are inactive (uncarboxylated) vitamin k-dependent factors; an increase in pivka is not specific for vitamin k antagonism but may be an earlier and more sensitive detector than the pt or ptt

thrombin time (tt)
• required sample: citrated plasma
• measures time for fibrin clot formation after thrombin (factor iia) is added
• defects directly involving formation and/or polymerization of fibrin prolong this test (i.e., if the lesion is upstream of the conversion of fibrinogen to fibrin, the tt will be normal); hypofibrinogenemia or dysfibrinogenemia causes prolongation of the tt

fibrinogen concentration
• required sample: citrated plasma
• Fibrinogen concentration is measured based on the time to clot formation after addition of thrombin; this is essentially the same as the TT mentioned earlier and is a more accurate method than the heat precipitation method
• Decreased fibrinogen may be due to increased consumption (disseminated intravascular coagulation) or decreased production (liver disease); increased fibrinogen is associated with inflammation, renal disease, and dehydration

Fibrin(ogen) degradation products (FDP) test
• Required sample: special FDP tube
• Used in the practice setting
• Performed by adding blood to a special tube containing thrombin and a trypsin inhibitor (the sample clots almost instantly in normal dogs and cats) and incubating two dilutions of serum (1:5 and 1:20) with polystyrene latex particles coated with sheep anti-FDP antibodies (should be negative in normal dogs and cats)

D-dimer
• Required sample: citrated plasma
• Latex agglutination test; to date, only validated in dogs and horses
• The assay detects a specific type of FDP resulting from breakdown of cross-linked fibrin; the concentration of plasma D-dimer indicates the degree of fibrinolysis; often used as part of a disseminated intravascular coagulation panel

• Required sample: citrated plasma
• Decreased because of decreased production (liver disease) or loss (protein-losing nephropathy or enteropathy)

Specific factor assays
• Required sample: citrated plasma
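The fibrinogen differentials above (decreased with consumption or reduced production, increased with inflammation, renal disease, or dehydration) can likewise be sketched as a classifier. The reference-interval bounds are deliberately left as required parameters because real intervals are species- and laboratory-specific; no values here should be read as clinical reference data:

```python
# Sketch: classify a fibrinogen result against a caller-supplied
# reference interval and echo the differentials named in the text.
# The interval bounds are placeholders the caller must provide.

def interpret_fibrinogen(value: float, low: float, high: float) -> str:
    """Classify a fibrinogen concentration relative to [low, high]."""
    if value < low:
        # Consumption (DIC) or decreased production (liver disease).
        return "decreased: consider DIC or liver disease"
    if value > high:
        # Acute-phase response and hemoconcentration raise fibrinogen.
        return "increased: consider inflammation, renal disease, or dehydration"
    return "within the supplied reference interval"
```

For example, with a hypothetical interval of 1.0 to 4.0 g/L, `interpret_fibrinogen(0.4, 1.0, 4.0)` flags a decreased result.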
• Performed at specialized laboratories

Grossly, incompletely contracted areas are characterized by multiple, variably sized and irregularly shaped, dark red to black, raised, soft, blood-filled "nodules."
These areas are usually at the margins of the spleen, and the intervening tissues are depressed and pink-red, corresponding to the contracted portions of red pulp devoid of blood. Incompletely contracted areas may be confused with acute splenic infarcts or hematomas on gross examination.

Acute splenic infarcts. Splenic infarcts are wedge-shaped or triangular hemorrhagic lesions that occur primarily at the margins of the spleen. In dogs, splenic infarcts most often occur with hypercoagulable states (e.g., liver disease, renal disease, Cushing's disease), neoplasia, and cardiovascular disease. Splenic vein thrombi may occur in association with traumatic reticulitis, splenic abscesses, portal vein thrombosis, and arterial thrombosis in bovine theileriosis. Valvular endocarditis may also lead to multiorgan infarcts, including the spleen. Splenic infarcts are common in pigs with classical swine fever. Acute splenic infarcts may not always be grossly visible in the early stages but develop into discrete, dark red, blood-filled, bulging, wedge-shaped foci with the base toward the splenic capsule (Fig. 13-64, A). With chronicity, the lesion becomes gray-white and contracted due to fibrosis (see Fig. 13-64, B).

Hemangiosarcoma. Hemangiosarcoma is a malignant neoplasm of endothelial cells and is a common primary tumor of the spleen, especially in dogs. Benign splenic hemangiomas are extraordinarily rare. Grossly, hemangiosarcomas may appear as single, multifocal, or coalescing dark red-purple masses and cannot be easily differentiated from a hematoma (Fig. 13-65). On cut surface they are bloody, with varying amounts of soft red neoplastic tissue; in more solid areas the neoplasm can be slightly more firm and white-tan. Metastatic spread occurs early in the disease process. Seeding of the peritoneum results in numerous discrete red-black masses throughout the omentum and serosa of abdominal organs, and hematogenous spread to the liver and lung is common.
Hemangiosarcomas in dogs also occur in the right atrium of the heart, retroperitoneal fat, and skin (dermal and/or subcutaneous), and multiorgan hemangiosarcomas are described in horses, cats, and cattle. Hemangiosarcomas have often metastasized by the time of initial diagnosis. Accumulations of macrophages within several organs, including splenic macrophages, Kupffer cells of the liver, and macrophages in the brain, are often observed.

Splenic nodules with a bloody consistency. The most common disorders of the spleen with bloody nodules are (1) hematomas, including those induced by nodular hyperplasia or occurring with hemangiosarcoma, (2) incompletely contracted areas of the spleen, (3) acute splenic infarcts, and (4) hemangiosarcomas. The term nodule has been applied rather loosely here; in some of these conditions, such as incompletely or irregularly contracted areas of the spleen, the elevated area of the spleen is not as well defined as the term nodule would imply.

Hematomas. Bleeding into the red pulp to form a hematoma is confined by the splenic capsule and produces a red to dark red, soft, bulging, usually solitary mass of varying size (2 to 15 cm in diameter) (Fig. 13-62). Resolution of a splenic hematoma progresses over days to weeks, through the stages of coagulation and breakdown of the blood into a dark red-brown soft mass (Fig. 13-63, A), infiltration by macrophages that phagocytize erythrocytes and break down hemoglobin to form hematoidin and hemosiderin (see Fig. 13-63, B), and repair leading to fibrosis. On occasion the capsule (splenic capsule and visceral peritoneum) over the hematoma can rupture, resulting in hemoperitoneum, hypovolemic shock, and death. The origin or cause of many hematomas is unknown; some are due to trauma, and others may be induced by splenic nodular hyperplasia.
It is postulated that as the splenic follicles become hyperplastic, they distort the adjacent marginal zone and marginal sinus, which compromises their drainage into sinusoids and red pulp vascular spaces. The result is an accumulation of pooled blood surrounding the hyperplastic nodule, which leads to hematoma formation. Splenic hematomas can also occur secondary to the rupture of hemangiosarcomas within the spleen.

Incompletely contracted areas of the spleen. Incompletely or irregularly contracted areas of the spleen are caused by failure of the smooth muscle to contract in response to circulatory shock (hypovolemic, cardiogenic, or septic) or a sympathetic "fight-or-flight" response, resulting in a lack of splenic evacuation of stored blood.